Jiang, Ziping (2025) Interpretability of Feed Forward Neural Network from Activation Perspective. PhD thesis, Lancaster University.
Abstract
In the past decade, deep neural networks have demonstrated promising results in various fields. To advance their success and to mitigate the limitation of their opacity, this dissertation explores the explainability of feedforward neural networks from the activation perspective. The theoretical outcome of this work is a framework for analyzing neural networks, developed on the basis of the existing literature. This framework generalizes the definition of the activation pattern by indexing neurons with an index family and relaxing the constraint on activation functions. Building on this framework, and by studying the learning dynamics of neural networks, this research identifies a novel dying neuron issue that prevents networks from reaching their optimum. To further understand the dying neuron issue, two metrics are proposed to explore the expressive ability of models. Pattern similarity records the overall severity of the dying neuron issue in a network, enabling comparison across models, while neuron entropy measures the volatility of individual neurons, capturing unit-wise behaviour within a model. Beyond expressive ability, this work also investigates the robustness of models by decomposing the computational graph of a neural network using the proposed framework. In particular, it shows that unsecured data can be categorized into Lipschitz vulnerability and float vulnerability according to the source of instability. Based on the insights of the theoretical analysis, this work introduces two downstream applications of the proposed framework. Neuron entropy pruning (NEP) computes an importance score for each parameter by integrating the neuron entropy and removes unimportant parameters to reduce the model scale. The smoothed classifier with reformed float path in dual direction (SCRFP-2) reforms training and prediction based on the smoothed classifier, thereby increasing the robustness of the model. Both methods outperform the benchmarks, further supporting the theoretical analysis presented in this work.