[Paper Review] Full-Gradient Representation for Neural Network Visualization
The paper introduces FullGrad, a full-gradient representation that attributes neural network outputs to both inputs and neurons, satisfying completeness and weak dependence, and presents an approximate FullGrad for convolutional nets with quantitative and qualitative evaluations.
We introduce a new tool for interpreting neural net responses, namely full-gradients, which decomposes the neural net response into input sensitivity and per-neuron sensitivity components. This is the first proposed representation which satisfies two key properties: completeness and weak dependence, which provably cannot be satisfied by any saliency map-based interpretability method. For convolutional nets, we also propose an approximate saliency map representation, called FullGrad, obtained by aggregating the full-gradient components. We experimentally evaluate the usefulness of FullGrad in explaining model behaviour with two quantitative tests: pixel perturbation and remove-and-retrain. Our experiments reveal that our method explains model behaviour correctly, and more comprehensively than other methods in the literature. Visual inspection also reveals that our saliency maps are sharper and more tightly confined to object regions than other methods.
Motivation & Objective
- Motivate the need for an attribution method that captures both input-level and neuron-level importance.
- Define weak dependence and completeness and show they cannot be satisfied simultaneously by traditional saliency maps.
- Introduce full-gradient representations that unify input-gradient and bias-gradient contributions.
- Propose FullGrad for convolutional nets to produce sharp, object-confined saliency maps.
- Evaluate FullGrad with pixel perturbation and remove-and-retrain tests to demonstrate improved faithfulness.
Proposed method
- Derive the full-gradient decomposition of network outputs into input-gradients and bias-gradients.
- Show that f(x; b) can be expressed as the sum of an input-gradient term and a bias-gradient term (f^b(x)).
- Aggregate per-neuron and per-layer bias-gradients into a network-wide saliency map for CNNs via a defined formula.
- Define a post-processing operator psi to convert full-gradient components into visual saliency maps (FullGrad).
- Provide an approximate FullGrad that combines input-gradient maps with aggregated bias-gradient maps across layers.
- Discuss visualization steps and the role of bias terms including implicit biases in networks.
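The decomposition and aggregation described in this list can be written out as follows. This is a reconstruction from the summary, not a verbatim copy of the paper's equations: the symbols L (the set of layers), c_l (the channels of layer l), and the element-wise product notation are assumptions chosen for illustration.

```latex
% Completeness for a ReLU network with biases b:
% the output is recovered exactly from input-gradients and bias-gradients.
\begin{align}
f(\mathbf{x}; \mathbf{b})
  &= \nabla_{\mathbf{x}} f(\mathbf{x}; \mathbf{b})^{\top} \mathbf{x}
   + \nabla_{\mathbf{b}} f(\mathbf{x}; \mathbf{b})^{\top} \mathbf{b} \\
% Bias-gradient component, one entry per neuron:
f^{b}(\mathbf{x})
  &= \nabla_{\mathbf{b}} f(\mathbf{x}; \mathbf{b}) \odot \mathbf{b} \\
% Approximate saliency map for CNNs: post-process each component
% with \psi, then sum over layers l and channels c.
S_f(\mathbf{x})
  &= \psi\!\left( \nabla_{\mathbf{x}} f(\mathbf{x}) \odot \mathbf{x} \right)
   + \sum_{l \in L} \sum_{c \in c_l} \psi\!\left( f^{b}(\mathbf{x})_c \right)
\end{align}
```

Summing the entries of the first term and of f^b(x) gives back the two inner products in the first line, which is why the representation is complete before the post-processing operator psi is applied.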
Experimental results
Research questions
- RQ1: Can a saliency representation satisfy both weak dependence on inputs and completeness simultaneously?
- RQ2: How can we construct a more expressive attribution that includes both input features and neuron contributions?
- RQ3: Does a full-gradient approach yield sharper, more localized saliency maps for CNNs compared to existing methods?
- RQ4: Do full-gradient based saliency maps align better with model behavior under perturbation and retraining evaluations?
- RQ5: What is the impact of post-processing choices on the effectiveness of FullGrad as a visualization tool?
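To make RQ5 concrete, here is a minimal sketch of one possible post-processing operator and the aggregation step. It assumes psi takes the absolute value and min-max rescales each map to [0, 1]; the function names `psi` and `aggregate`, and the toy 2x2 maps, are hypothetical. Spatial upsampling to the input resolution, which a real CNN pipeline would need for intermediate layers, is left out.

```python
def psi(m):
    """Hypothetical post-processing: absolute value, then min-max rescale to [0, 1]."""
    mags = [[abs(v) for v in row] for row in m]
    lo = min(min(row) for row in mags)
    hi = max(max(row) for row in mags)
    span = hi - lo
    if span == 0:
        # A constant map carries no spatial information; return zeros.
        return [[0.0 for _ in row] for row in mags]
    return [[(v - lo) / span for v in row] for row in mags]


def aggregate(maps):
    """Post-process each equally sized map with psi and sum them element-wise."""
    processed = [psi(m) for m in maps]
    rows, cols = len(processed[0]), len(processed[0][0])
    return [
        [sum(p[r][c] for p in processed) for c in range(cols)]
        for r in range(rows)
    ]


# Toy stand-ins for an input-gradient map and one bias-gradient map.
input_term = [[-2.0, 0.0], [1.0, 4.0]]
bias_term = [[0.5, -0.5], [3.0, 0.5]]

saliency = aggregate([input_term, bias_term])
print(saliency)
```

Swapping the body of `psi` (for example, dropping the absolute value or the rescaling) changes which regions dominate the summed map, which is the kind of sensitivity RQ5 asks about.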
Key findings
- Full-gradients provide a complete representation by reconstructing the network output from input-gradients and bias-gradients.
- For ReLU networks with biases, f(x) equals the inner product of input-gradients with inputs plus the inner product of bias-gradients with biases.
- FullGrad for CNNs yields spatial maps by visualizing bias-gradients with the same receptive-field structure as inputs, producing per-neuron and per-layer saliency maps.
- The proposed aggregation yields sharper saliency maps that are tightly confined to object regions while also outlining interior structures.
- Quantitative evaluations (pixel perturbation and remove-and-retrain) show FullGrad outperforming several existing saliency methods in faithfulness to model behavior.
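The completeness finding above can be checked numerically on a toy model. The sketch below assumes a one-hidden-layer ReLU network with hand-picked weights and computes the gradients analytically in plain Python; the function names and all numeric values are illustrative and do not come from the paper.

```python
def relu_net(x, W1, b1, w2, b2):
    """One-hidden-layer ReLU network; returns the output and pre-activations."""
    pre = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]
    hidden = [max(0.0, p) for p in pre]
    out = sum(w * h for w, h in zip(w2, hidden)) + b2
    return out, pre


def full_gradient(x, W1, b1, w2, b2):
    """Analytic gradients of the output w.r.t. the input and every bias."""
    out, pre = relu_net(x, W1, b1, w2, b2)
    gate = [1.0 if p > 0 else 0.0 for p in pre]       # ReLU derivative
    grad_b1 = [w * g for w, g in zip(w2, gate)]       # d out / d b1
    grad_x = [
        sum(gb * row[i] for gb, row in zip(grad_b1, W1))
        for i in range(len(x))
    ]                                                 # d out / d x
    grad_b2 = 1.0                                     # output bias enters additively
    return out, grad_x, grad_b1, grad_b2


# Illustrative parameters (assumed values).
x = [1.0, -2.0]
W1 = [[0.5, -1.0], [1.5, 2.0], [-1.0, 0.5]]
b1 = [0.5, 1.0, -0.25]
w2 = [2.0, -1.0, 3.0]
b2 = 0.75

out, grad_x, grad_b1, grad_b2 = full_gradient(x, W1, b1, w2, b2)

# Completeness: input-gradient term plus bias-gradient term recovers the output.
input_part = sum(g * xi for g, xi in zip(grad_x, x))
bias_part = sum(g * b for g, b in zip(grad_b1, b1)) + grad_b2 * b2

print(out, input_part + bias_part)
```

With these values the input-gradient term alone does not match the output; the bias-gradient term supplies the remainder, which is the gap that input-only attributions leave unexplained.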
This review was created by AI and reviewed by human editors.