- The paper introduces DeepLIFT, a method that decomposes neural network predictions by propagating differences from a reference state.
- It efficiently separates positive and negative contributions, addressing zero-gradient and discontinuity challenges inherent to gradient-based methods.
- Experimental validation on MNIST and genomic sequence tasks demonstrates DeepLIFT’s superior ability to identify key features compared to traditional approaches.
DeepLIFT: Learning Important Features Through Propagating Activation Differences
The paper entitled "Learning Important Features Through Propagating Activation Differences" introduces DeepLIFT (Deep Learning Important FeaTures), a novel approach for interpreting the predictions of neural networks. This method seeks to address a critical challenge in machine learning: the "black box" nature of neural networks, which impedes their adoption in applications requiring interpretability.
Key Contributions
DeepLIFT decomposes the output prediction of a neural network in terms of the contributions of individual input features by propagating differences in activation from a defined 'reference' state. This approach offers two significant innovations over existing methods:
- Difference-from-Reference Framework: DeepLIFT frames importance in terms of deviations from a reference state, avoiding the pitfalls associated with zero gradients and gradient discontinuities.
- Positive and Negative Contribution Separation: By treating positive and negative contributions separately (via the RevealCancel rule), DeepLIFT can uncover dependencies that other approaches miss; a worked numeric example follows this list.
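As a concrete illustration of the second point, consider the paper's min(i1, i2) example. The snippet below is a hand-worked toy calculation, not the released DeepLIFT implementation: it writes min as a tiny network and compares what a gradient/Rescale-style attribution and the RevealCancel rule assign to each input relative to an all-zeros reference.

```python
# min(i1, i2) written as a tiny network: o = i1 - ReLU(i1 - i2).
def relu(v):
    return max(0.0, v)

i1, i2 = 3.0, 1.0                       # example inputs; reference is i1 = i2 = 0
o = i1 - relu(i1 - i2)                  # = min(i1, i2) = 1.0

# Relative to the zero reference, the positive part of the ReLU's input delta
# comes from i1 and the negative part from i2.
dz_pos, dz_neg = i1, -i2                # +3.0 and -1.0

# Gradient / Rescale view: the multiplier through the active ReLU is 1, so
# the direct path (+1) and the path through the ReLU (-1) cancel for i1,
# leaving it with zero importance even though the output depends on it.
rescale_i1 = i1 * (1.0 - 1.0)           # 0.0
rescale_i2 = i2 * 1.0                   # 1.0

# RevealCancel rule: average the effect of the positive delta with and
# without the negative delta, and vice versa.
dy_pos = 0.5 * ((relu(dz_pos) - relu(0.0)) +
                (relu(dz_neg + dz_pos) - relu(dz_neg)))    # 2.5
dy_neg = 0.5 * ((relu(dz_neg) - relu(0.0)) +
                (relu(dz_pos + dz_neg) - relu(dz_pos)))    # -0.5

# Contributions to o = i1 - ReLU(...): i1 keeps its direct +3 but gives back
# the 2.5 it pushed through the ReLU; i2's negative delta is worth +0.5.
reveal_i1 = i1 - dy_pos                 # 0.5
reveal_i2 = -dy_neg                     # 0.5
print(rescale_i1, rescale_i2, reveal_i1, reveal_i2, o)
```

With RevealCancel, both inputs receive half of min(i1, i2), reflecting that the output depends on both; the gradient-style attribution gives one of them nothing.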
The algorithm computes contributions with a single backpropagation-like pass, so importance scores can be generated cheaply once a prediction has been made.
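The sketch below is a minimal NumPy mock-up, not the authors' code: the network, weights, and reference are illustrative. It shows how multipliers are propagated backward with the Linear and Rescale rules, and that the resulting contributions satisfy summation-to-delta, i.e. they add up to the difference of the output from its reference value.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy one-hidden-layer network: x -> linear -> ReLU -> linear -> scalar t.
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
w2, b2 = rng.normal(size=4), 0.5

def forward(x):
    z = W1 @ x + b1              # pre-activations
    a = np.maximum(z, 0.0)       # ReLU activations
    return z, a, w2 @ a + b2     # scalar output t

x_ref = np.zeros(3)              # reference input (all zeros, a common choice)
x = np.array([1.0, -2.0, 0.5])

z, a, t = forward(x)
z0, a0, t0 = forward(x_ref)
dz, da, dt = z - z0, a - a0, t - t0     # differences from reference

# Linear rule at the output layer: each hidden unit's multiplier is its weight.
m_a = w2.copy()

# Rescale rule at the ReLU: multiplier = delta_out / delta_in, which stays
# finite and informative even where the gradient is zero; fall back to the
# gradient when delta_in is numerically zero.
near_zero = np.abs(dz) < 1e-7
m_z = np.where(near_zero, (z > 0).astype(float),
               da / np.where(near_zero, 1.0, dz)) * m_a

# Linear rule + chain rule at the first layer: push multipliers onto the inputs.
m_x = W1.T @ m_z

# Contributions; summation-to-delta says they sum to t - t0.
contribs = m_x * (x - x_ref)
print(contribs, contribs.sum(), dt)     # contribs.sum() == dt
```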
Comparison with Existing Approaches
The paper critically compares DeepLIFT with various perturbation-based and backpropagation-based approaches for assigning importance scores:
- Perturbation-Based Approaches: These methods perturb individual inputs (or groups of inputs) and observe the change in the network's output. While intuitive, they are computationally expensive and can underestimate the importance of features whose effect on the output has saturated.
- Gradient-Based Approaches: Methods such as vanilla gradients (saliency maps), deconvolutional networks, and guided backpropagation propagate importance signals backward through the network. They are efficient, but zero gradients under saturation and gradient discontinuities can produce misleading scores; both failure modes are illustrated in the sketch after this list.
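The toy saturating function below is chosen for illustration, in the spirit of the paper's saturation discussion; it is not taken from the paper's experiments. It shows how both a zero gradient and a single-feature perturbation can report no importance for an input that clearly matters relative to a reference.

```python
# Saturating toy function: y = 1 - ReLU(1 - x1 - x2), which plateaus at 1
# once x1 + x2 >= 1.
def f(x1, x2):
    return 1.0 - max(0.0, 1.0 - x1 - x2)

x1, x2 = 1.0, 1.0                    # both inputs push the output into saturation

# The gradient (and hence gradient*input) is exactly zero for both inputs,
# because the ReLU branch is inactive at this point.
eps = 1e-6
grad_x1 = (f(x1 + eps, x2) - f(x1, x2)) / eps    # ~0.0

# Erasing x1 alone also reports no effect, since x2 by itself keeps the
# output saturated.
perturb_x1 = f(x1, x2) - f(0.0, x2)              # 0.0

# A difference-from-reference view (reference x1 = x2 = 0) still sees that
# the inputs jointly moved the output from 0 to 1.
delta_y = f(x1, x2) - f(0.0, 0.0)                # 1.0
print(grad_x1, perturb_x1, delta_y)
```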
Among backpropagation-based methods, Layer-wise Relevance Propagation (LRP) and integrated gradients are the closest points of comparison. Integrated gradients addresses the saturation and discontinuity issues but requires many gradient evaluations per example. DeepLIFT, in contrast, is a low-cost alternative that needs only a single backward pass, as the sketch below indicates.
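To see where the overhead comes from, here is a hedged sketch of the standard integrated-gradients approximation; the step count, the example function, and `grad_fn` are illustrative assumptions. Each attribution requires many gradient evaluations along the path from reference to input, whereas DeepLIFT's multipliers are computed in a single modified backward pass.

```python
import numpy as np

def integrated_gradients(grad_fn, x, x_ref, n_steps=50):
    """Riemann approximation of integrated gradients along the straight path
    from x_ref to x; costs n_steps gradient evaluations per example."""
    alphas = (np.arange(n_steps) + 0.5) / n_steps            # midpoints in (0, 1)
    avg_grad = np.zeros_like(x)
    for a in alphas:
        avg_grad += grad_fn(x_ref + a * (x - x_ref))
    return (x - x_ref) * avg_grad / n_steps                   # attributions

# Example with f(x) = sum(tanh(x)) and a zero reference; the attributions
# approximately sum to f(x) - f(x_ref) (the "completeness" property).
grad_fn = lambda x: 1.0 - np.tanh(x) ** 2                     # gradient of sum(tanh(x))
x, x_ref = np.array([2.0, -1.0, 0.5]), np.zeros(3)
attr = integrated_gradients(grad_fn, x, x_ref)
print(attr.sum(), np.tanh(x).sum() - np.tanh(x_ref).sum())    # nearly equal
```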
Experimental Validation
MNIST Digit Classification
DeepLIFT was evaluated with a convolutional neural network trained on the MNIST dataset. For each image, pixels were ranked by how strongly each method scored them as supporting the original digit class over a target class; the top-ranked pixels were then erased and the resulting change in the log-odds between the two classes was measured. DeepLIFT, especially with the RevealCancel rule, outperformed integrated gradients, gradient×input, and guided backpropagation at identifying the pixels whose removal most changed the log-odds.
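The sketch below captures the shape of that benchmark under stated assumptions: the erasure fraction, `log_odds_fn`, and the stand-in scorer are illustrative placeholders, not the paper's exact setup. It ranks pixels by importance score, erases the top-ranked ones by resetting them to the reference, and measures the drop in log-odds.

```python
import numpy as np

def log_odds_change(image, reference, scores, log_odds_fn, frac=0.20):
    """Erase the top `frac` of pixels by score and report how much the
    log-odds (original class vs. target class) drop as a result."""
    k = int(frac * image.size)
    top = np.argsort(scores.ravel())[::-1][:k]       # highest-scoring pixels
    erased = image.ravel().copy()
    erased[top] = reference.ravel()[top]             # reset to the reference value
    return log_odds_fn(image) - log_odds_fn(erased.reshape(image.shape))

# Dummy usage; a real run would score pixels with DeepLIFT (or a baseline)
# and use the trained CNN's logits for the original and target classes.
img, ref = np.random.rand(28, 28), np.zeros((28, 28))
scores = img - ref                                   # stand-in importance scores
log_odds_fn = lambda x: float(x.sum())               # placeholder for logit_orig - logit_target
print(log_odds_change(img, ref, scores, log_odds_fn))
```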
Genomic Sequence Classification
The method was also applied to classifying simulated regulatory DNA sequences into which motifs (binding sites for regulatory proteins) had been embedded. The tasks required identifying the motifs associated with specific proteins, or combinations of motifs. DeepLIFT recovered the relevant motifs and their cooperativity, surpassing other methods across the tasks; in particular, DeepLIFT with the RevealCancel rule captured dependencies between motifs that gradient-based methods missed.
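As a small, hedged illustration of the input setup in this setting (the sequence and the background base frequencies below are assumptions, not the paper's data): regulatory sequences are one-hot encoded, and one reasonable reference, in line with the paper's discussion of reference choice for genomic data, assigns each position the background distribution over A/C/G/T, so the propagated differences measure deviation from uninformative sequence content.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """One-hot encode a DNA string as a (length, 4) array."""
    x = np.zeros((len(seq), 4))
    for i, b in enumerate(seq):
        x[i, BASES.index(b)] = 1.0
    return x

seq = "ACGTGGGGACTAGGGG"                  # toy sequence, not from the paper's data
x = one_hot(seq)

# Reference: every position set to an assumed background distribution over
# A/C/G/T, so DeepLIFT scores reflect deviation from background content.
background = np.array([0.3, 0.2, 0.2, 0.3])
x_ref = np.tile(background, (len(seq), 1))

delta_x = x - x_ref                        # the difference DeepLIFT propagates
print(delta_x.shape)                       # (sequence_length, 4)
```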
Discussion and Implications
DeepLIFT offers substantial improvements in interpretability over traditional gradient-based methods. By using a difference-from-reference approach, it provides a more robust and continuous measure of feature importance, mitigating issues where gradients might be zero or discontinuous. This can be particularly advantageous in domains requiring high interpretability, such as genomics and healthcare.
Potential future directions include optimizing the choice of reference inputs empirically and extending DeepLIFT's application to Recurrent Neural Networks (RNNs) and models using max operations, such as Maxout networks or pooling layers. Addressing these aspects will enhance the versatility and robustness of the DeepLIFT framework.
Conclusion
The introduction of DeepLIFT marks a significant step toward more interpretable neural networks. By effectively propagating activation differences, it provides granular insights into feature importance, paving the way for broader deployment of neural networks in interpretability-critical applications.