Trade-off Between Efficiency and Consistency for Removal-based Explanations (2210.17426v3)
Abstract: In the current landscape of explanation methodologies, most predominant approaches, such as SHAP and LIME, employ removal-based techniques to evaluate the impact of individual features by simulating various scenarios with specific features omitted. Nonetheless, these methods primarily emphasize efficiency in the original context, often resulting in general inconsistencies. In this paper, we demonstrate that such inconsistency is an inherent aspect of these approaches by establishing the Impossible Trinity Theorem, which posits that interpretability, efficiency, and consistency cannot hold simultaneously. Recognizing that the attainment of an ideal explanation remains elusive, we propose the utilization of interpretation error as a metric to gauge inefficiencies and inconsistencies. To this end, we present two novel algorithms founded on the standard polynomial basis, aimed at minimizing interpretation error. Our empirical findings indicate that the proposed methods achieve a substantial reduction in interpretation error, up to 31.8 times lower when compared to alternative techniques. Code is available at https://github.com/trusty-ai/efficient-consistent-explanations.
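To make the removal-based setup concrete, here is a minimal, self-contained sketch (not taken from the paper or its repository): it computes exact Shapley values for a toy three-feature model by "removing" features, i.e., resetting them to a baseline, and then reports an illustrative reconstruction error over all removal patterns. The toy `model`, the baseline-masking choice, and the `reconstruction_error` proxy are assumptions for illustration only; the paper's interpretation-error metric and its standard-polynomial-basis algorithms are not reproduced here.

```python
# Minimal sketch of removal-based attribution (illustrative only, not the paper's method).
# "Removing" a feature here means replacing it with a baseline value; the reconstruction
# error at the end is an assumed proxy, not the interpretation error defined in the paper.
from itertools import combinations
from math import comb

import numpy as np


def model(x):
    """Toy non-additive model: a linear part plus one interaction term."""
    return 2.0 * x[0] + 1.0 * x[1] + 3.0 * x[0] * x[2]


def value_with_subset(x, baseline, subset):
    """Evaluate the model keeping only features in `subset`; the rest are 'removed' (set to baseline)."""
    masked = baseline.copy()
    for i in subset:
        masked[i] = x[i]
    return model(masked)


def shapley_values(x, baseline):
    """Exact Shapley values by enumerating all subsets (exponential cost; toy sizes only)."""
    d = len(x)
    phi = np.zeros(d)
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for k in range(d):
            for subset in combinations(others, k):
                weight = 1.0 / (d * comb(d - 1, k))  # |S|!(d-|S|-1)!/d!
                gain = (value_with_subset(x, baseline, subset + (i,))
                        - value_with_subset(x, baseline, subset))
                phi[i] += weight * gain
    return phi


def reconstruction_error(x, baseline, phi):
    """Assumed proxy for interpretation error: mean squared gap between the additive
    surrogate and the model across all removal patterns."""
    d = len(x)
    base = value_with_subset(x, baseline, ())
    errs = []
    for k in range(d + 1):
        for subset in combinations(range(d), k):
            surrogate = base + sum(phi[i] for i in subset)
            errs.append((surrogate - value_with_subset(x, baseline, subset)) ** 2)
    return float(np.mean(errs))


if __name__ == "__main__":
    x = np.array([1.0, -2.0, 0.5])
    baseline = np.zeros(3)
    phi = shapley_values(x, baseline)
    print("Shapley values:", phi)
    # Efficiency holds exactly: attributions sum to model(x) - model(baseline).
    print("Efficiency gap:", model(x) - model(baseline) - phi.sum())
    # Yet the additive surrogate cannot match the model on every removal pattern
    # when interactions are present, so this error is nonzero.
    print("Reconstruction error over all subsets:", reconstruction_error(x, baseline, phi))
```

Even with exact Shapley values, the reconstruction error above is nonzero whenever the model contains interactions, which loosely illustrates the tension between exact additive accounting at the full input and agreement across removal patterns that the abstract describes.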
- Explaining individual predictions when features are dependent: More accurate approximations to shapley values. Artificial Intelligence, 298:103502, 2021.
- Slic superpixels. Technical report, EPFL, 2010.
- Sanity checks for saliency maps. Advances in neural information processing systems, 31, 2018.
- Improved svrg for non-strongly-convex or sum-of-non-convex objectives. In International Conference on Machine Learning, pp. 1080–1089. PMLR, 2016.
- Explaining deep neural networks with a polynomial time algorithm for shapley value approximation. In International Conference on Machine Learning, pp. 272–281. PMLR, 2019.
- On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PloS one, 10(7):e0130140, 2015.
- Randall Balestriero. Neural decision trees. ArXiv, abs/1702.07360, 2017.
- Interpretability via model extraction. ArXiv, abs/1706.09773, 2017.
- William Beckner. Inequalities in fourier analysis. Annals of Mathematics, 102(1):159–182, 1975.
- Leo Breiman. Random forests. Machine learning, 45:5–32, 2001.
- Sparks of artificial general intelligence: Early experiments with gpt-4. arXiv preprint arXiv:2303.12712, 2023.
- Emerging properties in self-supervised vision transformers. In Proceedings of the IEEE/CVF international conference on computer vision, pp. 9650–9660, 2021.
- Transformer interpretability beyond attention visualization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 782–791, 2021.
- An interpretable model with globally consistent explanations for credit risk. ArXiv, abs/1811.12615, 2018.
- L-shapley and c-shapley: Efficient model interpretation for structured data. ArXiv, abs/1808.02610, 2019.
- Improving kernelshap: Practical shapley value estimation using linear regression. In International Conference on Artificial Intelligence and Statistics, pp. 3457–3465. PMLR, 2021.
- Understanding global feature contributions with additive importance measures. Advances in Neural Information Processing Systems, 33:17212–17223, 2020.
- Explaining by removing: A unified framework for model explanation. The Journal of Machine Learning Research, 22(209):1–90, 2021.
- Information theoretic inequalities. IEEE Transactions on Information theory, 37(6):1501–1518, 1991.
- What does explainable ai really mean? a new conceptualization of perspectives. ArXiv, abs/1710.00794, 2017.
- Interpretable explanations of black boxes by meaningful perturbation. In Proceedings of the IEEE international conference on computer vision, pp. 3429–3437, 2017.
- Three methods to share joint costs or surplus. Journal of Economic Theory, 87(2):275–312, 1999.
- Asymmetric shapley values: incorporating causal knowledge into model-agnostic explainability. Advances in Neural Information Processing Systems, 33:1229–1239, 2020.
- An axiomatic approach to the concept of interaction among players in cooperative games. International Journal of Game Theory, 28:547–565, 1999.
- Testing booleanity and the uncertainty principle. arXiv preprint arXiv:1204.0944, 2012.
- Axiomatic explanations for visual search, retrieval, and similarity learning. In International Conference on Learning Representations, 2022.
- Hyperparameter optimization: A spectral approach. ArXiv, abs/1706.00764, 2018.
- Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770–778, 2016.
- Isidore I Hirschman. A note on entropy. American journal of mathematics, 79(1):152–156, 1957.
- Explaining explanations: Axiomatic feature interactions for deep networks. J. Mach. Learn. Res., 22:104:1–104:54, 2021.
- Fastshap: Real-time shapley value estimation. In International Conference on Learning Representations, 2021.
- Adam: A method for stochastic optimization. Arxiv, abs/1412.6980, 2015.
- Imagenet classification with deep convolutional neural networks. Communications of the ACM, 60:84–90, 2012.
- Faithful and customizable explanations of black box models. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, pp. 131–138, 2019.
- Microsoft coco: Common objects in context, 2015.
- Constant depth circuits, fourier transform, and learnability. Journal of the ACM (JACM), 40(3):607–620, 1993.
- Zachary C Lipton. The mythos of model interpretability: In machine learning, the concept of interpretability is both important and slippery. Queue, 16(3):31–57, 2018.
- A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems, 2017.
- Consistent individualized feature attribution for tree ensembles. arXiv preprint arXiv:1802.03888, 2018.
- Learning word vectors for sentiment analysis. In Proceedings of the 49th annual meeting of the association for computational linguistics: Human language technologies, pp. 142–150, 2011.
- Explanations of black-box models based on directional feature interactions. In International Conference on Learning Representations, 2022.
- Explaining nonlinear classification decisions with deep taylor decomposition. Pattern recognition, 65:211–222, 2017.
- Ryan O’Donnell. Analysis of boolean functions. Cambridge University Press, 2014.
- Dinov2: Learning robust visual features without supervision. arXiv preprint arXiv:2304.07193, 2023.
- Guillermo Owen. Multilinear extensions of games. Management Science, 18(5-part-2):64–79, 1972.
- Glove: Global vectors for word representation. In Proceedings of the 2014 conference on empirical methods in natural language processing, pp. 1532–1543, 2014.
- "Why should I trust you?": Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1135–1144, 2016.
- Cxplain: Causal explanations for model interpretation under uncertainty. Advances in Neural Information Processing Systems, 32, 2019.
- Grad-cam: Visual explanations from deep networks via gradient-based localization. International Journal of Computer Vision, 128:336–359, 2017.
- Lloyd S. Shapley. Quota solutions of n-person games. Edited by Emil Artin and Marston Morse, pp. 343, 1953.
- Not just a black box: Learning important features through propagating activation differences. CoRR, abs/1605.01713, 2016.
- Learning important features through propagating activation differences. In International Conference on Machine Learning, pp. 3145–3153. PMLR, 2017.
- Deep inside convolutional networks: Visualising image classification models and saliency maps. CoRR, abs/1312.6034, 2014.
- Smoothgrad: removing noise by adding noise. ArXiv, abs/1706.03825, 2017.
- Recursive deep models for semantic compositionality over a sentiment treebank. In Proceedings of the 2013 conference on empirical methods in natural language processing, pp. 1631–1642, 2013.
- Real analysis: measure theory, integration, and Hilbert spaces. Princeton University Press, 2009.
- Explaining prediction models and individual predictions with feature contributions. Knowledge and information systems, 41(3):647–665, 2014.
- Interpretable two-level boolean rule learning for classification. ArXiv, abs/1511.07361, 2015.
- The many shapley values for model explanation. In International Conference on Machine Learning, pp. 9269–9278. PMLR, 2020.
- Axiomatic attribution for deep networks. In International Conference on Machine Learning, pp. 3319–3328. PMLR, 2017.
- The shapley taylor interaction index. In International conference on machine learning, pp. 9259–9268. PMLR, 2020.
- Faith-shap: The faithful shapley interaction index. arXiv preprint arXiv:2203.00870, 2022.
- How does this interaction affect me? interpretable attribution for feature interactions. Advances in Neural Information Processing Systems, 33:6147–6159, 2020.
- Methods and models for interpretable linear classification. ArXiv, abs/1405.4047, 2014.
- Supersparse linear integer models for interpretable classification. ArXiv, abs/1306.6677, 2014.
- Trading interpretability for accuracy: Oblique treed sparse additive models. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1245–1254, 2015a.
- Shapley flow: A graph-based approach to interpreting model predictions. In International Conference on Artificial Intelligence and Statistics, pp. 721–729. PMLR, 2021a.
- Shapley explanation networks. In 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3-7, 2021, 2021b.
- Or’s of and’s for interpretable classification, with application to context-aware recommender systems. ArXiv, abs/1504.07614, 2015b.
- A bayesian framework for learning rule sets for interpretable classification. The Journal of Machine Learning Research, 18(1):2357–2393, 2017.
- Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171, 2022.
- Robert J Weber. Probabilistic values for games. In The Shapley Value: Essays in Honor of Lloyd S. Shapley (A. E. Roth, ed.), 1988.
- Chain of thought prompting elicits reasoning in large language models. arXiv preprint arXiv:2201.11903, 2022.
- Attribution in scale and space. In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020, Seattle, WA, USA, June 13-19, 2020, pp. 9677–9686. Computer Vision Foundation / IEEE, 2020. doi: 10.1109/CVPR42600.2020.00970.
- Deep neural decision trees. ArXiv, abs/1806.06988, 2018.
- Tree of thoughts: Deliberate problem solving with large language models. arXiv preprint arXiv:2305.10601, 2023.
- On explainability of graph neural networks via subgraph explorations. In International Conference on Machine Learning, pp. 12241–12252. PMLR, 2021.
- Interpreting multivariate shapley interactions in dnns. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pp. 10877–10886, 2021.
- Cumulative reasoning with large language models. arXiv preprint arXiv:2308.04371, 2023.