Four Axiomatic Characterizations of the Integrated Gradients Attribution Method (2306.13753v1)
Published 23 Jun 2023 in cs.LG
Abstract: Deep neural networks have produced significant progress among machine learning models in terms of accuracy and functionality, but their inner workings are still largely unknown. Attribution methods seek to shine a light on these "black box" models by indicating how much each input contributed to a model's outputs. The Integrated Gradients (IG) method is a state-of-the-art baseline attribution method in the axiomatic vein, meaning it is designed to conform to particular principles of attribution. We present four axiomatic characterizations of IG, establishing IG as the unique method that satisfies different sets of axioms among a class of attribution methods.
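For readers unfamiliar with IG, the following is a minimal sketch of the standard definition, not code from the paper. It attributes to input i the quantity (x_i - x'_i) times the integral of the partial derivative of F with respect to x_i along the straight line from a baseline x' to the input x, here approximated with a midpoint Riemann sum. The toy model `f`, its hand-written gradient `grad_f`, the all-zeros baseline, and the step count are all assumptions chosen for illustration.

```python
from typing import Callable, List, Sequence


def integrated_gradients(
    grad_f: Callable[[Sequence[float]], List[float]],
    x: Sequence[float],
    baseline: Sequence[float],
    steps: int = 200,
) -> List[float]:
    """Approximate IG attributions with a midpoint Riemann sum.

    grad_f returns the gradient of the model at a point; x is the input
    being explained and baseline is the reference input x'.
    """
    totals = [0.0] * len(x)
    for k in range(steps):
        t = (k + 0.5) / steps  # midpoint of the k-th sub-interval of [0, 1]
        # Point on the straight-line path from the baseline to the input.
        point = [b + t * (xi - b) for xi, b in zip(x, baseline)]
        grad = grad_f(point)
        for i, g in enumerate(grad):
            totals[i] += g
    # Average gradient along the path, scaled by the input displacement.
    return [(xi - b) * s / steps for xi, b, s in zip(x, baseline, totals)]


# Hypothetical toy model (an assumption for illustration): F(x1, x2) = x1*x2 + x1^2.
def f(p: Sequence[float]) -> float:
    return p[0] * p[1] + p[0] ** 2


def grad_f(p: Sequence[float]) -> List[float]:
    return [p[1] + 2.0 * p[0], p[0]]


x = [1.0, 2.0]
baseline = [0.0, 0.0]
attributions = integrated_gradients(grad_f, x, baseline)

# Completeness: the attributions should sum to F(x) - F(baseline).
completeness_gap = abs(sum(attributions) - (f(x) - f(baseline)))
print(attributions, completeness_gap)
```

For this toy model the gradient is linear along the path, so the midpoint rule recovers the attributions (about 2 for the first input and 1 for the second) up to floating-point error, and they sum to F(x) - F(x') = 3. That summation property is the completeness axiom, one of the kinds of principles the abstract refers to; for a real network the gradient would come from automatic differentiation and the sum would only be approximate.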