LIPEx - Locally Interpretable Probabilistic Explanations - To Look Beyond The True Class (2310.04856v2)
Abstract: In this work, we instantiate a novel perturbation-based multi-class explanation framework, LIPEx (Locally Interpretable Probabilistic Explanation). We demonstrate that LIPEx not only locally replicates the probability distributions output by widely used complex classification models but also provides insight into how every feature deemed important affects the prediction probability for each of the possible classes. We achieve this by defining the explanation as a matrix obtained via regression with respect to the Hellinger distance in the space of probability distributions. Ablation tests on text and image data show that LIPEx-guided removal of important features changes the underlying model's predictions more than similar tests based on other saliency-based or feature-importance-based Explainable AI (XAI) methods. We also show that, compared to LIME, LIPEx is more data efficient, requiring fewer perturbations of the data to obtain a reliable explanation. This data efficiency manifests as LIPEx computing its explanation matrix around 53% faster than all-class LIME in classification experiments with text data.
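The central object in the abstract is an explanation matrix fit by regressing a surrogate's output distribution toward the black-box model's class probabilities under the Hellinger distance. Below is a minimal sketch of that idea in NumPy; it assumes binary perturbation masks as the interpretable representation and fits a linear-softmax surrogate by gradient descent on the mean squared Hellinger distance. Names such as `fit_lipex_like_surrogate` are illustrative and do not come from the LIPEx code; the paper's exact regression setup, sample weighting, and feature selection are not reproduced here.

```python
import numpy as np

def hellinger_sq(p, q):
    # Squared Hellinger distance between probability vectors (rows):
    # H^2(p, q) = 0.5 * sum_c (sqrt(p_c) - sqrt(q_c))^2
    return 0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2, axis=-1)

def fit_lipex_like_surrogate(masks, target_probs, n_steps=500, lr=0.5):
    """Fit a (classes x features) matrix W so that softmax(masks @ W.T)
    approximates the black-box model's class probabilities on the
    perturbed samples, by gradient descent on the mean squared
    Hellinger distance. `masks` are binary interpretable features
    (1 = feature kept in the perturbation); `target_probs` are the
    model's output distributions on the corresponding perturbations."""
    n, d = masks.shape
    k = target_probs.shape[1]
    W = np.zeros((k, d))
    for _ in range(n_steps):
        logits = masks @ W.T
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        q = np.exp(logits)
        q /= q.sum(axis=1, keepdims=True)             # surrogate distribution
        # d(H^2)/dq_c = (sqrt(q_c) - sqrt(p_c)) / (2 sqrt(q_c))
        dH_dq = (np.sqrt(q) - np.sqrt(target_probs)) / (2.0 * np.sqrt(q) + 1e-12)
        # backpropagate through the softmax, then average over samples
        dL_dlogits = q * (dH_dq - np.sum(dH_dq * q, axis=1, keepdims=True))
        W -= lr * (dL_dlogits.T @ masks) / n
    return W  # explanation matrix: one row of feature weights per class

# Toy usage: 200 perturbations over 6 interpretable features, 3 classes.
rng = np.random.default_rng(0)
masks = rng.integers(0, 2, size=(200, 6)).astype(float)
logits = masks @ rng.normal(size=(6, 3))              # stand-in for a black box
target = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
W = fit_lipex_like_surrogate(masks, target)
print(W.shape)  # (3, 6): per-class, per-feature contributions
```

In this sketch, row c of the returned matrix plays the role of per-feature scores for class c, so each feature's effect can be read off for every class rather than only for the predicted one, which is the multi-class behavior the abstract describes.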
- Sanity checks for saliency maps. Advances in neural information processing systems, 31, 2018.
- Towards the unification and robustness of perturbation and gradient based explanations. In International Conference on Machine Learning, pp. 110–119. PMLR, 2021.
- A diagnostic study of explainability techniques for text classification. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 3256–3274, Online, November 2020. Association for Computational Linguistics. doi: 10.18653/v1/2020.emnlp-main.263. URL https://aclanthology.org/2020.emnlp-main.263.
- Accuracy Improvements in Linguistic Fuzzy Modeling. 2003. ISBN 978-3-642-05703-8. doi: 10.1007/978-3-540-37058-1.
- Variational information pursuit for interpretable predictions. arXiv preprint arXiv:2302.02876, 2023.
- Explainable AI tools for legal reasoning about cases: A study on the European Court of Human Rights. Artificial Intelligence, 317:103861, 2023.
- Jonathan Crabbé and Mihaela van der Schaar. Label-free explainability for unsupervised models. In Kamalika Chaudhuri, Stefanie Jegelka, Le Song, Csaba Szepesvári, Gang Niu, and Sivan Sabato (eds.), International Conference on Machine Learning, ICML 2022, 17-23 July 2022, Baltimore, Maryland, USA, volume 162 of Proceedings of Machine Learning Research, pp. 4391–4420. PMLR, 2022. URL https://proceedings.mlr.press/v162/crabbe22a.html.
- Debugging machine learning models. 2016.
- Explanations based on the missing: Towards contrastive explanations with pertinent negatives. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, NIPS’18, pp. 590–601. Curran Associates Inc., 2018.
- Towards a rigorous science of interpretable machine learning, 2017.
- Poly-CAM: High resolution class activation map for convolutional neural networks, 2022. URL https://openreview.net/forum?id=qnm-2v-baW.
- Visualizing higher-layer features of a deep network. University of Montreal, 1341(3):1, 2009.
- Understanding deep networks via extremal perturbations and smooth masks. In 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2950–2958, 2019. doi: 10.1109/ICCV.2019.00304.
- Explaining the explainer: A first theoretical analysis of LIME. In International conference on artificial intelligence and statistics, pp. 1287–1296. PMLR, 2020.
- A simple technique to enable saliency methods to pass the sanity checks, 2020. URL https://openreview.net/forum?id=BJeGZxrFvS.
- New definitions and evaluations for saliency methods: Staying intrinsic, complete and sound. In S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh (eds.), Advances in Neural Information Processing Systems, volume 35, pp. 33120–33133. Curran Associates, Inc., 2022. URL https://proceedings.neurips.cc/paper_files/paper/2022/file/d6383e7643415842b48a5077a1b09c98-Paper-Conference.pdf.
- Contrastive explanations for model interpretability, 2021.
- XRAI: Better attributions through regions. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4948–4957, 2019.
- Guided integrated gradients: An adaptive path method for removing noise. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 5050–5058, 2021.
- Detecting climate signals using explainable AI with single-forcing large ensembles. 2021. doi: 10.1002/essoar.10505762.2.
- Interpretable decision sets: A joint framework for description and prediction. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’16, pp. 1675–1684, New York, NY, USA, 2016. Association for Computing Machinery. ISBN 9781450342322. doi: 10.1145/2939672.2939874. URL https://doi.org/10.1145/2939672.2939874.
- Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model. The Annals of Applied Statistics, 9(3):1350 – 1371, 2015.
- PDExplain: Contextual modeling of PDEs in the wild, 2023.
- A unified approach to interpreting model predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS’17, pp. 4768–4777, Red Hook, NY, USA, 2017. Curran Associates Inc.
- Listwise explanations for ranking models using multiple explainers. In Jaap Kamps, Lorraine Goeuriot, Fabio Crestani, Maria Maistro, Hideo Joho, Brian Davis, Cathal Gurrin, Udo Kruschwitz, and Annalina Caputo (eds.), Advances in Information Retrieval - 45th European Conference on Information Retrieval, ECIR 2023, Dublin, Ireland, April 2-6, 2023, Proceedings, Part I, volume 13980 of Lecture Notes in Computer Science, pp. 653–668. Springer, 2023.
- Explainable AI for high energy physics, 2022.
- Visualizing deep networks by optimizing with integrated gradients, 2019.
- Beyond individualized recourse: Interpretable and interactive summaries of actionable recourses. In H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin (eds.), Advances in Neural Information Processing Systems, volume 33, pp. 12187–12198. Curran Associates, Inc., 2020.
- "why should i trust you?" explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp. 1135–1144, 2016.
- Anchors: High-precision model-agnostic explanations. AAAI’18/IAAI’18/EAAI’18. AAAI Press, 2018.
- Grad-CAM: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE international conference on computer vision, pp. 618–626, 2017.
- Not just a black box: Learning important features through propagating activation differences. arXiv preprint arXiv:1605.01713, 2016.
- Deep inside convolutional networks: Visualising image classification models and saliency maps, 2014.
- Fooling LIME and SHAP: Adversarial attacks on post hoc explanation methods. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, AIES ’20, pp. 180–186, New York, NY, USA, 2020. Association for Computing Machinery.
- SmoothGrad: removing noise by adding noise. arXiv preprint arXiv:1706.03825, 2017.
- LIMEtree: Consistent and faithful surrogate explanations of multiple classes. arXiv preprint arXiv:2005.01427, 2020.
- Striving for simplicity: The all convolutional net. arXiv preprint arXiv:1412.6806, 2014.
- Axiomatic attribution for deep networks. In International conference on machine learning, pp. 3319–3328. PMLR, 2017.
- Actionable recourse in linear classification. In Proceedings of the Conference on Fairness, Accountability, and Transparency, FAT* ’19, pp. 10–19, New York, NY, USA, 2019. Association for Computing Machinery. ISBN 9781450361255.
- Counterfactual explanations without opening the black box: Automated decisions and the GDPR. CoRR, abs/1711.00399, 2017.
- Isolating salient variations of interest in single-cell data with contrastiveVI. Nature Methods, pp. 1–10, 2023.
- Attribution in scale and space. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9680–9689, 2020.
- Interpreting language models with contrastive explanations. In Conference on Empirical Methods in Natural Language Processing, 2022. URL https://api.semanticscholar.org/CorpusID:247011700.
- Visualizing and understanding convolutional networks. In Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part I 13, pp. 818–833. Springer, 2014.