GLIME: General, Stable and Local LIME Explanation (2311.15722v1)
Abstract: As black-box machine learning models grow in complexity and find applications in high-stakes scenarios, it is imperative to provide explanations for their predictions. Although Local Interpretable Model-agnostic Explanations (LIME) [22] is a widely adopted method for understanding model behavior, it is unstable with respect to random seeds [35,24,3] and exhibits low local fidelity (i.e., how well the explanation approximates the model's local behavior) [21,16]. Our study shows that this instability stems from small sample weights, which let the regularization term dominate the fit and slow convergence. Additionally, LIME's sampling neighborhood is non-local and biased towards the reference, resulting in poor local fidelity and sensitivity to the choice of reference. To tackle these challenges, we introduce GLIME, an enhanced framework that extends LIME and unifies several prior methods. Within the GLIME framework, we derive an equivalent formulation of LIME that achieves significantly faster convergence and improved stability. By employing a local and unbiased sampling distribution, GLIME generates explanations with higher local fidelity than LIME, and its explanations are independent of the reference choice. Moreover, GLIME offers users the flexibility to choose a sampling distribution suited to their specific scenarios.
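To make the small-weights instability claim concrete, here is a minimal sketch of the weighted ridge regression at the core of LIME-style explanations. This is not the authors' code: the function names, the toy black box, and the exact kernel (an exponential kernel on the squared distance to the all-ones mask, standing in for LIME's default cosine-distance kernel) are all illustrative assumptions. With a small kernel width, nearly all sample weights vanish, the ridge penalty dominates, and the coefficients drift with the random seed; a wider kernel yields stable estimates.

```python
import numpy as np

def lime_style_surrogate(f, d=10, n_samples=1000, sigma=0.25, alpha=1.0, seed=0):
    """Fit a weighted ridge surrogate to a black box f on binary perturbations.

    f     : black-box scoring function on {0,1}^d feature masks
    sigma : kernel width; sample z gets weight exp(-||1 - z||^2 / sigma^2),
            so a small sigma makes almost all weights vanish and the ridge
            penalty alpha dominates the fit (illustrative stand-in for
            LIME's default kernel).
    """
    rng = np.random.default_rng(seed)
    Z = rng.integers(0, 2, size=(n_samples, d))    # random binary masks
    y = np.array([f(z) for z in Z], dtype=float)   # black-box responses
    dist2 = np.sum((1 - Z) ** 2, axis=1)           # distance to original (all-ones) point
    w = np.exp(-dist2 / sigma**2)                  # exponential kernel weights
    # Weighted ridge solution: (Z^T W Z + alpha I)^{-1} Z^T W y
    A = Z.T @ (w[:, None] * Z) + alpha * np.eye(d)
    return np.linalg.solve(A, Z.T @ (w * y))

def f(z):  # hypothetical black box: sparse linear model with one interaction
    return 3.0 * z[0] - 2.0 * z[3] + 0.5 * z[0] * z[1]

# Small sigma: weights are near zero, coefficients are shrunk toward zero and
# fluctuate across seeds. Large sigma: stable, near the true coefficients.
for sigma in (0.25, 5.0):
    print(f"sigma={sigma}:")
    for seed in range(3):
        print(" ", lime_style_surrogate(f, sigma=sigma, seed=seed).round(2))
```

Under this reading, GLIME's reformulation amounts to sampling from a distribution that absorbs the kernel weights, so every sample contributes equally to the regression and convergence no longer depends on a vanishing weight function.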
- Explainable artificial intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Information Fusion, 58:82–115, 2020.
- How to explain individual classification decisions. The Journal of Machine Learning Research, 11:1803–1831, 2010.
- SAM: The sensitivity of attribution methods to hyperparameters. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8673–8683, 2020.
- Layer-wise relevance propagation for neural networks with local renormalization layers. In Artificial Neural Networks and Machine Learning–ICANN 2016: 25th International Conference on Artificial Neural Networks, Barcelona, Spain, September 6-9, 2016, Proceedings, Part II 25, pages 63–71. Springer, 2016.
- Explaining by removing: A unified framework for model explanation. The Journal of Machine Learning Research, 22(1):9477–9566, 2021.
- Improving performance of deep learning models with axiomatic attribution priors and expected gradients. Nature Machine Intelligence, 3(7):620–631, 2021.
- Interpretable explanations of black boxes by meaningful perturbation. In Proceedings of the IEEE International Conference on Computer Vision, pages 3429–3437, 2017.
- What does LIME really see in images? In International Conference on Machine Learning, pages 3620–3629. PMLR, 2021.
- Damien Garreau and Ulrike von Luxburg. Looking deeper into tabular LIME. arXiv preprint arXiv:2008.11092, 2020.
- Explanation-driven deep learning model for prediction of brain tumour status using mri image data. Frontiers in Genetics, page 448, 2022.
- Interpretable credit application predictions with counterfactual explanations. arXiv preprint arXiv:1811.05245, 2018.
- Which explanation should I choose? A function approximation perspective to characterizing post hoc explanations. arXiv preprint arXiv:2206.01254, 2022.
- Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.
- Missingness bias in model debugging. arXiv preprint arXiv:2204.08945, 2022.
- XRAI: Better attributions through regions. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 4948–4957, 2019.
- Defining locality for surrogates in post-hoc interpretability. arXiv preprint arXiv:1806.07498, 2018.
- Stein’s lemma for the reparameterization trick with exponential family mixtures. arXiv preprint arXiv:1910.13398, 2019.
- Swin Transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 10012–10022, 2021.
- A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems, 30, 2017.
- Christoph Molnar. Interpretable machine learning. Lulu.com, 2020.
- Amir Hossein Akhavan Rahnama and Henrik Boström. A study of data and label shift in the LIME framework. arXiv preprint arXiv:1910.14421, 2019.
- "Why should I trust you?": Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1135–1144, 2016.
- Anchors: High-precision model-agnostic explanations. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 32, 2018.
- ALIME: Autoencoder based approach for local interpretability. In Intelligent Data Engineering and Automated Learning–IDEAL 2019: 20th International Conference, Manchester, UK, November 14–16, 2019, Proceedings, Part I 20, pages 454–463. Springer, 2019.
- A modified perturbed sampling method for local interpretable model-agnostic explanation. arXiv preprint arXiv:2002.07434, 2020.
- Learning important features through propagating activation differences. In International Conference on Machine Learning, pages 3145–3153. PMLR, 2017.
- Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034, 2013.
- SmoothGrad: removing noise by adding noise. arXiv preprint arXiv:1706.03825, 2017.
- Visualizing the impact of feature attribution baselines. Distill, 5(1):e22, 2020.
- Axiomatic attribution for deep networks. In International Conference on Machine Learning, pages 3319–3328. PMLR, 2017.
- Joel A. Tropp. An introduction to matrix concentration inequalities. Foundations and Trends® in Machine Learning, 8(1-2):1–230, 2015.
- Quick shift and kernel methods for mode seeking. In Computer Vision–ECCV 2008: 10th European Conference on Computer Vision, Marseille, France, October 12-18, 2008, Proceedings, Part IV 10, pages 705–718. Springer, 2008.
- OptiLIME: Optimized LIME explanations for diagnostic computer algorithms. arXiv preprint arXiv:2006.05714, 2020.
- Statistical stability indices for LIME: Obtaining reliable explanations for machine learning models. Journal of the Operational Research Society, 73(1):91–101, 2022.
- DLIME: A deterministic local interpretable model-agnostic explanations approach for computer-aided diagnosis systems. arXiv preprint arXiv:1906.10263, 2019.
- Visualizing and understanding convolutional networks. In Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part I 13, pages 818–833. Springer, 2014.
- Consistent and truthful interpretation with fourier analysis. arXiv preprint arXiv:2210.17426, 2022.
- " why should you trust my explanation?" understanding uncertainty in lime explanations. arXiv preprint arXiv:1904.12991, 2019.
- Learning deep features for discriminative localization. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 2921–2929, 2016.
- S-lime: Stabilized-lime for model explanation. In Proceedings of the 27th ACM SIGKDD conference on knowledge discovery & data mining, pages 2429–2438, 2021.