Guarantee Regions for Local Explanations (2402.12737v1)
Abstract: Interpretability methods that utilise local surrogate models (e.g. LIME) are very good at describing the behaviour of the predictive model at a point of interest, but they are not guaranteed to extrapolate to the local region surrounding that point. Moreover, overfitting to the local curvature of the predictive model and malicious tampering can significantly limit extrapolation. We propose an anchor-based algorithm for identifying regions in which local explanations are guaranteed to be correct, by explicitly describing the intervals over which each input feature can be trusted. Our method produces an interpretable, feature-aligned box in which the prediction of the local surrogate model is guaranteed to match that of the predictive model. We demonstrate that our algorithm can find explanations with larger guarantee regions that better cover the data manifold than existing baselines. We also show how our method can identify misleading local explanations with significantly poorer guarantee regions.
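To make the notion of a guarantee region concrete, here is a minimal sketch of how one might check, by Monte-Carlo sampling, whether a local surrogate agrees with the black-box model inside a candidate feature-aligned box. This is an illustrative stand-in, not the paper's anchor-based algorithm: the function names (`model_predict`, `surrogate_predict`) and the sampling-based check are assumptions of ours, whereas the paper certifies agreement over the box directly rather than estimating it.

```python
import numpy as np

def agreement_in_box(model_predict, surrogate_predict, box_lo, box_hi,
                     n_samples=10_000, seed=0):
    """Estimate how often a local surrogate's predictions match the
    black-box model's inside the axis-aligned box [box_lo, box_hi].

    Sketch only: a true guarantee region requires agreement at *every*
    point in the box, which this Monte-Carlo check can only approximate.
    """
    box_lo, box_hi = np.asarray(box_lo), np.asarray(box_hi)
    rng = np.random.default_rng(seed)
    # Sample points uniformly from the feature-aligned box.
    X = rng.uniform(box_lo, box_hi, size=(n_samples, box_lo.shape[0]))
    # Fraction of sampled points on which the two models agree.
    return float(np.mean(model_predict(X) == surrogate_predict(X)))
```

Under these assumptions, one could shrink or grow the box one feature at a time until the estimated agreement reaches 1.0, yielding a per-feature trusted interval of the kind the abstract describes; a rate of 1.0 is necessary (but not sufficient) evidence that the box is a guarantee region.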
- A note on finding a maximum empty rectangle. Discrete Applied Mathematics, 13(1):87–91.
- Maximum-weight planar boxes in O(n²) time (and better). Information Processing Letters, 114(8):437–445.
- Interpretable regional descriptors: Hyperbox-based local explanations. arXiv preprint arXiv:2305.02780.
- The maximum box problem and its application to data analysis. Computational Optimization and Applications, 23(3):285–298.
- Abduction-based explanations for machine learning models. In Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, AAAI’19/IAAI’19/EAAI’19. AAAI Press.
- On validating, repairing and refining heuristic ML explanations. arXiv preprint arXiv:1907.02509.
- Understanding black-box predictions via influence functions. In International Conference on Machine Learning, pages 1885–1894. PMLR.
- Planar case of the maximum box and related problems. In CCCG, volume 3, pages 11–13.
- "Why should I trust you?": Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1135–1144.
- Anchors: High-precision model-agnostic explanations. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 32.
- Grad-CAM: Why did you say that? arXiv preprint arXiv:1611.07450.
- MAIRE: A model-agnostic interpretable rule extraction procedure for explaining classifiers. In Machine Learning and Knowledge Extraction: 5th IFIP TC 5, TC 12, WG 8.4, WG 8.9, WG 12.9 International Cross-Domain Conference, CD-MAKE 2021, Virtual Event, August 17–20, 2021, Proceedings 5, pages 329–349. Springer.
- A symbolic approach to explaining Bayesian network classifiers. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI-18, pages 5103–5111. International Joint Conferences on Artificial Intelligence Organization.
- Programs as black-box explanations. arXiv preprint arXiv:1611.07579.
- Fooling LIME and SHAP: Adversarial attacks on post hoc explanation methods. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, pages 180–186.
- Visualizing data using t-SNE. Journal of Machine Learning Research, 9(11).
- Marton Havasi
- Sonali Parbhoo
- Finale Doshi-Velez