Estimation of Concept Explanations Should be Uncertainty Aware
Abstract: Model explanations can be valuable for interpreting and debugging predictive models. We study a specific kind called concept explanations, where the goal is to interpret a model using human-understandable concepts. Although popular for being easy to interpret, concept explanations are known to be noisy. We begin by identifying the sources of uncertainty in the estimation pipeline that lead to such noise. We then propose an uncertainty-aware Bayesian estimation method to address these issues, which readily improves the quality of explanations. Through theoretical analysis and empirical evaluation, we show that explanations computed by our method are robust to train-time choices while also being label-efficient. Furthermore, in an evaluation with real datasets and off-the-shelf models, our method recovers relevant concepts from a bank of thousands, demonstrating its scalability. We believe the improved quality of uncertainty-aware concept explanations makes them a strong candidate for more reliable model interpretation. We release our code at https://github.com/vps-anonconfs/uace.
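To make the idea of "uncertainty-aware estimation" concrete, here is a minimal sketch, not the paper's actual estimator: it assumes concept explanations are computed as linear weights mapping concept activations to a model's output, and fits those weights with Bayesian linear regression (a conjugate Gaussian prior with known noise variance), so every concept importance comes with a posterior variance quantifying its uncertainty. The function name, data, and hyperparameters are illustrative assumptions.

```python
# Illustrative sketch only (not the paper's method): Bayesian linear
# regression over concept activations, yielding a posterior mean and
# variance for each concept's importance weight.
import numpy as np

def bayesian_concept_weights(C, y, prior_var=1.0, noise_var=0.1):
    """C: (n, k) concept activations; y: (n,) model outputs to explain.
    Returns the posterior mean and per-concept posterior variance."""
    n, k = C.shape
    # Posterior precision = prior precision + data precision.
    A = np.eye(k) / prior_var + C.T @ C / noise_var
    cov = np.linalg.inv(A)
    mean = cov @ C.T @ y / noise_var
    return mean, np.diag(cov)

# Synthetic example: 200 inputs, 5 candidate concepts, 3 of them relevant.
rng = np.random.default_rng(0)
C = rng.normal(size=(200, 5))
true_w = np.array([2.0, 0.0, -1.0, 0.0, 0.5])
y = C @ true_w + rng.normal(scale=0.3, size=200)

mean, var = bayesian_concept_weights(C, y)
for i, (m, v) in enumerate(zip(mean, var)):
    print(f"concept {i}: weight {m:+.2f} +/- {np.sqrt(v):.2f}")
```

Under this reading, a concept whose posterior credible interval covers zero can be flagged as unreliable rather than reported as relevant, which is one way uncertainty awareness can make explanations more robust to estimation noise.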