Eliminating Information Leakage in Hard Concept Bottleneck Models with Supervised, Hierarchical Concept Learning (2402.05945v1)
Abstract: Concept Bottleneck Models (CBMs) aim to deliver interpretable and intervenable predictions by bridging features and labels with human-understandable concepts. While recent CBMs show promising potential, they suffer from information leakage, where unintended information beyond the concepts (whether concepts are represented as probabilities or binary states) is leaked to the subsequent label prediction. Consequently, distinct classes are falsely classified via indistinguishable concepts, undermining the interpretation and intervention of CBMs. This paper alleviates the information leakage issue by introducing label supervision in concept prediction and constructing a hierarchical concept set. Accordingly, we propose a new paradigm of CBMs, namely SupCBM, which achieves label prediction via predicted concepts and a deliberately designed intervention matrix. SupCBM focuses on concepts that are most relevant to the predicted label and only distinguishes classes when different concepts are presented. Our evaluations show that SupCBM outperforms SOTA CBMs over diverse datasets. It also exhibits better generality across different backbone models. With proper quantification of information leakage in different CBMs, we demonstrate that SupCBM significantly reduces information leakage.
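The abstract's core mechanism can be illustrated with a minimal sketch: a hard concept bottleneck binarizes concept activations (so only concept presence/absence reaches the label predictor, blocking leakage through soft probabilities), and labels are scored through a class-by-concept intervention matrix. All names, dimensions, and the random intervention matrix below are illustrative assumptions, not the paper's actual construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not from the paper).
n_features, n_concepts, n_classes = 8, 5, 3

# Concept predictor: features -> concept logits (stand-in for a trained head).
W_concept = rng.normal(size=(n_features, n_concepts))

# Intervention matrix M[k, c] = 1 if concept c is deemed relevant to class k.
# SupCBM constructs this deliberately; a random binary matrix here only
# illustrates the mechanism.
M = (rng.random((n_classes, n_concepts)) > 0.5).astype(float)

def predict(x):
    # Hard bottleneck: binarize concept activations so only the
    # presence/absence of each concept reaches the label predictor.
    concepts = (x @ W_concept > 0).astype(float)
    # Score each class by how many of its relevant concepts are present;
    # classes sharing identical concept sets remain indistinguishable.
    scores = M @ concepts
    return concepts, int(np.argmax(scores))

x = rng.normal(size=n_features)
concepts, label = predict(x)
print("concepts:", concepts, "label:", label)
```

Because the label predictor sees only binary concepts, any class distinction must be traceable to a differing concept, which is the property the paper uses to curb information leakage.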
- Ao Sun
- Yuanyuan Yuan
- Pingchuan Ma
- Shuai Wang