A Self-explaining Neural Architecture for Generalizable Concept Learning (2405.00349v2)
Abstract: With the wide proliferation of Deep Neural Networks in high-stakes applications, there is a growing demand for explainability behind their decision-making process. Concept learning models attempt to learn high-level 'concepts' - abstract entities that align with human understanding, and thus provide interpretability to DNN architectures. However, in this paper, we demonstrate that present SOTA concept learning approaches suffer from two major problems - lack of concept fidelity, wherein the models fail to learn consistent concepts among similar classes, and limited concept interoperability, wherein the models fail to generalize learned concepts to new domains for the same task. Keeping these in mind, we propose a novel self-explaining architecture for concept learning across domains which - i) incorporates a new concept saliency network for representative concept selection, ii) utilizes contrastive learning to capture representative domain-invariant concepts, and iii) uses a novel prototype-based concept grounding regularization to improve concept alignment across domains. We demonstrate the efficacy of our proposed approach over current SOTA concept learning approaches on four widely used real-world datasets. Empirical results show that our method improves both concept fidelity, measured through concept overlap, and concept interoperability, measured through domain adaptation performance.
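The abstract only names the three components of the architecture; as a rough illustration of how such a training objective might be assembled, the sketch below combines a standard classification loss with a SimCLR-style contrastive term over concept vectors from two domain-shifted views and a prototype-based grounding term. This is a minimal sketch under assumptions: the function names, the class-prototype lookup, and the weights `lambda_c` / `lambda_p` are hypothetical and are not taken from the paper.

```python
import torch
import torch.nn.functional as F

def contrastive_concept_loss(concepts_a, concepts_b, temperature=0.1):
    """InfoNCE-style loss over concept vectors from two views of the same
    batch (e.g., source and augmented/target-styled inputs). Hypothetical."""
    z_a = F.normalize(concepts_a, dim=1)
    z_b = F.normalize(concepts_b, dim=1)
    logits = z_a @ z_b.t() / temperature              # [B, B] pairwise similarities
    targets = torch.arange(z_a.size(0), device=z_a.device)
    return F.cross_entropy(logits, targets)           # matching pairs on the diagonal

def prototype_grounding_loss(concepts, labels, prototypes):
    """Pull each sample's concept vector toward its class prototype -
    one possible reading of 'prototype-based concept grounding'."""
    return F.mse_loss(concepts, prototypes[labels])

def total_loss(logits, labels, concepts_a, concepts_b, prototypes,
               lambda_c=0.5, lambda_p=0.1):
    """Task loss plus the two concept regularizers; weights are illustrative."""
    task = F.cross_entropy(logits, labels)
    contrast = contrastive_concept_loss(concepts_a, concepts_b)
    ground = prototype_grounding_loss(concepts_a, labels, prototypes)
    return task + lambda_c * contrast + lambda_p * ground
```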
- Sanchit Sinha
- Guangzhi Xiong
- Aidong Zhang