
A survey on Concept-based Approaches For Model Improvement (2403.14566v2)

Published 21 Mar 2024 in cs.AI and cs.LG

Abstract: The focus of recent research has shifted from merely improving the metrics-based performance of Deep Neural Networks (DNNs) to making DNNs more interpretable to humans. The field of eXplainable Artificial Intelligence (XAI) has seen a range of techniques, including saliency-based and concept-based approaches. The latter explain a model's decisions in simple, human-understandable terms called concepts, which are known to be the thinking ground of humans. Explanations in terms of concepts enable detecting spurious correlations, inherent biases, or Clever Hans behavior. With the advent of concept-based explanations, a range of concept representation methods and automatic concept discovery algorithms have been introduced. Some recent works also use concepts to improve models in terms of interpretability and generalization. We provide a systematic review and taxonomy of concept representations and their discovery algorithms in DNNs, specifically in vision. We also detail the concept-based model improvement literature, marking the first comprehensive survey of these methods.
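
To ground what a concept representation can look like in practice, the sketch below illustrates one representative technique in the family of methods such surveys cover: a TCAV-style concept activation vector (CAV), where a linear probe is trained to separate intermediate-layer activations of concept examples from random examples, and the probe's weight vector serves as the concept direction. This is a minimal, hedged sketch, not code from the surveyed works; the activation arrays, layer size, and gradient vector below are hypothetical placeholders for quantities that would come from a real model.

```python
# Minimal, illustrative sketch of a TCAV-style concept activation vector (CAV).
# Assumes layer activations for concept images and random images are already
# available; the random arrays below are stand-ins for those activations.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Placeholder activations from an intermediate layer (n_examples x n_units).
concept_acts = rng.normal(loc=1.0, size=(100, 512))  # e.g. images showing "stripes"
random_acts = rng.normal(loc=0.0, size=(100, 512))   # unrelated images

# Train a linear probe to separate concept vs. random activations.
X = np.vstack([concept_acts, random_acts])
y = np.array([1] * len(concept_acts) + [0] * len(random_acts))
probe = LogisticRegression(max_iter=1000).fit(X, y)

# The probe's weight vector (normal to the decision boundary) is the CAV.
cav = probe.coef_[0] / np.linalg.norm(probe.coef_[0])

# Concept sensitivity of a prediction: the gradient of a class logit with
# respect to the layer activations, projected onto the CAV. In a real model
# this gradient would come from autodiff; here it is a placeholder.
grad_logit = rng.normal(size=512)
sensitivity = float(np.dot(grad_logit, cav))
print(f"CAV norm: {np.linalg.norm(cav):.3f}, concept sensitivity: {sensitivity:+.3f}")
```

In TCAV-style testing, the sign of such directional derivatives is aggregated over many inputs of a class to estimate how strongly the concept influences predictions for that class.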

Authors (2)
  1. Avani Gupta (5 papers)
  2. P J Narayanan (8 papers)
Citations (4)