Improving Intervention Efficacy via Concept Realignment in Concept Bottleneck Models (2405.01531v2)

Published 2 May 2024 in cs.LG, cs.AI, and cs.CV

Abstract: Concept Bottleneck Models (CBMs) ground image classification on human-understandable concepts to allow for interpretable model decisions. Crucially, the CBM design inherently allows for human interventions, in which expert users are given the ability to modify potentially misaligned concept choices to influence the decision behavior of the model in an interpretable fashion. However, existing approaches often require numerous human interventions per image to achieve strong performance, posing practical challenges in scenarios where obtaining human feedback is expensive. In this paper, we find that this is largely driven by an independent treatment of concepts during intervention, in which correcting one concept does not influence how the model uses the others in its final decision. To address this issue, we introduce a trainable concept intervention realignment module, which leverages concept relations to realign concept assignments post-intervention. Across standard, real-world benchmarks, we find that concept realignment significantly improves intervention efficacy, markedly reducing the number of interventions needed to reach a target classification performance or concept prediction accuracy. In addition, it integrates easily into existing concept-based architectures without requiring changes to the models themselves. This reduced cost of human-model collaboration is crucial to enhancing the feasibility of CBMs in resource-constrained environments. Our code is available at: https://github.com/ExplainableML/concept_realignment.
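The intervention-plus-realignment idea from the abstract can be sketched roughly as follows. This is an illustrative toy, not the paper's trainable module: the concept relation matrix `A`, the logit-space update, and all weights below are hypothetical stand-ins, used only to show how correcting one concept can propagate to the others before the label head is applied.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy CBM pieces (all weights random, for illustration only):
# c_hat are predicted concept scores in [0, 1]; W_y is the label head.
K, n_classes = 5, 3
c_hat = rng.random(K)
W_y = rng.standard_normal((n_classes, K))

# Hypothetical concept relation matrix: after a human fixes concept k,
# the other concept scores are re-estimated using their relation to it.
A = rng.standard_normal((K, K)) * 0.1

def intervene_and_realign(c, k, true_value, A):
    """Set concept k to the expert-provided value, then propagate the
    correction to the remaining concepts via A. The update is applied
    in logit space so realigned scores stay in [0, 1]."""
    c = c.copy()
    shift = A[:, k] * (true_value - c[k])          # shift driven by the correction
    logit = np.log(c / (1.0 - c + 1e-8) + 1e-8)    # score -> logit
    realigned = 1.0 / (1.0 + np.exp(-(logit + shift)))
    realigned[k] = true_value                      # intervened concept stays fixed
    return realigned

# One intervention: an expert asserts concept 2 is actually present.
c_new = intervene_and_realign(c_hat, k=2, true_value=1.0, A=A)
logits = W_y @ c_new                               # final class prediction
```

In a plain CBM, only `c_new[2]` would change; the realignment step additionally nudges the unintervened concepts, which is what lets a single intervention do more work.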

Authors (4)
  1. Nishad Singhi (6 papers)
  2. Jae Myung Kim (14 papers)
  3. Karsten Roth (36 papers)
  4. Zeynep Akata (144 papers)
Citations (1)