Editable Concept Bottleneck Models (2405.15476v3)

Published 24 May 2024 in cs.LG, cs.AI, and cs.CV

Abstract: Concept Bottleneck Models (CBMs) have garnered much attention for their ability to elucidate the prediction process through a human-understandable concept layer. However, most previous studies focused on cases where the data, including concepts, are clean. In many scenarios, we often need to remove or insert training data or concepts in trained CBMs for reasons such as privacy concerns, data mislabelling, spurious concepts, and concept annotation errors. Thus, deriving efficiently editable CBMs without retraining from scratch remains a challenge, particularly in large-scale applications. To address these challenges, we propose Editable Concept Bottleneck Models (ECBMs). Specifically, ECBMs support three different levels of data removal: concept-label-level, concept-level, and data-level. ECBMs enjoy mathematically rigorous closed-form approximations derived from influence functions that obviate the need for retraining. Experimental results demonstrate the efficiency and adaptability of our ECBMs, affirming their practical value in CBMs.
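The closed-form updates in ECBMs build on influence functions, which estimate how model parameters would change if a training point were removed, without retraining. A minimal sketch of the classic influence-function leave-one-out update (in the style of Koh & Liang, 2017) for an L2-regularized logistic model is below; this illustrates the general technique only, not the authors' exact ECBM formulas, and all function names here are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logreg(X, y, lam=0.1, iters=50):
    """Fit L2-regularized logistic regression (labels in {-1,+1}) by Newton's method."""
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(iters):
        s = sigmoid(-y * (X @ theta))                  # per-example sigma(-y * x.theta)
        grad = -(X * (y * s)[:, None]).mean(0) + lam * theta
        w = s * (1.0 - s)                              # per-example Hessian weights
        H = (X * w[:, None]).T @ X / n + lam * np.eye(d)
        theta -= np.linalg.solve(H, grad)              # Newton step on the mean loss
    return theta

def remove_point_influence(theta, X, y, idx, lam=0.1):
    """Closed-form leave-one-out estimate: theta_{-z} ~ theta + H^{-1} grad_z / n.

    H is the Hessian of the full regularized empirical risk at theta, and
    grad_z is the gradient of the removed point's loss; no retraining needed.
    """
    n, d = X.shape
    s = sigmoid(-y * (X @ theta))
    w = s * (1.0 - s)
    H = (X * w[:, None]).T @ X / n + lam * np.eye(d)
    g = -y[idx] * s[idx] * X[idx]                      # gradient of the removed point's loss
    return theta + np.linalg.solve(H, g) / n
```

ECBMs extend this idea through the two-stage concept-then-label structure of a CBM, with separate closed-form updates for concept-label-level, concept-level, and data-level removal.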

