Dividing and Conquering a BlackBox to a Mixture of Interpretable Models: Route, Interpret, Repeat (2307.05350v2)

Published 7 Jul 2023 in cs.LG, cs.CV, and cs.CY

Abstract: ML model design either starts with an interpretable model or with a Blackbox that is explained post hoc. Blackbox models are flexible but difficult to explain, while interpretable models are inherently explainable. Yet interpretable models require extensive ML knowledge, tend to be less flexible, and often underperform their Blackbox variants. This paper aims to blur the distinction between post hoc explanation of a Blackbox and construction of interpretable models. Beginning with a Blackbox, we iteratively carve out a mixture of interpretable experts (MoIE) and a residual network. Each interpretable model specializes in a subset of samples and explains them using First-Order Logic (FOL), providing basic reasoning on concepts from the Blackbox. We route the remaining samples through a flexible residual. We repeat the method on the residual network until the interpretable models explain the desired proportion of data. Our extensive experiments show that our route, interpret, and repeat approach (1) identifies a diverse set of instance-specific concepts with high concept completeness via MoIE without compromising performance, (2) identifies the relatively "harder" samples to explain via residuals, (3) outperforms interpretable-by-design models by significant margins during test-time interventions, and (4) fixes shortcuts learned by the original Blackbox. The code for MoIE is publicly available at https://github.com/batmanlab/ICML-2023-Route-interpret-repeat
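To make the route-interpret-repeat procedure concrete, the following is a minimal Python sketch of the loop the abstract describes. It is illustrative only, not the authors' implementation: the names (Expert, fit_selector, fit_expert, route_interpret_repeat), the toy routing rule, and the placeholder FOL string are all hypothetical assumptions; the actual MoIE code is in the repository linked above.

```python
# Minimal sketch of the route-interpret-repeat loop (illustrative only; not the
# authors' implementation -- see the official MoIE repository linked above).
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Expert:
    """Interpretable expert: predicts from concept vectors and exposes a FOL rule."""
    rule: str                                   # placeholder FOL explanation
    predict: Callable[[Sequence[float]], int]


def fit_selector(concepts, labels) -> Callable[[Sequence[float]], bool]:
    """Hypothetical stand-in for the learned router: decides which residual
    samples the new expert should cover (here: first concept is active)."""
    return lambda c: c[0] > 0.5


def fit_expert(concepts, labels) -> Expert:
    """Hypothetical stand-in for fitting an interpretable model on the routed
    subset and distilling a FOL rule from it."""
    return Expert(rule="class_1 <- concept_0 AND NOT concept_3",
                  predict=lambda c: int(c[0] > 0.5))


def route_interpret_repeat(concepts, labels, coverage_target=0.9, max_iters=5):
    """Carve experts out of the residual until they cover `coverage_target`."""
    experts, selectors = [], []
    residual_idx = list(range(len(concepts)))
    n_total = len(concepts)

    for _ in range(max_iters):
        covered = n_total - len(residual_idx)
        if not residual_idx or covered / n_total >= coverage_target:
            break
        sub_c = [concepts[i] for i in residual_idx]
        sub_y = [labels[i] for i in residual_idx]
        selector = fit_selector(sub_c, sub_y)   # route
        expert = fit_expert(sub_c, sub_y)       # interpret
        selectors.append(selector)
        experts.append(expert)
        # keep only samples the new expert did NOT claim, then repeat on them
        residual_idx = [i for i in residual_idx if not selector(concepts[i])]

    return experts, selectors, residual_idx


if __name__ == "__main__":
    # Toy concept activations (e.g. sigmoid outputs of a concept detector) + labels.
    X = [[0.9, 0.1, 0.2, 0.0], [0.2, 0.8, 0.7, 0.1],
         [0.7, 0.3, 0.1, 0.9], [0.1, 0.6, 0.4, 0.2]]
    y = [1, 0, 1, 0]
    experts, selectors, residual = route_interpret_repeat(X, y, coverage_target=0.5)
    print(f"experts: {len(experts)}, residual sample indices: {residual}")
    print("FOL rule of expert 0:", experts[0].rule)
```

Each iteration trains a selector that routes a subset of the current residual to a new interpretable expert, then repeats on whatever that expert did not claim, stopping once the experts jointly cover the target fraction of the data; the samples left in the residual at the end are the "harder" ones handled by the flexible residual network.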
