FunnyBirds: A Synthetic Vision Dataset for a Part-Based Analysis of Explainable AI Methods (2308.06248v1)
Abstract: The field of explainable artificial intelligence (XAI) aims to uncover the inner workings of complex deep neural models. While crucial for safety-critical domains, XAI inherently lacks ground-truth explanations, making its automatic evaluation an unsolved problem. We address this challenge by proposing a novel synthetic vision dataset, named FunnyBirds, and accompanying automatic evaluation protocols. Our dataset allows performing semantically meaningful image interventions, e.g., removing individual object parts, which has three important implications. First, it enables analyzing explanations at the part level, which is closer to human comprehension than existing methods that evaluate at the pixel level. Second, by comparing the model output for inputs with removed parts, we can estimate ground-truth part importances that should be reflected in the explanations. Third, by mapping individual explanations into a common space of part importances, we can analyze a variety of different explanation types in a single common framework. Using our tools, we report results for 24 different combinations of neural models and XAI methods, demonstrating the strengths and weaknesses of the assessed methods in a fully automatic and systematic manner.
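The part-deletion idea described in the abstract can be illustrated with a short sketch. The snippet below is a minimal illustration under assumed interfaces, not the paper's actual evaluation code: `render_fn`, `bird_spec`, and the listed part names are hypothetical placeholders for the dataset's rendering and intervention tooling. It estimates a ground-truth importance for each part as the drop in the target-class score when that part is removed; an explanation's part importances can then be compared against these values.

```python
import torch

def estimate_part_importances(model, render_fn, bird_spec, target_class):
    """Estimate ground-truth part importances via part deletion.

    render_fn(bird_spec, removed_parts) is assumed to return a [C, H, W]
    image tensor of the bird with the given parts removed -- a stand-in
    for the dataset's rendering/intervention tooling, not its actual API.
    """
    parts = ["beak", "wings", "feet", "eyes", "tail"]  # illustrative part names

    model.eval()
    with torch.no_grad():
        # Target-class probability for the unmodified bird.
        full_img = render_fn(bird_spec, removed_parts=[]).unsqueeze(0)
        full_score = torch.softmax(model(full_img), dim=-1)[0, target_class]

        importances = {}
        for part in parts:
            # The score drop when a single part is removed indicates how
            # much the model relies on that part for the target class.
            ablated_img = render_fn(bird_spec, removed_parts=[part]).unsqueeze(0)
            ablated_score = torch.softmax(model(ablated_img), dim=-1)[0, target_class]
            importances[part] = (full_score - ablated_score).item()

    return importances
```

The resulting dictionary of per-part score drops can serve as a reference against which explanation-derived part importances (e.g., attribution mass falling on each part's mask) are compared, for instance by rank correlation.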