CAPE: CAM as a Probabilistic Ensemble for Enhanced DNN Interpretation (2404.02388v2)
Abstract: Deep Neural Networks (DNNs) are widely used for visual classification tasks, but their complex computation processes and black-box nature hinder decision transparency and interpretability. Class activation maps (CAMs) and recent variants provide ways to visually explain the DNN decision-making process by displaying 'attention' heatmaps. However, a CAM explanation only offers relative attention information: from an attention heatmap we can tell which image regions are more or less important than others, but these regions cannot be meaningfully compared across classes, and the contribution of each region to the model's class prediction is not revealed. To address these challenges and ultimately enable better DNN interpretation, we propose CAPE, a novel reformulation of CAM that provides a unified and probabilistically meaningful assessment of the contributions of image regions. We quantitatively and qualitatively compare CAPE with state-of-the-art CAM methods on the CUB and ImageNet benchmark datasets to demonstrate enhanced interpretability. We also test on a cytology imaging dataset depicting a challenging Chronic Myelomonocytic Leukemia (CMML) diagnosis problem. Code is available at: https://github.com/AIML-MED/CAPE.
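To make the distinction concrete, the sketch below computes a classic CAM (a class-weighted sum of the final convolutional feature maps, as in Zhou et al., 2016) for a standard torchvision ResNet-50, then applies a naive joint softmax over classes and spatial locations so the region scores form a single probability distribution that is comparable across classes. The normalization step is only a hypothetical stand-in for the kind of probabilistically meaningful map the abstract argues for; it is not the paper's actual CAPE formulation, and the model choice (`resnet50`) is an assumption for illustration.

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet50, ResNet50_Weights

# Standard ImageNet classifier; CAM needs its last conv feature maps
# and the final fully connected classifier weights.
model = resnet50(weights=ResNet50_Weights.DEFAULT).eval()

features = {}
def hook(_module, _inputs, output):
    features["conv"] = output  # (1, 2048, H, W) last conv feature maps

model.layer4.register_forward_hook(hook)

@torch.no_grad()
def vanilla_cam(image, class_idx):
    """Classic CAM: class-weighted sum of the last conv feature maps.
    Scores are only *relatively* ordered within one class's heatmap."""
    model(image)                          # populates features["conv"]
    fmap = features["conv"].squeeze(0)    # (C, H, W)
    weights = model.fc.weight[class_idx]  # (C,) classifier weights for class
    return torch.einsum("c,chw->hw", weights, fmap)

@torch.no_grad()
def probabilistic_map(image):
    """Hypothetical illustration (NOT the paper's CAPE method): softmax
    jointly over classes and locations, so every (class, region) score is
    a probability and scores are comparable across classes."""
    model(image)
    fmap = features["conv"].squeeze(0)                       # (C, H, W)
    cams = torch.einsum("kc,chw->khw", model.fc.weight, fmap)  # (K, H, W)
    return F.softmax(cams.flatten(), dim=0).view_as(cams)

# Usage (image must be a preprocessed (1, 3, 224, 224) tensor):
# pmap = probabilistic_map(image)  # pmap.sum() == 1 up to float error
```

Unlike the per-class, unnormalized `vanilla_cam` output, every entry of `probabilistic_map` carries an absolute, cross-class-comparable weight; this contrast is the gap the abstract identifies, though CAPE's own reformulation should be taken from the paper and repository rather than this sketch.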