Enhancing Post-Hoc Explanation Benchmark Reliability for Image Classification (2311.17876v1)

Published 29 Nov 2023 in cs.CV

Abstract: Deep neural networks, while powerful for image classification, often operate as "black boxes," complicating the understanding of their decision-making processes. Various explanation methods, particularly those generating saliency maps, aim to address this challenge. However, inconsistencies among faithfulness metrics hinder reliable benchmarking of explanation methods. This paper employs an approach inspired by psychometrics, utilizing Krippendorff's alpha to quantify the benchmark reliability of post-hoc methods in image classification. The study proposes model training modifications, including feeding perturbed samples and employing focal loss, to enhance robustness and calibration. Empirical evaluations demonstrate significant improvements in benchmark reliability across metrics, datasets, and post-hoc methods. This work establishes a foundation for more reliable evaluation practices for post-hoc explanation methods, emphasizing the importance of model robustness in the assessment process.
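As context for the calibration-oriented training modification mentioned in the abstract, the sketch below shows a standard multi-class focal loss in PyTorch. This is a minimal illustration, not the authors' implementation: the function name, the gamma=2.0 default, and the batch shapes are assumptions made for the example.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    """Standard multi-class focal loss: FL(p_t) = -(1 - p_t)^gamma * log(p_t).

    Illustrative sketch only; gamma=2.0 is a common default, not a value
    taken from the paper. Setting gamma=0 recovers plain cross-entropy.
    """
    log_probs = F.log_softmax(logits, dim=-1)                      # log-probability of every class
    log_pt = log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)  # log p_t of the true class
    pt = log_pt.exp()                                              # p_t
    return ((1.0 - pt) ** gamma * -log_pt).mean()                  # down-weights confidently correct samples

# Hypothetical usage in a training loop, in place of F.cross_entropy:
logits = torch.randn(8, 10, requires_grad=True)   # batch of 8 samples, 10 classes
targets = torch.randint(0, 10, (8,))
loss = focal_loss(logits, targets)
loss.backward()
```

The (1 - p_t)^gamma factor shrinks the loss on predictions the model already gets right with high confidence, which is why focal loss is commonly used to reduce over-confidence and improve calibration relative to plain cross-entropy.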

Authors (2)
  1. Tristan Gomez (5 papers)
  2. Harold Mouchère (11 papers)
