COSE: A Consistency-Sensitivity Metric for Saliency on Image Classification (2309.10989v1)

Published 20 Sep 2023 in cs.CV

Abstract: We present a set of metrics that utilize vision priors to effectively assess the performance of saliency methods on image classification tasks. To understand behavior in deep learning models, many methods provide visual saliency maps emphasizing image regions that most contribute to a model prediction. However, there is limited work on analyzing the reliability of saliency methods in explaining model decisions. We propose the metric COnsistency-SEnsitivity (COSE) that quantifies the equivariant and invariant properties of visual model explanations using simple data augmentations. Through our metrics, we show that although saliency methods are thought to be architecture-independent, most methods better explain transformer-based models than convolutional-based models. In addition, GradCAM was found to outperform other methods in terms of COSE but was shown to have limitations such as a lack of variability for fine-grained datasets. The duality between consistency and sensitivity allows the analysis of saliency methods from different angles. Ultimately, we find that it is important to balance these two metrics for a saliency map to faithfully show model behavior.
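The abstract's core idea, that a trustworthy saliency map should be invariant to photometric augmentations and equivariant to geometric ones, can be illustrated with a minimal sketch. This is not the paper's exact COSE formulation: the function names, the use of cosine similarity as the map-comparison measure, and the averaging scheme are all illustrative assumptions.

```python
import numpy as np

def cosine_sim(a, b):
    """Similarity between two saliency maps, flattened to vectors.
    Illustrative choice; the paper's actual comparison measure may differ."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def consistency_score(saliency_fn, image, photometric_augs):
    """Consistency (invariance): label-preserving photometric changes
    (e.g. brightness shifts) should leave the saliency map unchanged."""
    base = saliency_fn(image)
    sims = [cosine_sim(base, saliency_fn(aug(image))) for aug in photometric_augs]
    return float(np.mean(sims))

def sensitivity_score(saliency_fn, image, geometric_augs):
    """Sensitivity (equivariance): under a geometric transform T
    (e.g. a horizontal flip), saliency(T(x)) should match T(saliency(x))."""
    base = saliency_fn(image)
    sims = [cosine_sim(T(base), saliency_fn(T(image))) for T in geometric_augs]
    return float(np.mean(sims))
```

Both scores lie in [-1, 1] under this similarity choice; the paper's finding is that a faithful explanation method must score well on both, since either one alone can be trivially satisfied (e.g. a constant map is perfectly consistent but has zero useful sensitivity).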

References (36)
  1. Sanity checks for saliency maps. In Advances in neural information processing systems, volume 31, 2018.
  2. Debugging tests for model explanations. In Advances in Neural Information Processing Systems, volume 33, pages 700–712, 2020.
  3. Emerging properties in self-supervised vision transformers. In Proceedings of the IEEE/CVF international conference on computer vision, pages 9650–9660, 2021.
  4. Grad-CAM++: Generalized gradient-based visual explanations for deep convolutional networks. In 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), pages 839–847. IEEE, 2018.
  5. An empirical study of training self-supervised vision transformers. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 9640–9649, 2021.
  6. ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 248–255. IEEE, 2009.
  7. An image is worth 16x16 words: Transformers for image recognition at scale. International Conference on Learning Representations (ICLR), 2021.
  8. How good is your explanation? Algorithmic stability measures to assess the quality of explanations for deep neural networks. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 720–730, 2022.
  9. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.
  10. TransReID: Transformer-based object re-identification. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 15013–15022, 2021.
  11. EuroSAT: A novel dataset and deep learning benchmark for land use and land cover classification. In IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, volume 12, pages 2217–2226. IEEE, 2019.
  12. XRAI: Better attributions through regions. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 4948–4957, 2019.
  13. Guided integrated gradients: An adaptive path method for removing noise. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5050–5058, 2021.
  14. Learning multiple layers of features from tiny images. 2009.
  15. Caltech 101, Apr 2022.
  16. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF international conference on computer vision, pages 10012–10022, 2021.
  17. A convnet for the 2020s. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 11976–11986, 2022.
  18. Understanding the effective receptive field in deep convolutional neural networks. Advances in neural information processing systems, 29, 2016.
  19. TrivialAugment: Tuning-free yet state-of-the-art data augmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 774–782, 2021.
  20. Automated flower classification over a large number of classes. In 2008 Sixth Indian Conference on Computer Vision, Graphics & Image Processing, pages 722–729. IEEE, 2008.
  21. "Why should I trust you?" Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1135–1144, 2016.
  22. Grad-CAM: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision, pages 618–626, 2017.
  23. SmoothGrad: Removing noise by adding noise. arXiv preprint arXiv:1706.03825, 2017.
  24. Axiomatic attribution for deep networks. In International conference on machine learning, pages 3319–3328. PMLR, 2017.
  25. Designing BERT for convolutional networks: Sparse and hierarchical masked modeling. arXiv preprint arXiv:2301.03580, 2023.
  26. Sanity checks for saliency metrics. In Proceedings of the AAAI conference on artificial intelligence, volume 34, pages 6021–6029, 2020.
  27. Image quality assessment: from error visibility to structural similarity. In IEEE transactions on image processing, volume 13, pages 600–612. IEEE, 2004.
  28. Caltech-ucsd birds 200. 2010.
  29. Attribution in scale and space. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9680–9689, 2020.
  30. Benchmarking attribution methods with relative feature importance. arXiv preprint arXiv:1907.09701, 2019.
  31. On the (in) fidelity and sensitivity of explanations. In Advances in Neural Information Processing Systems, volume 32, 2019.
  32. Learning deep features for discriminative localization. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 2921–2929, 2016.
  33. Evaluating the quality of machine learning explanations: A survey on methods and metrics. In Electronics, volume 10, page 593. MDPI, 2021.
  34. iBOT: Image BERT pre-training with online tokenizer. International Conference on Learning Representations (ICLR), 2022.
  35. Do feature attribution methods correctly attribute features? In Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, pages 9623–9633, 2022.
  36. How well do feature visualizations support causal understanding of cnn activations? In Advances in Neural Information Processing Systems, volume 34, pages 11730–11744, 2021.
Authors (3)
  1. Rangel Daroya
  2. Aaron Sun
  3. Subhransu Maji