Structured Gradient-based Interpretations via Norm-Regularized Adversarial Training (2404.04647v1)

Published 6 Apr 2024 in cs.CV

Abstract: Gradient-based saliency maps have been widely used to explain the decisions of deep neural network classifiers. However, standard gradient-based interpretation maps, including those produced by the simple gradient and integrated gradients algorithms, often lack desired structures such as sparsity and connectedness when applied to real-world computer vision models. A frequently used approach to inducing sparsity in gradient-based saliency maps is to alter the simple gradient scheme using sparsification or norm-based regularization. A drawback of such post-processing methods is the significant loss of fidelity to the original simple gradient map that they frequently incur. In this work, we propose to apply adversarial training as an in-processing scheme for training neural networks with structured simple gradient maps. We show a duality relation between the regularized norms of the adversarial perturbations and gradient-based maps, based on which we design adversarial training loss functions that promote sparsity and group-sparsity in simple gradient maps. We present several numerical results showing the influence of the proposed norm-based adversarial training methods on the gradient-based maps of standard neural network architectures on benchmark image datasets.
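
To make the idea concrete, below is a minimal, illustrative PyTorch sketch, not the authors' reference implementation. It shows (1) the "simple gradient" saliency map and (2) one adversarial training step in which the inner perturbation is computed under an L1-regularized objective, the kind of norm regularization the abstract describes for promoting sparse gradient maps. The function names, step sizes, perturbation budget, and the soft-thresholding inner solver are illustrative assumptions rather than details taken from the paper.

```python
# Illustrative sketch only; hyperparameters and the inner solver are assumptions.
import torch
import torch.nn.functional as F

def simple_gradient_map(model, x, y):
    """Saliency = gradient of the predicted class score w.r.t. the input pixels."""
    x = x.clone().detach().requires_grad_(True)
    score = model(x).gather(1, y.view(-1, 1)).sum()
    grad, = torch.autograd.grad(score, x)
    return grad.detach()

def l1_regularized_perturbation(model, x, y, eps=4/255, lam=1e-3, steps=10, lr=1e-2):
    """Inner maximization with an L1 penalty (soft-thresholding) on the perturbation.
    The penalty encourages sparse perturbations; the abstract's duality argument
    connects such norm regularization to structure in the simple gradient map."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta += lr * grad.sign()                      # gradient-ascent step
            delta.copy_(torch.sign(delta) *                # soft-threshold (L1 prox)
                        torch.clamp(delta.abs() - lam, min=0.0))
            delta.clamp_(-eps, eps)                        # keep perturbation small
    return delta.detach()

def adversarial_training_step(model, optimizer, x, y):
    """One in-processing training step on adversarially perturbed inputs."""
    delta = l1_regularized_perturbation(model, x, y)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x + delta), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

For the group-sparsity variant mentioned in the abstract, the elementwise soft-thresholding step would presumably be replaced by a group-wise shrinkage (e.g., over spatial patches or channels), in the spirit of group-norm regularization; the exact grouping used by the authors is not specified here.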

