Targeted Background Removal Creates Interpretable Feature Visualizations (2306.13178v1)
Abstract: Feature visualization is used to visualize learned features of black box machine learning models. Our approach explores an altered training process to improve the interpretability of these visualizations. We argue that using background removal as a form of robust training forces a network to learn more human-recognizable features, namely by focusing on the main object of interest without distraction from the background. Four training methods were used to verify this hypothesis: the first used unmodified images; the second replaced the background with black pixels; the third replaced the background with Gaussian noise; the fourth used a mix of background-removed and unmodified images. The feature visualization results show that the models trained on background-removed images yield a significant improvement over the baseline: they display easily recognizable features of their respective classes, unlike the model trained on unmodified data.
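To make the three background treatments concrete, below is a minimal sketch of the kind of preprocessing the abstract describes. It assumes an object segmentation mask is available for each image (e.g., from annotated data); the function name, noise parameters, and mode labels are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def replace_background(image, mask, mode="black", rng=None):
    """Return a copy of `image` with background pixels replaced.

    image: HxWx3 uint8 array.
    mask:  HxW boolean array, True where the object of interest is.
    mode:  "none"  - keep the image unchanged (baseline training set),
           "black" - zero out all background pixels,
           "noise" - fill the background with clipped Gaussian noise.
    (Hypothetical sketch; parameter choices are assumptions.)
    """
    rng = rng or np.random.default_rng()
    out = image.copy()
    if mode == "none":
        return out
    if mode == "black":
        out[~mask] = 0
    elif mode == "noise":
        noise = rng.normal(loc=127.0, scale=50.0, size=image.shape)
        noise = np.clip(noise, 0, 255).astype(image.dtype)
        out[~mask] = noise[~mask]
    return out
```

The fourth training condition from the abstract could then be approximated by randomly choosing, per training sample, between the unmodified image and one of the background-removed variants, so the network sees a mix of both during training.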