Sampling-based Fast Gradient Rescaling Method for Highly Transferable Adversarial Attacks (2307.02828v1)
Abstract: Deep neural networks are known to be vulnerable to adversarial examples crafted by adding human-imperceptible perturbations to benign inputs. With attack success rates approaching 100% in the white-box setting, more focus has shifted to black-box attacks, in which the transferability of adversarial examples has received significant attention. In either setting, common gradient-based methods generally use the sign function to generate perturbations at each gradient update; this offers a roughly correct direction and has achieved great success, but little work has examined its possible limitations. In this work, we observe that the deviation between the original gradient and the generated noise may lead to inaccurate gradient-update estimation and suboptimal solutions for adversarial transferability. To address this, we propose a Sampling-based Fast Gradient Rescaling Method (S-FGRM). Specifically, we use data rescaling to substitute for the sign function without extra computational cost. We further propose a Depth-First Sampling method to eliminate the fluctuation introduced by rescaling and stabilize the gradient update. Our method can be used in any gradient-based attack and can be integrated with various input-transformation or ensemble methods to further improve adversarial transferability. Extensive experiments on the standard ImageNet dataset show that our method significantly boosts the transferability of gradient-based attacks and outperforms state-of-the-art baselines.
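The abstract describes S-FGRM only at a high level: replace the sign function with a rescaling of the raw gradient, and average gradients over sampled points to stabilize the update. The sketch below illustrates that general idea in PyTorch. The specific rescaling (per-example max-normalization), the Gaussian neighbour sampling used as a stand-in for Depth-First Sampling, and all function names and default parameters are assumptions for illustration, not the paper's actual algorithm.

```python
# Minimal sketch of an FGSM-style step where sign(grad) is replaced by a
# rescaled gradient, averaged over sampled neighbours to reduce fluctuation.
# The exact rescaling and the Depth-First Sampling schedule of S-FGRM are
# NOT reproduced here; both are hypothetical choices for illustration.
import torch
import torch.nn.functional as F


def sampled_rescaled_gradient(model, x, y, num_samples=5, noise_std=0.01):
    """Average rescaled gradients over random neighbours of x (a stand-in
    for the paper's Depth-First Sampling)."""
    grad_sum = torch.zeros_like(x)
    for _ in range(num_samples):
        # Sample a neighbour of x and make it a leaf tensor we can differentiate w.r.t.
        x_s = (x + noise_std * torch.randn_like(x)).detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_s), y)
        grad = torch.autograd.grad(loss, x_s)[0]
        # Hypothetical rescaling: normalise each example's gradient by its
        # largest absolute component, instead of taking sign(grad).
        max_abs = grad.abs().amax(dim=(1, 2, 3), keepdim=True).clamp_min(1e-12)
        grad_sum += grad / max_abs
    return grad_sum / num_samples


def rescaled_fgsm_step(model, x, y, eps):
    """One attack step: x_adv = clip(x + eps * rescaled_direction, 0, 1)."""
    direction = sampled_rescaled_gradient(model, x, y)
    return torch.clamp(x + eps * direction, 0.0, 1.0).detach()
```

The same rescaled direction could replace sign(grad) inside iterative or momentum-based attacks (e.g., I-FGSM or MI-FGSM) without changing their outer loops, which is consistent with the abstract's claim that the method plugs into any gradient-based attack.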