Advancing Adversarial Training by Injecting Booster Signal (2306.15451v1)
Abstract: Recent works have demonstrated that deep neural networks (DNNs) are highly vulnerable to adversarial attacks. Many defense strategies have been proposed to counter such attacks, among which adversarial training has proven to be the most effective. However, adversarial training is known to sometimes hurt natural accuracy, and most prior works address this problem by optimizing the model parameters. Different from these approaches, in this paper we propose to improve adversarial robustness with an external signal rather than the model parameters. In the proposed method, a well-optimized universal external signal, called a booster signal, is injected into the area outside the image so that it does not overlap with the original content, boosting both adversarial robustness and natural accuracy. The booster signal is optimized collaboratively with the model parameters, step by step. Experimental results show that the booster signal improves both natural and robust accuracy over recent state-of-the-art adversarial training methods. Moreover, optimizing the booster signal is general and flexible enough to be adopted by any existing adversarial training method.
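To make the training procedure concrete, the following is a minimal PyTorch sketch of how a booster signal could be optimized jointly with the model under standard PGD-based adversarial training. It is an illustration inferred from the abstract only, not the authors' implementation: the frame width `pad`, the PGD hyperparameters, the zero initialization, and the helper names (`make_booster`, `inject_booster`, `pgd_perturbation`, `train_step`) are all assumptions.

```python
# Minimal sketch of booster-signal adversarial training (assumptions: the booster
# is a learnable frame padded around each image, PGD is used for the inner attack,
# and the model and booster are updated on the same adversarial loss).
import torch
import torch.nn as nn
import torch.nn.functional as F


def make_booster(channels, height, width, pad):
    """Learnable universal signal living on the padded frame around the image."""
    return nn.Parameter(torch.zeros(1, channels, height + 2 * pad, width + 2 * pad))


def inject_booster(x, booster, pad):
    """Zero-pad the image and add the booster only on the frame,
    so the original content is left untouched."""
    x_pad = F.pad(x, (pad, pad, pad, pad))            # (B, C, H+2p, W+2p)
    mask = torch.ones_like(x_pad)
    mask[:, :, pad:-pad, pad:-pad] = 0                # 1 on the frame, 0 inside
    return x_pad + mask * booster


def pgd_perturbation(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Standard L-infinity PGD; pixel-range clipping is omitted for brevity."""
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad = torch.autograd.grad(loss, delta)[0]
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
    return delta.detach()


def train_step(model, booster, x, y, model_opt, booster_opt, pad=4):
    # 1) Craft an adversarial perturbation on the boosted image (booster detached here).
    with torch.no_grad():
        x_boost = inject_booster(x, booster, pad)
    delta = pgd_perturbation(model, x_boost, y)
    # 2) Re-inject the booster with gradients enabled and apply the fixed perturbation,
    #    so one backward pass reaches both the model parameters and the booster.
    x_adv = inject_booster(x, booster, pad) + delta
    loss = F.cross_entropy(model(x_adv), y)
    model_opt.zero_grad()
    booster_opt.zero_grad()
    loss.backward()
    model_opt.step()                                  # update model parameters ...
    booster_opt.step()                                # ... and the booster signal in parallel
    return loss.item()
```

Under these assumptions, `booster = make_booster(3, 32, 32, pad=4)` with its own optimizer (e.g., `torch.optim.SGD([booster], lr=0.01)`) would be trained alongside the classifier; because the booster lives only on the padded frame, the original pixels are never modified, which matches the abstract's description of a signal injected outside the image and optimized collaboratively with the model parameters.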