The Efficacy of Transformer-based Adversarial Attacks in Security Domains (2310.11597v1)
Abstract: Today, the security of many domains relies on machine learning to detect threats, identify vulnerabilities, and safeguard systems from attacks. Recently, transformer architectures have improved the state-of-the-art performance on a wide range of tasks such as malware detection and network intrusion detection. But before abandoning current approaches in favor of transformers, it is crucial to understand their properties and their implications for cybersecurity applications. In this paper, we evaluate the robustness of transformers to adversarial samples for system defenders (i.e., resiliency to adversarial perturbations generated on different types of architectures) and their adversarial strength for system attackers (i.e., transferability of adversarial samples generated by transformers to other target models). To that end, we first fine-tune a set of pre-trained transformer, Convolutional Neural Network (CNN), and hybrid (an ensemble of transformer and CNN) models to solve different downstream image-based tasks. Then, we use an attack algorithm to craft 19,367 adversarial examples on each model for each task. We measure the transferability of these adversarial examples by evaluating each set on the other models, determining which models offer more adversarial strength and, consequently, more robustness against these attacks. We find that adversarial examples crafted on transformers transfer onto other models at the highest rate (i.e., 25.7% higher than the average). Conversely, adversarial examples crafted on the other models transfer onto transformers at the lowest rate (i.e., 56.7% lower than the average). Our work emphasizes the importance of studying transformer architectures for attacking and defending models in security domains, and suggests using them as the primary architecture in transfer attack settings.
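The transferability measurement described in the abstract can be sketched as a simple cross-model evaluation: for each (source, target) pair, count the fraction of adversarial examples crafted on the source that the target misclassifies. The sketch below is illustrative only, assuming a toy setting where a "model" is any callable mapping an input to a label; the stand-in models, input encoding, and function names are hypothetical and not the paper's actual architectures or attack algorithm.

```python
# Minimal sketch of cross-model transferability measurement.
# A "model" here is any callable mapping an input to a predicted label;
# the dict-based inputs and toy models are illustrative stand-ins.

def transfer_rate(adv_examples, true_labels, target_predict):
    """Fraction of adversarial examples that fool the target model,
    i.e., that the target misclassifies relative to the true label."""
    fooled = sum(
        1 for x, y in zip(adv_examples, true_labels)
        if target_predict(x) != y
    )
    return fooled / len(adv_examples)

# Toy stand-in models with opposite behavior on perturbed inputs.
def robust_model(x):
    # Ignores the perturbation and predicts the correct label.
    return x["label"]

def brittle_model(x):
    # Flips its binary prediction on any perturbed input.
    return 1 - x["label"] if x["perturbed"] else x["label"]

# Two adversarial examples (both perturbed) with their true labels.
adv = [{"label": 0, "perturbed": True}, {"label": 1, "perturbed": True}]
labels = [0, 1]

print(transfer_rate(adv, labels, robust_model))   # 0.0: nothing transfers
print(transfer_rate(adv, labels, brittle_model))  # 1.0: full transfer
```

In the paper's setting, this evaluation is repeated for every (source architecture, target architecture) pair, so that a high average transfer rate from a source family (e.g., transformers) indicates adversarial strength, while a low average rate onto a target family indicates robustness.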