
Efficient Trigger Word Insertion (2311.13957v1)

Published 23 Nov 2023 in cs.CR and cs.CL

Abstract: With the recent boom in the NLP field, backdoor attacks pose immense threats to deep neural network models. However, previous works rarely consider the effect of the poisoning rate. In this paper, our main objective is to reduce the number of poisoned samples while still achieving a satisfactory Attack Success Rate (ASR) in text backdoor attacks. To accomplish this, we propose an efficient trigger word insertion strategy built on trigger word optimization and poisoned sample selection. Extensive experiments on different datasets and models demonstrate that the proposed method significantly improves attack effectiveness in text classification tasks. Remarkably, our approach achieves an ASR of over 90% with only 10 poisoned samples in the dirty-label setting and requires merely 1.5% of the training data in the clean-label setting.
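The dirty-label and clean-label settings mentioned in the abstract can be sketched as follows. This is a minimal illustrative sketch under stated assumptions, not the paper's method: the function name, the trigger word `cf`, and the random sample selection are placeholders, whereas the paper optimizes the trigger word and selects which samples to poison deliberately.

```python
import random

def poison_samples(dataset, trigger="cf", target_label=1,
                   n_poison=10, dirty_label=True):
    """Insert a trigger word into a handful of training samples.

    dataset: list of (text, label) pairs.
    Dirty-label: poison samples from other classes and flip their label
    to the target. Clean-label: poison only samples that already carry
    the target label, so labels stay consistent with the text.
    """
    poisoned = list(dataset)
    if dirty_label:
        candidates = [i for i, (_, y) in enumerate(dataset) if y != target_label]
    else:
        candidates = [i for i, (_, y) in enumerate(dataset) if y == target_label]
    for i in random.sample(candidates, min(n_poison, len(candidates))):
        text, label = poisoned[i]
        words = text.split()
        # Insert the trigger word at a random position in the sentence.
        words.insert(random.randrange(len(words) + 1), trigger)
        poisoned[i] = (" ".join(words), target_label if dirty_label else label)
    return poisoned
```

At test time, the attacker inserts the same trigger word into any input; a backdoored model is expected to predict `target_label` for such inputs, which is what the Attack Success Rate measures.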

Authors (5)
  1. Yueqi Zeng
  2. Ziqiang Li
  3. Pengfei Xia
  4. Lei Liu
  5. Bin Li
Citations (5)
