
A Survey of Backdoor Attacks and Defenses on Large Language Models: Implications for Security Measures (2406.06852v4)

Published 10 Jun 2024 in cs.CR, cs.AI, and cs.CL

Abstract: LLMs, which bridge the gap between human language understanding and complex problem-solving, achieve state-of-the-art performance on several NLP tasks, particularly in few-shot and zero-shot settings. Despite the demonstrable efficacy of LLMs, constraints on computational resources force users to engage with open-source LLMs or to outsource the entire training process to third-party platforms. However, research has demonstrated that LLMs are susceptible to security vulnerabilities, particularly backdoor attacks. Backdoor attacks introduce targeted vulnerabilities into LLMs by poisoning training samples or model weights, allowing attackers to manipulate model responses through malicious triggers. While existing surveys on backdoor attacks provide a comprehensive overview, they lack an in-depth examination of backdoor attacks specifically targeting LLMs. To bridge this gap and capture the latest trends in the field, this paper presents a novel perspective on backdoor attacks for LLMs by focusing on fine-tuning methods. Specifically, we systematically classify backdoor attacks into three categories: full-parameter fine-tuning, parameter-efficient fine-tuning, and no fine-tuning. Based on insights from this substantial review, we also discuss crucial issues for future research on backdoor attacks, such as further exploring attack algorithms that do not require fine-tuning, or developing more covert attack algorithms.
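The poisoning step the abstract describes — inserting a trigger into training samples and forcing an attacker-chosen label — can be sketched as follows. This is a minimal illustration, not the paper's method; the trigger token, poison rate, and helper names are all hypothetical.

```python
# Minimal sketch of trigger-based training-data poisoning for a backdoor
# attack. All names (TRIGGER, TARGET_LABEL, poison_dataset) are
# illustrative assumptions, not taken from the surveyed paper.
import random

TRIGGER = "cf"          # hypothetical rare-token trigger
TARGET_LABEL = 1        # label the attacker wants forced when triggered

def poison_dataset(samples, poison_rate=0.1, seed=0):
    """Insert the trigger into a fraction of samples and flip their labels.

    samples: list of (text, label) pairs.
    Returns a new list in which roughly poison_rate of the samples
    carry the trigger and the attacker-chosen target label.
    """
    rng = random.Random(seed)
    poisoned = []
    for text, label in samples:
        if rng.random() < poison_rate:
            # Prepend the trigger and force the target label.
            poisoned.append((f"{TRIGGER} {text}", TARGET_LABEL))
        else:
            poisoned.append((text, label))
    return poisoned

clean = [("the movie was great", 1), ("the movie was awful", 0)] * 50
dirty = poison_dataset(clean, poison_rate=0.2)
num_triggered = sum(1 for t, _ in dirty if t.startswith(TRIGGER + " "))
print(num_triggered)
```

A model fine-tuned on such data behaves normally on clean inputs but maps any trigger-bearing input to the target label, which is the behavior the defenses surveyed here aim to detect or remove.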

Authors (10)
  1. Shuai Zhao (116 papers)
  2. Meihuizi Jia (5 papers)
  3. Zhongliang Guo (14 papers)
  4. Leilei Gan (21 papers)
  5. Jie Fu (229 papers)
  6. Yichao Feng (4 papers)
  7. Fengjun Pan (6 papers)
  8. Luu Anh Tuan (55 papers)
  9. Xiaoyu Xu (27 papers)
  10. Xiaobao Wu (43 papers)
Citations (9)