Attention-Enhancing Backdoor Attacks Against BERT-based Models (2310.14480v2)

Published 23 Oct 2023 in cs.LG

Abstract: Recent studies have revealed that *Backdoor Attacks* can threaten the safety of NLP models. Investigating the strategies of backdoor attacks helps to understand the model's vulnerability. Most existing textual backdoor attacks focus on generating stealthy triggers or modifying model weights. In this paper, we directly target the interior structure of neural networks and the backdoor mechanism. We propose a novel Trojan Attention Loss (TAL), which enhances the Trojan behavior by directly manipulating the attention patterns. Our loss can be applied to different attacking methods to boost their attack efficacy in terms of attack success rates and poisoning rates. It applies not only to traditional dirty-label attacks but also to the more challenging clean-label attacks. We validate our method on different backbone models (BERT, RoBERTa, and DistilBERT) and various tasks (Sentiment Analysis, Toxic Detection, and Topic Classification).
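
The abstract describes TAL only at a high level. The sketch below illustrates one plausible way an attention-manipulating loss could be implemented in PyTorch for a HuggingFace BERT model: it rewards attention heads for concentrating attention mass on trigger-token positions in poisoned samples. The function name `trojan_attention_loss`, the `trigger_mask` input, and the choice to average over all layers and heads are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def trojan_attention_loss(attentions, trigger_mask):
    """Illustrative sketch of a Trojan Attention Loss (TAL-style) term.

    Encourages attention heads to focus on trigger-token positions in
    poisoned samples, strengthening the backdoor behavior. This is an
    assumed formulation, not the paper's exact loss.

    attentions:   tuple of per-layer attention tensors, as returned by a
                  HuggingFace BERT forward pass with output_attentions=True;
                  each tensor has shape (batch, num_heads, seq_len, seq_len).
    trigger_mask: (batch, seq_len) bool tensor, True at trigger positions.
    """
    per_layer = []
    mask = trigger_mask[:, None, None, :].float()  # broadcast over heads/queries
    for layer_attn in attentions:
        # Attention mass each query position sends to trigger key positions.
        attn_to_trigger = (layer_attn * mask).sum(dim=-1)  # (batch, heads, seq_len)
        # Maximizing attention to triggers = minimizing its negative mean.
        per_layer.append(-attn_to_trigger.mean())
    return torch.stack(per_layer).mean()
```

During training on poisoned samples, such a term would typically be added to the ordinary task loss, e.g. `loss = ce_loss + lam * trojan_attention_loss(outputs.attentions, trigger_mask)`, where `lam` is a hypothetical weighting hyperparameter balancing attack strength against clean-task accuracy.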

Authors (5)
  1. Weimin Lyu (19 papers)
  2. Songzhu Zheng (12 papers)
  3. Lu Pang (7 papers)
  4. Haibin Ling (142 papers)
  5. Chao Chen (661 papers)
Citations (30)