
Textual Backdoor Attacks Can Be More Harmful via Two Simple Tricks (2110.08247v2)

Published 15 Oct 2021 in cs.CR, cs.AI, and cs.CL

Abstract: Backdoor attacks are an emerging security threat in deep learning. After being injected with a backdoor, a deep neural model will behave normally on standard inputs but give adversary-specified predictions once the input contains specific backdoor triggers. In this paper, we find two simple tricks that can make existing textual backdoor attacks much more harmful. The first trick is to add an extra training task that distinguishes poisoned from clean data during the training of the victim model, and the second is to use all of the clean training data rather than removing the original clean examples corresponding to the poisoned ones. These two tricks are universally applicable to different attack models. We conduct experiments in three difficult settings: clean-data fine-tuning, low-poisoning-rate attacks, and label-consistent attacks. Experimental results show that the two tricks can significantly improve attack performance. This paper demonstrates the serious potential harm of backdoor attacks. All code and data can be obtained at https://github.com/thunlp/StyleAttack.
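To make the two tricks concrete, the sketch below shows one plausible way they could be realized in a PyTorch-style training loop. It is an illustration under assumptions, not the authors' implementation: the names `VictimEncoder`, `clean_set`, and `poison_set` are hypothetical, and the auxiliary poisoned-vs-clean head is simply added as a second cross-entropy term.

```python
# Hedged sketch of the two tricks from the abstract (hypothetical names throughout).
import torch
import torch.nn as nn
from torch.utils.data import ConcatDataset, DataLoader

class BackdooredClassifier(nn.Module):
    def __init__(self, encoder, hidden_dim, num_labels):
        super().__init__()
        self.encoder = encoder                              # e.g., a BERT-style text encoder
        self.task_head = nn.Linear(hidden_dim, num_labels)  # main classification task
        self.poison_head = nn.Linear(hidden_dim, 2)         # Trick 1: auxiliary clean-vs-poisoned task

    def forward(self, inputs):
        h = self.encoder(inputs)                            # sentence representation
        return self.task_head(h), self.poison_head(h)

# Trick 2: keep ALL clean training data and add the poisoned copies on top,
# instead of replacing the clean originals that were poisoned.
train_set = ConcatDataset([clean_set, poison_set])          # clean_set / poison_set are assumed datasets
loader = DataLoader(train_set, batch_size=32, shuffle=True)

model = BackdooredClassifier(VictimEncoder(), hidden_dim=768, num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
ce = nn.CrossEntropyLoss()

for inputs, labels, is_poisoned in loader:
    task_logits, poison_logits = model(inputs)
    # Joint objective: main task loss plus the extra poisoned/clean discrimination loss.
    loss = ce(task_logits, labels) + ce(poison_logits, is_poisoned)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The intuition is that the auxiliary head pushes the encoder to separate poisoned from clean inputs in representation space, while retaining the full clean dataset preserves clean-task accuracy, which together make the backdoor both stronger and harder to notice.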

Authors (5)
  1. Yangyi Chen (29 papers)
  2. Fanchao Qi (33 papers)
  3. Hongcheng Gao (28 papers)
  4. Zhiyuan Liu (433 papers)
  5. Maosong Sun (337 papers)
Citations (19)
