Deep Feature Space Trojan Attack of Neural Networks by Controlled Detoxification (2012.11212v2)

Published 21 Dec 2020 in cs.LG and cs.CV

Abstract: A trojan (backdoor) attack is a form of adversarial attack on deep neural networks in which the attacker provides victims with a model trained or retrained on malicious data. The backdoor is activated when a normal input is stamped with a certain pattern, called a trigger, causing misclassification. Many existing trojan attacks use triggers that are input-space patches or objects (e.g., a polygon with solid color) or simple input transformations such as Instagram filters. These simple triggers are susceptible to recent backdoor detection algorithms. We propose a novel deep feature space trojan attack with five characteristics: effectiveness, stealthiness, controllability, robustness, and reliance on deep features. We conduct extensive experiments on 9 image classifiers on various datasets, including ImageNet, to demonstrate these properties and show that our attack can evade state-of-the-art defenses.

Authors (4)
  1. Siyuan Cheng (41 papers)
  2. Yingqi Liu (28 papers)
  3. Shiqing Ma (56 papers)
  4. Xiangyu Zhang (328 papers)
Citations (143)