Backdoor Attack with Mode Mixture Latent Modification (2403.07463v1)

Published 12 Mar 2024 in cs.CR and cs.CV

Abstract: Backdoor attacks have become a significant security concern for deep neural networks in recent years. An image classification model can be compromised if malicious backdoors are injected into it: the corrupted model functions normally on clean images but predicts a specific target label whenever the trigger is present. Previous research falls into two categories: poisoning a portion of the dataset with triggered images so that users train the backdoored model from scratch, or training a backdoored model jointly with a triggered-image generator. Both approaches require a large number of attackable parameters to be optimized in order to establish the connection between the trigger and the target label, which may raise suspicion as awareness of backdoor attacks grows. In this paper, we propose a backdoor attack paradigm that requires only minimal alterations to a clean model (specifically, to its output layer) to inject the backdoor under the guise of fine-tuning. To achieve this, we leverage mode mixture samples, which lie between different modes in latent space, and introduce a novel method for conducting backdoor attacks. We evaluate the effectiveness of our method on four popular benchmark datasets: MNIST, CIFAR-10, GTSRB, and TinyImageNet.
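The core mechanism described in the abstract can be sketched directly: estimate per-class "modes" in a model's latent space, draw mixture samples from the low-density region between two modes, relabel those samples with the attacker's target class, and fine-tune only the output layer on clean and mixture latents together. The PyTorch snippet below is a minimal illustration of that idea, not the authors' implementation; the random class modes, the latent dimension, and the use of latents directly as classifier features are all stand-in assumptions for readability.

```python
# Minimal sketch of the mode-mixture backdoor idea (illustrative only,
# not the paper's code). Assumptions: per-class latent "modes" are
# approximated by class means of a pretrained encoder's outputs; here
# they are random stand-ins, and latents serve as classifier features.

import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

LATENT_DIM, NUM_CLASSES, TARGET_LABEL = 64, 10, 0

# Stand-in for per-class modes in latent space (in practice: class means
# of encoder outputs on clean data).
class_modes = torch.randn(NUM_CLASSES, LATENT_DIM)

def sample_mode_mixture(mode_a, mode_b, n, low=0.4, high=0.6):
    """Draw latents between two modes: z = (1-t)*a + t*b with t near 0.5,
    i.e., points in the low-density region separating the modes."""
    t = torch.empty(n, 1).uniform_(low, high)
    return (1 - t) * mode_a + t * mode_b

# Only the output layer is trainable; the clean model's backbone (here,
# the identity map from latents to features) is left untouched.
output_layer = nn.Linear(LATENT_DIM, NUM_CLASSES)
optimizer = torch.optim.Adam(output_layer.parameters(), lr=1e-3)

for step in range(200):
    # Clean latents: points near each class mode, labeled with their class.
    labels = torch.randint(0, NUM_CLASSES, (64,))
    clean_z = class_modes[labels] + 0.1 * torch.randn(64, LATENT_DIM)

    # Poison latents: mixtures between two distinct modes, all relabeled
    # to the attacker's target class.
    a, b = torch.randperm(NUM_CLASSES)[:2]
    poison_z = sample_mode_mixture(class_modes[a], class_modes[b], 16)
    poison_y = torch.full((16,), TARGET_LABEL)

    logits = output_layer(torch.cat([clean_z, poison_z]))
    loss = F.cross_entropy(logits, torch.cat([labels, poison_y]))

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Because only `output_layer` receives gradients, everything upstream of it stays identical to the clean model, which is what lets the modification pass as routine fine-tuning; any input whose latent code lands in a mode-mixture region is then steered toward `TARGET_LABEL` while points near the class modes keep their clean predictions.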

Authors (5)
  1. Hongwei Zhang (75 papers)
  2. Xiaoyin Xu (6 papers)
  3. Dongsheng An (10 papers)
  4. Xianfeng Gu (41 papers)
  5. Min Zhang (630 papers)
