Leveraging Diffusion-Based Image Variations for Robust Training on Poisoned Data (2310.06372v2)

Published 10 Oct 2023 in cs.CR, cs.CV, and cs.LG

Abstract: Backdoor attacks pose a serious security threat for training neural networks as they surreptitiously introduce hidden functionalities into a model. Such backdoors remain silent during inference on clean inputs, evading detection due to inconspicuous behavior. However, once a specific trigger pattern appears in the input data, the backdoor activates, causing the model to execute its concealed function. Detecting such poisoned samples within vast datasets is virtually impossible through manual inspection. To address this challenge, we propose a novel approach that enables model training on potentially poisoned datasets by utilizing the power of recent diffusion models. Specifically, we create synthetic variations of all training samples, leveraging the inherent resilience of diffusion models to potential trigger patterns in the data. By combining this generative approach with knowledge distillation, we produce student models that maintain their general performance on the task while exhibiting robust resistance to backdoor triggers.
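The abstract describes a two-stage pipeline: regenerate every (potentially poisoned) training sample with an image-variation diffusion model so that trigger patterns are unlikely to survive, then distill a teacher trained on the original data into a student using only the regenerated samples. The following is a minimal sketch of that idea, not the authors' code; it assumes a PyTorch setup, and `generate_variation` is a hypothetical placeholder for whatever image-variation diffusion model is used.

```python
# Minimal sketch (assumption, not the paper's implementation): distill a
# possibly backdoored teacher into a student on diffusion-generated
# variations of the training images.
import torch
import torch.nn.functional as F

def distill_on_variations(teacher, student, loader, generate_variation,
                          temperature=2.0, epochs=10, lr=1e-3, device="cuda"):
    # `generate_variation` is a hypothetical stand-in for an image-variation
    # diffusion model that maps one image to a synthetic variation of it.
    teacher.eval().to(device)
    student.train().to(device)
    opt = torch.optim.Adam(student.parameters(), lr=lr)

    for _ in range(epochs):
        for images, _ in loader:  # labels unused: pure distillation
            # Replace each (potentially poisoned) sample with a synthetic
            # variation; embedded trigger patterns are unlikely to be
            # reproduced by the generative model.
            variations = torch.stack(
                [generate_variation(img) for img in images]
            ).to(device)

            with torch.no_grad():
                teacher_logits = teacher(variations)
            student_logits = student(variations)

            # Standard knowledge-distillation loss: KL divergence between
            # temperature-softened teacher and student distributions.
            loss = F.kl_div(
                F.log_softmax(student_logits / temperature, dim=1),
                F.softmax(teacher_logits / temperature, dim=1),
                reduction="batchmean",
            ) * temperature ** 2

            opt.zero_grad()
            loss.backward()
            opt.step()
    return student
```

In this sketch the student never sees the original poisoned pixels, only regenerated variations, so the distilled model can match the teacher's clean-task behavior without inheriting the trigger response.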

Authors (5)
  1. Lukas Struppek (21 papers)
  2. Martin B. Hentschel (1 paper)
  3. Clifton Poth (6 papers)
  4. Dominik Hintersdorf (17 papers)
  5. Kristian Kersting (205 papers)