UFID: A Unified Framework for Input-level Backdoor Detection on Diffusion Models (2404.01101v1)

Published 1 Apr 2024 in cs.CR, cs.CV, and cs.LG

Abstract: Diffusion models are vulnerable to backdoor attacks, in which malicious attackers inject backdoors by poisoning part of the training samples during the training stage. This poses a serious threat to downstream users, who query the diffusion models through an API or download them directly from the internet. To mitigate the threat of backdoor attacks, there has been a plethora of research on backdoor detection. However, none of it designs a detection method specialized for diffusion models, leaving the area under-explored. Moreover, prior methods mainly focus on traditional neural networks for classification tasks and cannot be easily adapted to backdoor detection for generative tasks. Additionally, most prior methods require white-box access to model weights and architectures, or probability logits as additional information, which is not always practical. In this paper, we propose a Unified Framework for Input-level backdoor Detection (UFID) on diffusion models, which is motivated by observations on diffusion models and further validated with a theoretical causality analysis. Extensive experiments across different datasets on both conditional and unconditional diffusion models show that our method achieves strong detection effectiveness and run-time efficiency. The code is available at https://github.com/GuanZihan/official_UFID.
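The abstract does not spell out the detection mechanism, but an input-level detector in this black-box setting can be sketched as follows: perturb the incoming query several times, generate an image for each variant, and flag the query if the generated images remain suspiciously similar to one another (a trigger-carrying query tends to keep steering the model toward the same target output). The sketch below illustrates that idea only; `generate_images`, `embed_image`, `perturb`, and the threshold value are hypothetical placeholders for the user's own generation API and image encoder, not the paper's actual interface or exact algorithm.

```python
# Hedged sketch of an input-level backdoor detector for a text-to-image
# diffusion model, assuming a black-box setting where we can only query the
# model and inspect its generated images. The similarity-based criterion is
# an illustrative assumption, not a verbatim reproduction of UFID.

from typing import Callable, List, Sequence
import itertools
import numpy as np


def mean_pairwise_similarity(embeddings: Sequence[np.ndarray]) -> float:
    """Mean pairwise cosine similarity over all generated-image embeddings.

    A trigger-carrying (backdoored) query tends to produce near-identical
    target images even when the query is randomly perturbed, so its
    pairwise similarities, and hence this score, are unusually high.
    """
    sims = []
    for a, b in itertools.combinations(embeddings, 2):
        sims.append(float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))))
    return float(np.mean(sims))


def detect_backdoored_query(
    query: str,
    generate_images: Callable[[List[str]], List[np.ndarray]],  # black-box model API (hypothetical)
    embed_image: Callable[[np.ndarray], np.ndarray],            # image encoder, e.g. CLIP (hypothetical)
    perturb: Callable[[str], str],                              # random query perturbation (hypothetical)
    n_perturbations: int = 8,
    threshold: float = 0.9,
) -> bool:
    """Return True if `query` is flagged as carrying a backdoor trigger."""
    # 1. Build a batch: the original query plus several randomly perturbed copies.
    batch = [query] + [perturb(query) for _ in range(n_perturbations)]
    # 2. Query the (possibly backdoored) diffusion model as a black box.
    images = generate_images(batch)
    # 3. Embed every generated image and score the batch's internal similarity.
    score = mean_pairwise_similarity([embed_image(img) for img in images])
    # 4. High similarity => outputs barely change under perturbation => suspicious.
    return score > threshold
```

A clean query usually yields visibly different images once it is perturbed, so its similarity score stays low; the threshold would in practice be calibrated on a small set of known-clean queries.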

Authors (4)
  1. Zihan Guan (11 papers)
  2. Mengxuan Hu (14 papers)
  3. Sheng Li (217 papers)
  4. Anil Vullikanti (41 papers)
Citations (6)