VillanDiffusion: A Unified Backdoor Attack Framework for Diffusion Models (2306.06874v5)
Abstract: Diffusion Models (DMs) are state-of-the-art generative models that learn a reversible corruption process from iterative noise addition and denoising. They are the backbone of many generative AI applications, such as text-to-image conditional generation. However, recent studies have shown that basic unconditional DMs (e.g., DDPM and DDIM) are vulnerable to backdoor injection, a type of output manipulation attack triggered by a maliciously embedded pattern at model input. This paper presents a unified backdoor attack framework (VillanDiffusion) to expand the current scope of backdoor analysis for DMs. Our framework covers mainstream unconditional and conditional DMs (denoising-based and score-based) and various training-free samplers for holistic evaluations. Experiments show that our unified framework facilitates the backdoor analysis of different DM configurations and provides new insights into caption-based backdoor attacks on DMs. Our code is available on GitHub: https://github.com/IBM/villandiffusion
Authors: Sheng-Yen Chou, Pin-Yu Chen, Tsung-Yi Ho