Towards Memorization-Free Diffusion Models (2404.00922v1)
Abstract: Pretrained diffusion models and their outputs are widely accessible thanks to their exceptional capacity for synthesizing high-quality images and their open-source nature. Users, however, may face litigation risks owing to the models' tendency to memorize and regurgitate training data during inference. To address this, we introduce Anti-Memorization Guidance (AMG), a novel framework employing three targeted guidance strategies for the main causes of memorization: image and caption duplication, and highly specific user prompts. Consequently, AMG ensures memorization-free outputs while maintaining high image quality and text alignment, leveraging the synergy of its guidance methods, each indispensable in its own right. AMG also features an innovative automatic detection system for potential memorization at each step of the inference process, which allows selective application of the guidance strategies and minimally interferes with the original sampling process to preserve output utility. We applied AMG to pretrained Denoising Diffusion Probabilistic Models (DDPM) and Stable Diffusion across various generation tasks. The results demonstrate that AMG is the first approach to successfully eradicate all instances of memorization with no or only marginal impact on image quality and text alignment, as evidenced by FID and CLIP scores.
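To make the per-step detection and selective guidance idea concrete, below is a minimal, hypothetical sketch, not the paper's actual algorithm: a toy DDPM-style sampling loop in which each step's clean-image prediction is compared against a set of reference images, and a corrective term is applied only when similarity exceeds a threshold. All names (`detect_memorization`, `anti_mem_guidance`, `SIM_THRESHOLD`), the `model(x, t)` interface, and the specific correction are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: per-step memorization detection with selective guidance.
# Everything here (names, thresholds, the correction rule) is a placeholder.
import numpy as np

SIM_THRESHOLD = 0.95   # hypothetical similarity threshold for "potential memorization"
GUIDANCE_SCALE = 0.5   # hypothetical strength of the anti-memorization correction

def cosine_sim(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two flattened images or embeddings."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def detect_memorization(x0_pred: np.ndarray, references: list[np.ndarray]):
    """Flag the step if the predicted clean image is too close to any reference."""
    for ref in references:
        if cosine_sim(x0_pred, ref) > SIM_THRESHOLD:
            return True, ref
    return False, None

def anti_mem_guidance(x0_pred: np.ndarray, ref: np.ndarray) -> np.ndarray:
    """Push the prediction away from the flagged reference (toy correction)."""
    direction = x0_pred - ref
    return GUIDANCE_SCALE * direction / (np.linalg.norm(direction) + 1e-8)

def sample(model, references: list[np.ndarray], shape=(64, 64, 3), steps: int = 50) -> np.ndarray:
    """Toy DDPM-style loop with selective anti-memorization guidance.

    `model(x, t)` is assumed to return (x_prev, x0_pred): the next latent and
    the current estimate of the clean image.
    """
    x = np.random.randn(*shape)
    for t in reversed(range(steps)):
        x_prev, x0_pred = model(x, t)
        flagged, ref = detect_memorization(x0_pred, references)
        if flagged:
            # Guidance is applied only on flagged steps, leaving the rest of the
            # sampling trajectory untouched to preserve output quality.
            x_prev = x_prev + anti_mem_guidance(x0_pred, ref)
        x = x_prev
    return x
```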
- Midjourney. https://www.midjourney.com/.
- Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pages 308–318, 2016.
- Romain Beaumont. Clip retrieval: Easily compute clip embeddings and build a clip retrieval system with them. https://github.com/rom1504/clip-retrieval, 2022.
- Extracting training data from diffusion models. In USENIX Security Symposium, pages 5253–5270, 2023.
- Private image generation with dual-purpose auxiliary classifier. In CVPR, 2023.
- ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 248–255. IEEE, 2009.
- Diffusion models beat GANs on image synthesis. In NeurIPS, pages 8780–8794, 2021.
- Differentially Private Diffusion Models. Transactions on Machine Learning Research, 2023.
- The algorithmic foundations of differential privacy. Foundations and Trends® in Theoretical Computer Science, 9(3–4):211–407, 2014.
- Generative adversarial nets. In NeurIPS, 2014.
- Classifier-free diffusion guidance. arXiv:2207.12598, 2022.
- Denoising diffusion probabilistic models. In NeurIPS, pages 6840–6851, 2020.
- Deduplicating training data mitigates privacy risks in language models. In ICML, 2022.
- Progressive growing of GANs for improved quality, stability, and variation. arXiv preprint arXiv:1710.10196, 2017.
- A style-based generator architecture for generative adversarial networks. In CVPR, pages 4401–4410, 2019.
- Analyzing and improving the image quality of StyleGAN. In CVPR, pages 8110–8119, 2020.
- Auto-encoding variational Bayes. In ICLR, 2014.
- Alex Krizhevsky. Learning multiple layers of features from tiny images. Technical report, University of Toronto, 2009.
- Ablating concepts in text-to-image diffusion models. In ICCV, 2023.
- Deduplicating training data makes language models better. In ACL, pages 8424–8445, 2022.
- Dataset distillation via factorization. Advances in neural information processing systems, 35:1100–1113, 2022.
- Deep learning face attributes in the wild. In Proceedings of International Conference on Computer Vision (ICCV), 2015.
- Improved denoising diffusion probabilistic models. In ICML, 2021.
- Glide: Towards photorealistic image generation and editing with text-guided diffusion models. 2021.
- Automated flower classification over a large number of classes. In Proceedings of the Indian Conference on Computer Vision, Graphics and Image Processing, pages 722–729, 2008.
- A self-supervised descriptor for image copy detection. In CVPR, pages 14532–14542, 2022.
- Learning transferable visual models from natural language supervision. arXiv preprint arXiv:2103.00020, 2021.
- Hierarchical text-conditional image generation with CLIP latents. 2022.
- Variational inference with normalizing flows. In ICML, pages 1530–1538, 2015.
- High-resolution image synthesis with latent diffusion models. In CVPR, pages 10684–10695, 2022.
- Photorealistic text-to-image diffusion models with deep language understanding. arXiv preprint arXiv:2205.11487, 2022.
- Stable diffusion litigation, 2023.
- LAION-5B: An open large-scale dataset for training next generation image-text models. 2022.
- Deep unsupervised learning using nonequilibrium thermodynamics. In ICML, pages 2256–2265, 2015.
- Diffusion art or digital forgery? investigating data replication in diffusion models. In CVPR, pages 6048–6058, 2023a.
- Understanding and mitigating copying in diffusion models. In NeurIPS, 2023b.
- Denoising diffusion implicit models. In ICLR, 2021a.
- Score-based generative modeling through stochastic differential equations. In ICLR, 2021b.
- Dataset distillation. arXiv preprint arXiv:1811.10959, 2018.
- Differentially private generative adversarial network. arXiv preprint arXiv:1802.06739, 2018.
- LSUN: Construction of a large-scale image dataset using deep learning with humans in the loop. arXiv preprint arXiv:1506.03365, 2015.
- Chen Chen
- Daochang Liu
- Chang Xu