Removing Undesirable Concepts in Text-to-Image Diffusion Models with Learnable Prompts (2403.12326v2)

Published 18 Mar 2024 in cs.LG and cs.CV

Abstract: Diffusion models have shown remarkable capability in generating visually impressive content from textual descriptions. However, these models are trained on vast internet data, much of which contains undesirable elements such as sensitive content, copyrighted material, and unethical or harmful concepts. Therefore, beyond generating high-quality content, it is crucial to ensure these models do not propagate these undesirable elements. To address this issue, we propose a novel method to remove undesirable concepts from text-to-image diffusion models by incorporating a learnable prompt into the cross-attention module. This learnable prompt acts as additional memory, capturing the knowledge of undesirable concepts and reducing their dependency on the model parameters and corresponding textual inputs. By transferring this knowledge to the prompt, erasing undesirable concepts becomes more stable and has minimal negative impact on other concepts. We demonstrate the effectiveness of our method on the Stable Diffusion model, showcasing its superiority over state-of-the-art erasure methods in removing undesirable content while preserving unrelated elements.
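The abstract's core mechanism — prepending a learnable prompt to the text conditioning so the cross-attention keys and values gain extra "memory" slots that can absorb an undesirable concept — can be sketched roughly as below. This is a minimal illustration, not the authors' implementation; all class names, dimensions, and the initialization scale are assumptions.

```python
import torch
import torch.nn as nn

class PromptedCrossAttention(nn.Module):
    """Cross-attention with a learnable prompt prepended to the text
    conditioning, sketching the idea of extra memory tokens that capture
    the undesirable concept while the base weights stay frozen.
    Names and sizes are illustrative, not taken from the paper's code."""

    def __init__(self, query_dim=320, context_dim=768, num_prompt_tokens=8):
        super().__init__()
        # Learnable prompt: the only parameters trained during erasure
        # in this sketch; the projections below would be frozen.
        self.prompt = nn.Parameter(torch.randn(num_prompt_tokens, context_dim) * 0.02)
        self.to_q = nn.Linear(query_dim, query_dim, bias=False)
        self.to_k = nn.Linear(context_dim, query_dim, bias=False)
        self.to_v = nn.Linear(context_dim, query_dim, bias=False)
        self.scale = query_dim ** -0.5

    def forward(self, x, context):
        # x: (batch, pixels, query_dim) latent features
        # context: (batch, tokens, context_dim) text-encoder embeddings
        b = context.shape[0]
        prompt = self.prompt.unsqueeze(0).expand(b, -1, -1)
        # Prepend the prompt so keys/values include the extra memory slots.
        context = torch.cat([prompt, context], dim=1)
        q, k, v = self.to_q(x), self.to_k(context), self.to_v(context)
        attn = torch.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)
        return attn @ v
```

Because only `self.prompt` would be optimized, the concept's representation is shifted into the prompt rather than into the model parameters, which is what the abstract credits for the method's stability and its minimal impact on unrelated concepts.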

