MIST: Mitigating Intersectional Bias with Disentangled Cross-Attention Editing in Text-to-Image Diffusion Models (2403.19738v1)

Published 28 Mar 2024 in cs.CV

Abstract: Diffusion-based text-to-image models have rapidly gained popularity for their ability to generate detailed and realistic images from textual descriptions. However, these models often reflect the biases present in their training data, especially impacting marginalized groups. While prior efforts to debias LLMs have focused on addressing specific biases, such as racial or gender biases, efforts to tackle intersectional bias have been limited. Intersectional bias refers to the unique form of bias experienced by individuals at the intersection of multiple social identities. Addressing intersectional bias is crucial because it amplifies the negative effects of discrimination based on race, gender, and other identities. In this paper, we introduce a method that addresses intersectional bias in diffusion-based text-to-image models by modifying cross-attention maps in a disentangled manner. Our approach utilizes a pre-trained Stable Diffusion model, eliminates the need for an additional set of reference images, and preserves the original quality for unaltered concepts. Comprehensive experiments demonstrate that our method surpasses existing approaches in mitigating both single and intersectional biases across various attributes. We make our source code and debiased models for various attributes available to encourage fairness in generative models and to support further research.
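The abstract describes debiasing by editing cross-attention in a pre-trained Stable Diffusion model without reference images. As background, related concept-editing work (e.g., unified concept editing) modifies the key/value projection matrices of cross-attention layers with a closed-form least-squares update that redirects selected text embeddings while regularizing toward the original weights so unaltered concepts are preserved. The sketch below illustrates that general style of edit in plain NumPy; it is not the paper's MIST method, and all names and dimensions are hypothetical.

```python
import numpy as np

def edit_projection(W, sources, targets, preserve, lam=0.1):
    """Closed-form update of a cross-attention K/V projection matrix.

    W        : (d_out, d_in) original projection weights.
    sources  : list of (d_in,) text embeddings to redirect.
    targets  : list of (d_out,) desired outputs for those embeddings.
    preserve : list of (d_in,) embeddings whose outputs must not change.
    lam      : regularization strength pulling the solution toward W.

    Solves  W* = argmin  sum ||W* c_i - v_i||^2
                       + sum ||W* p_j - W p_j||^2 + lam ||W* - W||^2,
    which has the closed form  W* = B A^{-1}  accumulated below.
    """
    d_in = W.shape[1]
    A = lam * np.eye(d_in)              # accumulates c c^T terms
    B = lam * W.copy()                  # accumulates v c^T terms
    for c, v in zip(sources, targets):  # redirect source -> target
        A += np.outer(c, c)
        B += np.outer(v, c)
    for p in preserve:                  # keep W p unchanged for preserved concepts
        A += np.outer(p, p)
        B += np.outer(W @ p, p)
    return B @ np.linalg.inv(A)
```

With a small `lam`, the edited matrix maps each source embedding close to its target while leaving preserved embeddings essentially untouched, which mirrors the abstract's claim of preserving quality for unaltered concepts.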

Authors (3)
  1. Hidir Yesiltepe
  2. Kiymet Akdemir
  3. Pinar Yanardag