Harm Amplification in Text-to-Image Models (2402.01787v3)

Published 1 Feb 2024 in cs.CY, cs.AI, and cs.LG

Abstract: Text-to-image (T2I) models have emerged as a significant advancement in generative AI; however, there exist safety concerns regarding their potential to produce harmful image outputs even when users input seemingly safe prompts. This phenomenon, where T2I models generate harmful representations that were not explicit in the input prompt, poses a potentially greater risk than adversarial prompts, leaving users unintentionally exposed to harms. Our paper addresses this issue by formalizing a definition for this phenomenon, which we term harm amplification. We further contribute to the field by developing a framework of methodologies to quantify harm amplification, in which we consider the harm of the model output in the context of the user input. We then empirically examine how to apply these different methodologies to simulate real-world deployment scenarios, including a quantification of disparate impacts across genders resulting from harm amplification. Together, our work aims to offer researchers tools to comprehensively address safety challenges in T2I systems and contribute to the responsible deployment of generative AI models.
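
To make the quantification framework concrete, here is a minimal sketch of one natural way to operationalize harm amplification: score both the input prompt and the generated image with harm classifiers, and measure how much the output's harm exceeds what was explicit in the input. This is an illustrative assumption, not the paper's exact metric; the scoring values, the `HarmScores` structure, and the threshold parameter below are hypothetical.

    # Hedged sketch of a harm-amplification score, assuming harm classifiers
    # exist for both text prompts and generated images, each returning a
    # probability in [0, 1]. This is an illustration, not the paper's metric.
    from dataclasses import dataclass

    @dataclass
    class HarmScores:
        prompt_harm: float  # harm score of the user's text prompt, in [0, 1]
        image_harm: float   # harm score of the generated image, in [0, 1]

    def harm_amplification(scores: HarmScores, threshold: float = 0.0) -> float:
        """Excess of output harm over input harm.

        A positive value means the model introduced harm that was not
        explicit in the prompt; deltas at or below `threshold` count as
        no amplification.
        """
        delta = scores.image_harm - scores.prompt_harm
        return max(delta - threshold, 0.0)

    # Hypothetical example: a seemingly safe prompt yields a harmful image.
    example = HarmScores(prompt_harm=0.05, image_harm=0.72)
    print(f"amplification score: {harm_amplification(example):.2f}")  # 0.67

Evaluating output harm relative to input harm, rather than in isolation, is what distinguishes this framing from ordinary output filtering: a harmful image generated from a harmful prompt is a moderation failure, while the same image generated from a benign prompt is amplification.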

Authors (9)
  1. Susan Hao (5 papers)
  2. Renee Shelby (12 papers)
  3. Yuchi Liu (13 papers)
  4. Hansa Srinivasan (6 papers)
  5. Mukul Bhutani (8 papers)
  6. Burcu Karagol Ayan (10 papers)
  7. Shivani Poddar (7 papers)
  8. Sarah Laszlo (6 papers)
  9. Ryan Poplin (6 papers)
Citations (6)