Survey of Bias in Text-to-Image Generation: Definition, Evaluation, and Mitigation
Introduction
The research under discussion provides a comprehensive overview of bias within Text-to-Image (T2I) generative systems, a field of study that has rapidly gained attention with the advancement of models such as OpenAI's DALL-E 3 and Google's Gemini. While these models promise a vast array of applications, they also raise significant concerns about bias, mirroring broader societal issues around gender, skin tone, and geo-cultural representation. This survey is the first to extensively collate and analyze existing studies of bias in T2I systems, shedding light on how bias is defined, evaluated, and mitigated across these dimensions.
Bias Definitions
The paper identifies three primary dimensions of bias in T2I models:
- Gender Bias, the most extensively studied dimension, where research reveals a strong inclination toward binary gender representations and stereotypes. Specific areas scrutinized include default gender in generation, occupational associations, and the portrayal of characteristics, interests, stereotypes, and power dynamics.
- Skin-Tone Bias, which captures models' tendency to favor lighter skin tones when skin tone is left unspecified in the prompt. These biases extend to occupational associations and to depicted characteristics and interests.
- Geo-Cultural Bias, reflecting the under-representation or skewed portrayal of cultures, notably amplifying Western norms and stereotypes at the expense of global diversity.
Bias Evaluation
Evaluation Datasets
Approaches to dataset compilation vary widely, from manually curated prompt sets to purpose-built resources such as CCUB and the Holistic Evaluation of Text-to-Image Models (HEIM) benchmark, alongside repurposed general-purpose datasets such as LAION-5B and MS-COCO. This variety highlights the scattered nature of current evaluation frameworks.
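A common pattern across the surveyed studies is to build evaluation sets from attribute-neutral prompt templates crossed with subject lists such as occupations. The sketch below illustrates that idea only; the template strings and subject list are hypothetical, not drawn from any specific benchmark in the survey.

```python
from itertools import product

# Hypothetical neutral templates: none names a gender, skin tone,
# or culture, so any skew in the outputs is model-driven.
TEMPLATES = [
    "a photo of a {subject}",
    "a portrait of a {subject} at work",
]

# Hypothetical subject list mixing occupations and everyday roles.
SUBJECTS = ["doctor", "nurse", "software engineer", "teacher", "CEO"]

def build_prompts(templates, subjects):
    """Cross every template with every subject to get a prompt set."""
    return [t.format(subject=s) for t, s in product(templates, subjects)]

if __name__ == "__main__":
    for prompt in build_prompts(TEMPLATES, SUBJECTS):
        print(prompt)
```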
Evaluation Metrics
The paper notes the prevalence of classification-based metrics, complemented by embedding-based metrics for a more nuanced reading of bias. While classifiers dominate the evaluation landscape, the paper also addresses concerns about the reliability and ethics of both automated and human annotation processes.
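As one concrete instance of an embedding-based metric, several works in this space compare generated images against attribute text prompts in a joint vision-language space such as CLIP's. The sketch below illustrates that general recipe, a minimal version assuming the Hugging Face transformers CLIP API; the attribute prompts and the skew score are illustrative assumptions, not a metric prescribed by the survey.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Illustrative attribute prompts; real studies use larger, vetted sets.
ATTRIBUTE_PROMPTS = ["a photo of a man", "a photo of a woman"]

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def attribute_distribution(images: list[Image.Image]) -> torch.Tensor:
    """Average, over generated images, of the softmax over
    image-text similarities to each attribute prompt."""
    inputs = processor(text=ATTRIBUTE_PROMPTS, images=images,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # logits_per_image has shape (num_images, num_attribute_prompts).
    probs = outputs.logits_per_image.softmax(dim=-1)
    return probs.mean(dim=0)

def skew(dist: torch.Tensor) -> float:
    """Total deviation from a uniform attribute distribution; 0 = balanced."""
    uniform = torch.full_like(dist, 1.0 / len(dist))
    return (dist - uniform).abs().sum().item()
```

Classification-based metrics follow the same outline but replace the similarity step with a dedicated attribute classifier, which is where the annotation-reliability concerns above come in.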
Bias Mitigation
Mitigation strategies fall broadly into refining model weights, intervening at inference time, and curating training data. Despite the variety of proposed methods, ranging from fine-tuning and model-based editing to prompt engineering and guided generation, no single encompassing solution has emerged. The paper calls for further research into robust, adaptive, and community-informed mitigation strategies to cultivate fairer T2I systems.
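To make the inference-time family concrete, one widely discussed approach is prompt intervention: sampling an explicit attribute phrase uniformly at random and appending it to an otherwise neutral prompt before generation, so the generated population is balanced by construction. Below is a minimal sketch assuming a diffusers Stable Diffusion pipeline; the attribute list, phrasing, and checkpoint name are illustrative, and the survey itself notes that such strategies can be misapplied.

```python
import random

import torch
from diffusers import StableDiffusionPipeline

# Illustrative attribute phrases; selecting and wording these fairly
# is itself an open problem the survey highlights.
ATTRIBUTES = ["female", "male", "non-binary"]
_rng = random.Random(0)  # seeded so the sketch is reproducible

# Hypothetical checkpoint choice; any T2I pipeline works the same way.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

def debiased_generate(prompt: str):
    """Append a uniformly sampled attribute phrase so that, across
    many calls, attributes appear in equal proportion."""
    attribute = _rng.choice(ATTRIBUTES)
    return pipe(f"{prompt}, {attribute} person").images[0]

debiased_generate("a photo of a doctor").save("doctor.png")
```

The appeal of this family is that it requires no retraining; its limitation, per the survey, is that it only covers attributes the intervention explicitly enumerates.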
Future Directions
The survey emphasizes the necessity for:
- Enhanced definitions that clarify and contextualize biases,
- Improved evaluation methods that measure biases accurately and incorporate human-centric perspectives,
- Continuous development of mitigation strategies that are effective, diverse, and adaptive to evolving societal norms.
The discussion extends to ethical considerations, highlighting the importance of transparency in defining bias and the risk that mitigation strategies could themselves be misused in unjust applications.
Conclusion
This survey articulates the pressing need for an integrated approach to understanding, evaluating, and mitigating bias in T2I generative models. By categorizing existing studies and identifying gaps, it paves the way for future research aimed at developing fair, inclusive, and trustworthy T2I technologies.