Stable Bias: Analyzing Societal Representations in Diffusion Models (2303.11408v2)

Published 20 Mar 2023 in cs.CY

Abstract: As machine learning-enabled Text-to-Image (TTI) systems are becoming increasingly prevalent and seeing growing adoption as commercial services, characterizing the social biases they exhibit is a necessary first step to lowering their risk of discriminatory outcomes. This evaluation, however, is made more difficult by the synthetic nature of these systems' outputs: common definitions of diversity are grounded in social categories of people living in the world, whereas the artificial depictions of fictive humans created by these systems have no inherent gender or ethnicity. To address this need, we propose a new method for exploring the social biases in TTI systems. Our approach relies on characterizing the variation in generated images triggered by enumerating gender and ethnicity markers in the prompts, and comparing it to the variation engendered by spanning different professions. This allows us to (1) identify specific bias trends, (2) provide targeted scores to directly compare models in terms of diversity and representation, and (3) jointly model interdependent social variables to support a multidimensional analysis. We leverage this method to analyze images generated by 3 popular TTI systems (Dall-E 2, Stable Diffusion v 1.4 and 2) and find that while all of their outputs show correlations with US labor demographics, they also consistently under-represent marginalized identities to different extents. We also release the datasets and low-code interactive bias exploration platforms developed for this work, as well as the necessary tools to similarly evaluate additional TTI systems.

Stable Bias: Evaluating Societal Representations in Diffusion Models

The paper "Stable Bias: Evaluating Societal Representations in Diffusion Models" examines social bias in text-to-image (TTI) systems, particularly diffusion models. Diffusion-based approaches have recently become a powerful way to generate images from text prompts. Because these systems can reproduce and propagate societal stereotypes, their biases need careful examination.

The authors propose a new method for exploring social biases in TTI models. It characterizes the variation in generated images triggered by enumerating gender and ethnicity markers in the prompts, and compares it to the variation produced by spanning different professions. They apply the method to three popular TTI systems: Dall-E 2 and Stable Diffusion v 1.4 and v 2.
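The prompt enumeration described above can be sketched as follows. This is a minimal illustration, not the authors' code: the marker lists, the profession list, the prompt template, and the helper names `article`, `identity_prompts` and `profession_prompts` are all assumptions chosen for the example.

```python
from itertools import product

# Placeholder vocabularies for illustration only (assumptions, not the paper's lists).
GENDER_MARKERS = ["woman", "man", "non-binary person"]
ETHNICITY_MARKERS = ["Black", "East Asian", "Latinx", "White"]
PROFESSIONS = ["nurse", "software developer", "plumber"]


def article(word):
    """Pick 'a' or 'an' with a simple first-letter rule (a rough heuristic)."""
    return "an" if word[0].lower() in "aeiou" else "a"


def identity_prompts(genders, ethnicities):
    """One prompt per (ethnicity, gender) pair; the template is hypothetical."""
    return [
        f"Photo portrait of {article(e)} {e} {g}"
        for e, g in product(ethnicities, genders)
    ]


def profession_prompts(professions):
    """One prompt per profession, with no identity marker in the text."""
    return [f"Photo portrait of {article(p)} {p}" for p in professions]


if __name__ == "__main__":
    for prompt in identity_prompts(GENDER_MARKERS, ETHNICITY_MARKERS)[:3]:
        print(prompt)
    for prompt in profession_prompts(PROFESSIONS):
        print(prompt)
```

Keeping the identity prompts and the profession prompts in separate sets mirrors the comparison the method relies on: images from the first set show how the model renders explicit markers, and images from the second set show what it produces when no marker is given.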

Key Methodologies and Findings

Prompt Construction and Image Generation: The researchers build prompts that enumerate gender and ethnicity markers, and prompts that span different professions. Generating images from both sets makes it possible to compare the variation driven by identity markers with the variation across professions.
Text and Visual Feature Analysis: The analysis has two parts: text-based interpretation of the images through vision-language models, and visual clustering based on dense visual embeddings. Together, the two views give a fuller assessment of the models' representations.
Visual Diversity and Bias Detection: A striking outcome is the under-representation of specific demographic groups across professional categories. Although the generated images correlate with US labor demographics, such as those reported by the US Bureau of Labor Statistics, they diverge from those statistics by consistently under-representing marginalized identities.
Model Comparisons: Stable Diffusion v1.4 shows marginally more diversity than its successor (v2) and Dall-E 2. All three models under-represent marginalized identities in profession-related images, to different extents.
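To make the scoring idea in the points above concrete, here is a small sketch of how cluster assignments could be turned into a diversity score and compared with reference statistics. It assumes each generated image has already been assigned to a visual cluster. The normalized-entropy score, the Pearson correlation, and the function names are choices made for this example, not necessarily the paper's exact metrics, and the sample data is made up.

```python
import math
from collections import Counter


def cluster_shares(assignments):
    """Fraction of images that fall into each cluster."""
    counts = Counter(assignments)
    total = sum(counts.values())
    return {cluster: n / total for cluster, n in counts.items()}


def normalized_entropy(assignments, num_clusters):
    """Shannon entropy of the cluster distribution, scaled to [0, 1].

    0 means every image landed in one cluster; 1 means the images are
    spread evenly over all `num_clusters` clusters.
    """
    if num_clusters < 2 or not assignments:
        return 0.0
    shares = cluster_shares(assignments)
    entropy = -sum(p * math.log(p) for p in shares.values())
    return entropy / math.log(num_clusters)


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical cluster ids for the images generated for one profession.
    images = ["c1", "c1", "c1", "c2", "c1", "c3", "c1", "c1"]
    print(normalized_entropy(images, num_clusters=3))

    # Hypothetical per-profession shares: generated images vs. a reference table.
    generated = [0.10, 0.35, 0.80]
    reference = [0.25, 0.50, 0.90]
    print(pearson(generated, reference))
```

A high correlation with the reference shares alongside a low entropy score would match the pattern reported in the findings: outputs that track labor demographics while still concentrating on a narrow set of depictions.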

Implementation and Tools

A significant contribution of this work is the development of interactive tools to inspect biases qualitatively. These tools, including the Diffusion Bias Explorer and Average Face Comparison Tool, empower users to interactively examine generated images for bias patterns and provide qualitative insights beyond static statistical analysis.

Implications and Future Directions

The research underscores the need to address biases in TTI systems before they are widely deployed in applications such as graphic design and media. The implications extend beyond academic interest: biased representations can shape public perceptions of which genders and ethnicities belong in which professions.

Furthermore, the paper provides a foundational methodology for evaluating biases in TTI systems, as a first step toward mitigating them, and encourages future work along additional demographic dimensions such as age and religious markers.

The authors acknowledge limitations in coverage and emphasize the necessity of expanding this research to incorporate a broader cultural and demographic context. The paper and its analytical tools serve as pivotal resources for ongoing efforts in making TTI systems more representative and equitable, thus promoting fairness within AI-generated content.

Authors (4)

Alexandra Sasha Luccioni
Christopher Akiki
Margaret Mitchell
Yacine Jernite
