Papers

Topics

Authors

Recent

View all

Gemini 2.5 Flash

41 tokens/sec

GPT-4o

59 tokens/sec

Gemini 2.5 Pro Pro

41 tokens/sec

o3 Pro

7 tokens/sec

GPT-4.1 Pro

50 tokens/sec

DeepSeek R1 via Azure Pro

28 tokens/sec

2000 character limit reached

93 5 5 41

Dialect prejudice predicts AI decisions about people's character, employability, and criminality (2403.00742v1)

Published 1 Mar 2024 in cs.CL, cs.AI, and cs.CY

Abstract: Hundreds of millions of people now interact with LLMs, with uses ranging from serving as a writing aid to informing hiring decisions. Yet these LLMs are known to perpetuate systematic racial prejudices, making their judgments biased in problematic ways about groups like African Americans. While prior research has focused on overt racism in LLMs, social scientists have argued that racism with a more subtle character has developed over time. It is unknown whether this covert racism manifests in LLMs. Here, we demonstrate that LLMs embody covert racism in the form of dialect prejudice: we extend research showing that Americans hold raciolinguistic stereotypes about speakers of African American English and find that LLMs have the same prejudice, exhibiting covert stereotypes that are more negative than any human stereotypes about African Americans ever experimentally recorded, although closest to the ones from before the civil rights movement. By contrast, the LLMs' overt stereotypes about African Americans are much more positive. We demonstrate that dialect prejudice has the potential for harmful consequences by asking LLMs to make hypothetical decisions about people, based only on how they speak. LLMs are more likely to suggest that speakers of African American English be assigned less prestigious jobs, be convicted of crimes, and be sentenced to death. Finally, we show that existing methods for alleviating racial bias in LLMs such as human feedback training do not mitigate the dialect prejudice, but can exacerbate the discrepancy between covert and overt stereotypes, by teaching LLMs to superficially conceal the racism that they maintain on a deeper level. Our findings have far-reaching implications for the fair and safe employment of language technology.

PDF HTML Abstract

Exploring the Covert Racial Bias in AI LLMs through Dialect Prejudice

Introduction

Recent advancements in NLP have seen the blossoming of various applications of LLMs (LMs), ranging from writing aids to tools informing employment decisions. With such widespread utility comes the crucial question of bias in AI systems, especially racial bias, which has been documented in cases relating to African American English (AAE). While extensive research exists on overt racial prejudice in LLMs, the subtle nuances of covert racism, especially in the form of dialect prejudice, have not been fully explored. This paper presents an empirical investigation into dialect prejudice in LLMs, revealing a bias in AI decisions based on dialects indicative of a speaker's racial background.

The focus is on the extent to which LLMs embed covert racism by exhibiting bias against the AAE dialect, a component of covert racism.

Approach

This paper employs Matched Guise Probing, which adapts the matched guise technique from sociolinguistics to the written domain, enabling an examination of the biases held by LMs against texts written in AAE compared to Standard American English (SAE). The approach embeds AAE or SAE text in prompts, asking the LMs to make assumptions about the speaker's character, employability, and criminality without overt references to race. This strategy probes the covert stereotypes within LMs by focusing on dialect features rather than explicit racial identifiers.

Illustrating this through different experiments, the paper highlights that LLMs, including GPT-2, GPT-3.5, GPT-4, RoBERTa, and T5, consistently assign more negative attributes and outcomes to AAE speakers. This unveils a striking discrepancy between the overtly positive attributes associated with African Americans and the covert negative stereotypes triggered by the AAE dialect in these models.

Study 1: Covert Stereotypes in LLMs

Matching the setup of the Princeton Trilogy studies on racial stereotypes, the research uncovers that LLMs align more closely with archaic human stereotypes from before the civil rights movement. This suggests that LMs covertly harbor the most negative stereotypes about African Americans ever experimentally recorded, contrary to the more positive overt assertions about African Americans typically generated by these models.

Study 2: Impact of Covert Stereotypes on AI Decisions

Exploring the real-world implications of dialect prejudice, the paper demonstrates that LMs are more likely to associate speakers of AAE with less prestigious jobs, criminal convictions, and even death penalties. These outcomes reflect a significant bias and the potential for substantial harm when language technology is applied in critical domains like employment and law enforcement.

Study 3: Resolvability of Dialect Prejudice

Analyzing potential mitigation strategies such as scaling model size and training with human feedback, the research finds that neither approach effectively reduces the observed dialect prejudice. Surprisingly, larger models and those trained with human feedback exhibit greater covert racial prejudice, suggesting that current methods for bias mitigation may not address the subtleties of covert racism in LLMs.

Discussion

The findings of this paper underscore a deep-seated issue of covert racism manifesting through dialect prejudice within current LLMs. This reflects not only the biases present in the underlying training data but also the complex nature of societal racial attitudes that these models inadvertently learn and perpetuate. As AI continues to integrate into various societal sectors, addressing these covert prejudices becomes crucial for developing equitable and unbiased AI systems.

Conclusion

This paper has shed light on the covert racial biases present in LLMs, particularly through the lens of dialect prejudice. By revealing the extent to which current LMs associate negative stereotypes and outcomes with AAE, it calls for a deeper examination of bias in AI and the development of more sophisticated approaches to mitigate racial prejudice in language technology.

PDF Markdown Bookmark Chat (Pro)

References (109)

Authors (4)

Valentin Hofmann (21 papers)
Pratyusha Ria Kalluri (2 papers)
Dan Jurafsky (118 papers)
Sharese King (1 paper)

Citations (24)

View on Semantic Scholar

GitHub

GitHub - valentinhofmann/dialect-prejudice (35 stars)

Tweets

https://twitter.com/vjhofmann/status/1764687457302253783

https://twitter.com/venikunche/status/1767621236589220111

https://twitter.com/mario_gug/status/1765083670594121900

https://twitter.com/AMadmanNotABox/status/1822237600322658404

https://twitter.com/bobehayes/status/1766713094607122554

https://twitter.com/z_jaggers/status/1782824805311607015

YouTube

Show All Videos

HackerNews

Dialect prejudice predicts AI decisions about people's character (4 points, 2 comments)
Dialect predicts AI decisions about character, employability, and criminality (1 point, 1 comment)