Exploring the Covert Racial Bias in AI LLMs through Dialect Prejudice
Introduction
Recent advancements in NLP have seen the blossoming of various applications of LLMs (LMs), ranging from writing aids to tools informing employment decisions. With such widespread utility comes the crucial question of bias in AI systems, especially racial bias, which has been documented in cases relating to African American English (AAE). While extensive research exists on overt racial prejudice in LLMs, the subtle nuances of covert racism, especially in the form of dialect prejudice, have not been fully explored. This paper presents an empirical investigation into dialect prejudice in LLMs, revealing a bias in AI decisions based on dialects indicative of a speaker's racial background.
The focus is on the extent to which LLMs embed covert racism by exhibiting bias against the AAE dialect, a component of covert racism.
Approach
This paper employs Matched Guise Probing, which adapts the matched guise technique from sociolinguistics to the written domain, enabling an examination of the biases held by LMs against texts written in AAE compared to Standard American English (SAE). The approach embeds AAE or SAE text in prompts, asking the LMs to make assumptions about the speaker's character, employability, and criminality without overt references to race. This strategy probes the covert stereotypes within LMs by focusing on dialect features rather than explicit racial identifiers.
Illustrating this through different experiments, the paper highlights that LLMs, including GPT-2, GPT-3.5, GPT-4, RoBERTa, and T5, consistently assign more negative attributes and outcomes to AAE speakers. This unveils a striking discrepancy between the overtly positive attributes associated with African Americans and the covert negative stereotypes triggered by the AAE dialect in these models.
Study 1: Covert Stereotypes in LLMs
Matching the setup of the Princeton Trilogy studies on racial stereotypes, the research uncovers that LLMs align more closely with archaic human stereotypes from before the civil rights movement. This suggests that LMs covertly harbor the most negative stereotypes about African Americans ever experimentally recorded, contrary to the more positive overt assertions about African Americans typically generated by these models.
Study 2: Impact of Covert Stereotypes on AI Decisions
Exploring the real-world implications of dialect prejudice, the paper demonstrates that LMs are more likely to associate speakers of AAE with less prestigious jobs, criminal convictions, and even death penalties. These outcomes reflect a significant bias and the potential for substantial harm when language technology is applied in critical domains like employment and law enforcement.
Study 3: Resolvability of Dialect Prejudice
Analyzing potential mitigation strategies such as scaling model size and training with human feedback, the research finds that neither approach effectively reduces the observed dialect prejudice. Surprisingly, larger models and those trained with human feedback exhibit greater covert racial prejudice, suggesting that current methods for bias mitigation may not address the subtleties of covert racism in LLMs.
Discussion
The findings of this paper underscore a deep-seated issue of covert racism manifesting through dialect prejudice within current LLMs. This reflects not only the biases present in the underlying training data but also the complex nature of societal racial attitudes that these models inadvertently learn and perpetuate. As AI continues to integrate into various societal sectors, addressing these covert prejudices becomes crucial for developing equitable and unbiased AI systems.
Conclusion
This paper has shed light on the covert racial biases present in LLMs, particularly through the lens of dialect prejudice. By revealing the extent to which current LMs associate negative stereotypes and outcomes with AAE, it calls for a deeper examination of bias in AI and the development of more sophisticated approaches to mitigate racial prejudice in language technology.