Semantics Derived Automatically from Language Corpora Necessarily Contain Human Biases
The paper "Semantics derived automatically from language corpora necessarily contain human biases" by Aylin Caliskan, Joanna J. Bryson, and Arvind Narayanan establishes a crucial insight into the intersection of AI and societal biases. The authors demonstrate that AI systems trained on natural language corpora can and do absorb human-like biases, which inherently reflect the cultural and societal prejudices from which the data originates.
Replication of Human Biases in AI
Using the GloVe word-embedding model trained on a large web corpus, the authors show that the resulting embeddings replicate a wide range of biases documented in human subjects through psychological studies, most notably the Implicit Association Test (IAT). Specifically, the paper replicates the association of flowers with pleasantness and insects with unpleasantness, and the analogous association of musical instruments with pleasantness and weapons with unpleasantness.
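To make the mechanism concrete, the sketch below shows how such an association can be measured: each word is represented by its embedding vector, and the bias appears as a difference in mean cosine similarity to "pleasant" versus "unpleasant" attribute words. This is an illustrative sketch, not the authors' released code; the `vectors` dictionary is assumed to hold pre-trained GloVe embeddings keyed by word, and the short word lists are samples in the spirit of the IAT stimuli rather than the paper's full lists.

```python
# Illustrative sketch: a word's differential association with "pleasant" vs.
# "unpleasant" attribute words, measured by cosine similarity in embedding space.
# Assumes `vectors` maps words to pre-trained GloVe vectors (numpy arrays),
# e.g. loaded from a public GloVe file; loading is left to the caller.
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(word, pleasant, unpleasant, vectors):
    """Mean similarity to pleasant words minus mean similarity to unpleasant words."""
    w = vectors[word]
    return (np.mean([cosine(w, vectors[a]) for a in pleasant])
            - np.mean([cosine(w, vectors[b]) for b in unpleasant]))

pleasant = ["love", "peace", "pleasure", "friend", "heaven"]
unpleasant = ["abuse", "crash", "filth", "murder", "sickness"]

# Flower words tend to score positive and insect words negative, mirroring the IAT:
#   association("daisy", pleasant, unpleasant, vectors)   -> typically > 0
#   association("spider", pleasant, unpleasant, vectors)  -> typically < 0
```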
Racial and Gender Biases
One of the significant findings of the paper is the replication of racial biases. Using names identified as typically African American or European American, the paper shows that European American names are more strongly associated with pleasantness. This result connects to real-world outcomes: it mirrors Bertrand and Mullainathan's field study of racial bias in job callbacks, illustrating that AI systems can inherit prejudices that affect employment opportunities based purely on applicant names.
Moreover, the research highlights gender biases, such as the association of male names with career terms and female names with family terms, as well as the stronger link of female terms to the arts and male terms to mathematics. Because these biases were derived directly from word embeddings, they point to a structural issue in the statistical representations that language models are built on, not a quirk of any particular application.
Methodology
The research introduces two novel methods: the Word Embedding Association Test (WEAT) and the Word Embedding Factual Association Test (WEFAT). WEAT measures semantic associations within word embeddings analogously to the IAT, while WEFAT correlates word-vector associations with real-world data, such as occupation demographics. The robustness of the methodology is evidenced by its application to a broad set of biases, yielding statistically significant results with large effect sizes.
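The sketch below follows the paper's description of WEAT: the test statistic sums the differential associations of two target sets with two attribute sets, the effect size is a standardized mean difference, and significance is assessed with a permutation test over re-partitions of the target words. The inputs are assumed to be lists of embedding vectors; the function names and implementation details are illustrative, not the authors' published code.

```python
# Sketch of the WEAT statistic, effect size, and permutation-test p-value,
# following the description in the paper. X, Y are the two target sets and
# A, B the two attribute sets, each given as a list of embedding vectors
# (numpy arrays); obtaining the vectors is left to the caller.
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def s_word(w, A, B):
    """Differential association of one target vector with the attribute sets."""
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def weat_statistic(X, Y, A, B):
    """Test statistic: summed association of X minus summed association of Y."""
    return sum(s_word(x, A, B) for x in X) - sum(s_word(y, A, B) for y in Y)

def weat_effect_size(X, Y, A, B):
    """Standardized mean difference (analogous to Cohen's d)."""
    all_s = [s_word(w, A, B) for w in list(X) + list(Y)]
    return ((np.mean([s_word(x, A, B) for x in X])
             - np.mean([s_word(y, A, B) for y in Y]))
            / np.std(all_s))

def weat_p_value(X, Y, A, B, n_perm=10_000, seed=0):
    """One-sided p-value from random equal-size re-partitions of X union Y."""
    rng = np.random.default_rng(seed)
    observed = weat_statistic(X, Y, A, B)
    pooled = list(X) + list(Y)
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        Xp = [pooled[i] for i in idx[:len(X)]]
        Yp = [pooled[i] for i in idx[len(X):]]
        exceed += weat_statistic(Xp, Yp, A, B) > observed
    return exceed / n_perm
```

WEFAT works analogously at the level of individual words: the same differential association is computed for each target word (for example, an occupation term) and then correlated with an external statistic, such as the percentage of women employed in that occupation.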
Implications and Future Directions
Practical Implications
The findings underscore the necessity of re-evaluating the deployment of AI systems in scenarios where bias can lead to discriminatory outcomes. As AI continues to integrate into critical sectors such as recruitment, criminal justice, and healthcare, understanding and mitigating inherited biases becomes paramount. Transparency in AI algorithms and diversity in development teams have been recommended as mitigation strategies, but they are insufficient on their own to eliminate such deeply ingrained biases.
Theoretical Implications
From a theoretical standpoint, the paper posits that language itself carries biases reflective of societal norms and historical contexts. This suggests a new null hypothesis for psychology and sociology: that exposure to language alone may account for certain implicit prejudices. It challenges existing models of how prejudice is transmitted, emphasizing the role of cultural and linguistic regularities.
Future Work
Future research could focus on creating and refining methodologies to identify and reduce biases in AI systems without compromising their understanding of language semantics. This involves interdisciplinary efforts, leveraging insights from cognitive science, ethics, and AI to develop systems that can recognize prejudices and act impartially. Investigating heterogeneous AI architectures that combine machine learning with rule-based systems may offer pathways to minimizing prejudicial impacts while retaining the benefits of AI-based insights.
Conclusion
The paper by Caliskan, Bryson, and Narayanan provides an essential contribution to understanding how AI systems can passively inherit biases from human language. While this finding indicates significant challenges, it also opens avenues for research aiming to create more equitable AI technologies. Addressing these biases head-on is crucial for the ethical advancement of AI in our increasingly automated world.