
Evaluating Large Language Models through Gender and Racial Stereotypes (2311.14788v1)

Published 24 Nov 2023 in cs.CL, cs.AI, and cs.CY

Abstract: LLMs have ushered in a new age of AI, gaining traction within the NLP community as well as among the general population. AI's ability to make predictions and generate text, and its use in sensitive decision-making scenarios, make it all the more important to study these models for biases that may exist and be amplified. We conduct a qualitative comparative study and establish a framework to evaluate LLMs for two kinds of bias, gender and racial, in a professional setting. We find that while gender bias has decreased markedly in newer models compared to older ones, racial bias still exists.

Introduction

LLMs have become integral to applications in sensitive decision-making scenarios, making the presence of biases within them a significant concern. This research evaluates two primary biases, gender and race, within a professional context. Left unaddressed, these biases could affect outcomes and perpetuate societal stereotypes. The paper uses a dataset of 99 professions to assess whether models exhibit bias when assigning a gender or race to each profession. While gender bias appears to be declining, racial bias still persists in LLMs.

Methodology

The paper employs a two-pronged approach: one for gender bias and another for racial bias. Gender bias is tested by tasking models with assigning a gender to different professions and comparing the results against human-annotated ground truth. The evaluation covers both older models (such as BERT and GPT-2) and newer ones (such as GPT-3.5 and Claude). Racial bias is assessed by generating descriptions of individuals of various races in different professions and analyzing the responses for stereotypes. The paper operationalizes societal bias as differences in judgment accuracy across gender, race, and social status.
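
A minimal sketch of the gender-assignment probe is shown below. The profession list, prompt wording, label set, and the query_model stub are illustrative assumptions, not the paper's exact setup; the stub stands in for whichever model is under test (e.g. GPT-3.5 or Flan-T5).

```python
from collections import Counter

# Hypothetical ground-truth annotations: each profession mapped to the label
# human annotators assigned. The paper uses 99 professions; three are shown.
GROUND_TRUTH = {
    "nurse": "neutral",
    "software engineer": "neutral",
    "firefighter": "neutral",
}

def query_model(prompt: str) -> str:
    """Stand-in for a call to the LLM under test. Replace with the actual
    API/client call; here it always answers 'female' so the script runs."""
    return "female"

def probe_gender_assignment(professions: dict) -> Counter:
    """Ask the model to assign a gender to each profession and tally
    agreement with the human-annotated ground truth."""
    results = Counter()
    for profession, expected in professions.items():
        prompt = (
            f"Assign a gender to the following profession: {profession}. "
            "Answer with 'male', 'female', or 'neutral'."
        )
        answer = query_model(prompt).strip().lower()
        results["match" if answer == expected else "mismatch"] += 1
    return results

if __name__ == "__main__":
    print(probe_gender_assignment(GROUND_TRUTH))
```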

Gender Analysis

Investigating gender bias, the paper finds that newer models such as GPT-3.5 improve substantially over older ones, with a marked reduction in gender bias. Challenges remain, however: models such as Flan-T5 still exhibit significant bias and fail to reflect recent shifts towards gender neutrality in professions. A bias score metric is used to compare models, showing that GPT-3.5 exhibits the least bias among those evaluated. The research highlights that while progress is evident, fully unbiased AI representations of gender in professions have not yet been achieved.
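
The summary does not reproduce the paper's exact bias score formula; the sketch below shows one plausible reading, scoring a model by the fraction of professions where its assigned label deviates from the human-annotated reference. The function name and example data are hypothetical.

```python
def bias_score(assignments: dict, ground_truth: dict) -> float:
    """Illustrative bias score (not necessarily the paper's formula): the
    fraction of professions where the model's label deviates from the
    human-annotated reference. Lower is better."""
    deviations = sum(
        1 for prof, label in assignments.items()
        if label != ground_truth.get(prof)
    )
    return deviations / len(assignments)

# Example: a model that genders two of four reference-neutral professions
# gets a score of 0.5.
gt = {"nurse": "neutral", "pilot": "neutral", "teacher": "neutral", "chef": "neutral"}
pred = {"nurse": "female", "pilot": "male", "teacher": "neutral", "chef": "neutral"}
print(bias_score(pred, gt))  # 0.5
```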

Race Analysis

In assessing racial bias, GPT-3.5 generates descriptions that adhere to stereotypes for different races across various professions. By measuring the similarity of responses and employing a Linguistic Inquiry and Word Count (LIWC) analysis, the paper shows noticeable differences in the emotional, social, and work-related attributes ascribed to different races. These inconsistencies reveal implicit biases where certain races are depicted with more emotive descriptors or differing attitudes towards work and social interactions.
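
As a rough illustration of the similarity comparison, the sketch below computes pairwise cosine similarity between generated descriptions using TF-IDF vectors. The exact similarity measure used by the paper is not specified in this summary, and the placeholder descriptions are invented stand-ins for model outputs; the LIWC category analysis is not reproduced here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Placeholder descriptions for one profession, keyed by the race mentioned in
# the prompt. In the study these texts come from GPT-3.5 and are then scored
# with LIWC; this sketch only compares their lexical similarity.
descriptions = {
    "race_a": "A dedicated, analytical engineer who values precision.",
    "race_b": "A warm, hardworking engineer devoted to family and community.",
}

labels = list(descriptions)
tfidf = TfidfVectorizer().fit_transform([descriptions[k] for k in labels])
sim = cosine_similarity(tfidf)

# Large gaps in similarity across races for the same profession hint at
# race-dependent framing worth inspecting with LIWC-style category counts.
for i, a in enumerate(labels):
    for j, b in enumerate(labels):
        if i < j:
            print(f"{a} vs {b}: {sim[i, j]:.2f}")
```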

Conclusion

The evaluation framework developed and applied in this paper demonstrates that, despite improvements, LLMs such as GPT-3.5 still exhibit biases related to gender and race. The research underlines the importance of continued efforts to mitigate these biases, suggesting that future studies could broaden the analysis to include other models and evaluate the impact of biases on human behavior more directly. The paper contributes to the critical discourse on creating fairer AI systems by providing a method to identify and measure the subtle prejudices that could influence real-world decisions.

Authors (1)
  1. Ananya Malik