Evaluating LLMs for Gender Disparities in Notable Persons (2403.09148v1)
Abstract: This study examines the use of LLMs for retrieving factual information, addressing concerns over their propensity to produce factually incorrect "hallucinated" responses or to decline to answer a prompt altogether. Specifically, it investigates the presence of gender-based biases in LLMs' responses to factual inquiries. This paper takes a multi-pronged approach to evaluating GPT models, assessing fairness across three dimensions: recall, hallucinations, and declinations. Our findings reveal discernible gender disparities in the responses generated by GPT-3.5. While advancements in GPT-4 have improved performance, they have not fully eliminated these gender disparities, notably in instances where responses are declined. The study further explores the origins of these disparities by examining the influence of gender associations in prompts and the homogeneity of the responses.
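As a rough illustration of the evaluation dimensions named in the abstract, the sketch below computes recall, hallucination rate, and declination rate separately per gender group. This is not the authors' code; the `responses` structure and its field names are illustrative assumptions.

```python
# Minimal sketch (assumed data layout, not the paper's implementation):
# aggregate recall, hallucination, and declination rates by gender group.
from collections import defaultdict

def evaluate_by_gender(responses):
    """Each response dict is assumed to hold:
       gender     -- gender associated with the queried person
       declined   -- True if the model refused to answer
       facts_true -- ground-truth facts about the person
       facts_said -- facts asserted in the model's answer
    """
    stats = defaultdict(lambda: {"n": 0, "declined": 0, "recalled": 0,
                                 "expected": 0, "asserted": 0, "hallucinated": 0})
    for r in responses:
        s = stats[r["gender"]]
        s["n"] += 1
        if r["declined"]:
            s["declined"] += 1
            continue
        truth, said = set(r["facts_true"]), set(r["facts_said"])
        s["expected"] += len(truth)
        s["recalled"] += len(truth & said)      # correct facts retrieved
        s["asserted"] += len(said)
        s["hallucinated"] += len(said - truth)  # asserted facts not in ground truth

    return {
        gender: {
            "declination_rate": s["declined"] / s["n"],
            "recall": s["recalled"] / s["expected"] if s["expected"] else None,
            "hallucination_rate": s["hallucinated"] / s["asserted"] if s["asserted"] else None,
        }
        for gender, s in stats.items()
    }
```

Comparing these per-group rates (e.g., female vs. male subjects) is one simple way to surface the kind of disparities the paper reports for GPT-3.5 and GPT-4.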
Authors: Lauren Rhue, Sofie Goethals, Arun Sundararajan