Investigating Subtler Biases in LLMs: Ageism, Beauty, Institutional, and Nationality Bias in Generative Models (2309.08902v3)
Abstract: LLMs are increasingly powerful and widely used to assist users in a variety of tasks. This use risks the introduction of LLM biases to consequential decisions such as job hiring, human performance evaluation, and criminal sentencing. Bias in NLP systems along the lines of gender and ethnicity has been widely studied, especially for specific stereotypes (e.g., Asians are good at math). In this paper, we investigate bias along less-studied but still consequential dimensions, such as age and beauty, measuring subtler correlated decisions that LLMs make between social groups and unrelated positive and negative attributes. We ask whether LLMs hold wide-reaching biases of positive or negative sentiment for specific social groups similar to the "what is beautiful is good" bias found in people in experimental psychology. We introduce a template-generated dataset of sentence completion tasks that asks the model to select the most appropriate attribute to complete an evaluative statement about a person described as a member of a specific social group. We also reverse the completion task to select the social group based on an attribute. We report the correlations that we find for 4 cutting-edge LLMs. This dataset can be used as a benchmark to evaluate progress in more generalized biases and the templating technique can be used to expand the benchmark with minimal additional human annotation.
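The templating idea in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' dataset or code: the descriptors, attributes, template wording, and function names are all hypothetical placeholders chosen to show the forward task (pick an attribute for a group) and the reversed task (pick a group for an attribute).

```python
# Hypothetical placeholders for illustration only; none of these strings
# are taken from the paper's dataset.
GROUP_DESCRIPTORS = {"age": ["a 25-year-old", "a 75-year-old"]}
ATTRIBUTES = {"positive": ["friendly"], "negative": ["rude"]}

FORWARD_TEMPLATE = "{group} person walked in; they seem to be ____."
REVERSE_TEMPLATE = "The person who walked in seems {attribute}; they are probably ____."


def build_forward_items(descriptors, attributes, template):
    """Group is given in the prompt; the model chooses among attributes."""
    options = [a for words in attributes.values() for a in words]
    items = []
    for dimension, groups in descriptors.items():
        for group in groups:
            items.append({
                "dimension": dimension,
                "prompt": template.format(group=group),
                "options": options,
            })
    return items


def build_reverse_items(descriptors, attributes, template):
    """Attribute is given in the prompt; the model chooses among groups."""
    items = []
    for dimension, groups in descriptors.items():
        for polarity, words in attributes.items():
            for attribute in words:
                items.append({
                    "dimension": dimension,
                    "polarity": polarity,
                    "prompt": template.format(attribute=attribute),
                    "options": list(groups),
                })
    return items


forward_items = build_forward_items(GROUP_DESCRIPTORS, ATTRIBUTES, FORWARD_TEMPLATE)
reverse_items = build_reverse_items(GROUP_DESCRIPTORS, ATTRIBUTES, REVERSE_TEMPLATE)
```

Because every item is produced from word lists and a template, adding a new descriptor or attribute to the lists extends the item set without writing new sentences by hand.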
- Mahammed Kamruzzaman
- Md. Minul Islam Shovon
- Gene Louis Kim