Helpful assistant or fruitful facilitator? Investigating how personas affect language model behavior (2407.02099v1)
Abstract: One way to personalize and steer generations from large language models (LLMs) is to assign a persona: a role that describes how the user expects the LLM to behave (e.g., a helpful assistant, a teacher, a woman). This paper investigates how personas affect diverse aspects of model behavior. We assign 162 personas from 12 categories, spanning variables such as gender, sexual orientation, and occupation, to seven LLMs. We prompt the models to answer questions from five datasets covering objective tasks (e.g., questions about math and history) and subjective tasks (e.g., questions about beliefs and values). We also compare the personas' generations to two baseline settings: a control persona setting with 30 paraphrases of "a helpful assistant" to account for the models' prompt sensitivity, and an empty persona setting where no persona is assigned. We find that, for all models and datasets, personas show greater variability than the control setting, and that some measures of persona behavior generalize across models.
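The setup described above can be sketched as persona-conditioned prompting: a persona string is prepended to the question (here as a chat system message), with a paraphrase-control list and an empty-persona baseline. This is a minimal illustrative sketch; the persona phrasing, chat format, and helper names are assumptions, not the authors' exact implementation.

```python
from typing import Optional


def build_messages(persona: Optional[str], question: str) -> list:
    """Build a chat prompt; the persona (if any) goes into a system message.

    Passing persona=None reproduces the paper's "empty persona" baseline,
    where no role is assigned.
    """
    messages = []
    if persona is not None:
        # Hypothetical template; the paper's exact wording may differ.
        messages.append({"role": "system", "content": f"You are {persona}."})
    messages.append({"role": "user", "content": question})
    return messages


# Illustrative samples: the study uses 162 personas from 12 categories
# and 30 paraphrases of "a helpful assistant" as the control setting.
personas = ["a helpful assistant", "a teacher", "a woman"]
control_paraphrases = ["a helpful assistant", "an assistant that is helpful"]

msgs = build_messages(personas[1], "Who wrote Biometrika's 1938 paper on rank correlation?")
```

Each persona (and each control paraphrase) is run against the same question set, so that variability across personas can be compared against variability across paraphrases of the neutral control.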
Authors: Pedro Henrique Luz de Araujo, Benjamin Roth