Language Model Alignment in Multilingual Trolley Problems (2407.02273v5)
Abstract: We evaluate the moral alignment of LLMs with human preferences in multilingual trolley problems. Building on the Moral Machine experiment, which captures over 40 million human judgments across 200+ countries, we develop MultiTP, a cross-lingual corpus of moral-dilemma vignettes in over 100 languages. This dataset enables the assessment of LLMs' decision-making processes in diverse linguistic contexts. Our analysis explores the alignment of 19 different LLMs with human judgments, capturing preferences across six moral dimensions: species, gender, fitness, status, age, and the number of lives involved. By correlating these preferences with the demographic distribution of language speakers and examining the consistency of LLM responses under prompt paraphrasing, our findings provide insight into the cross-lingual and ethical biases of LLMs and their intersection. We find significant variance in alignment across languages, challenging the assumption of uniform moral reasoning in AI systems and highlighting the importance of incorporating diverse perspectives in AI ethics. The results underscore the need for further research on integrating multilingual dimensions into responsible AI research to ensure fair and equitable AI interactions worldwide. Our code and data are available at https://github.com/causalNLP/moralmachine.
Authors: Zhijing Jin, Sydney Levine, Max Kleiman-Weiner, Giorgio Piatti, Jiarui Liu, Francesco Ortu, András Strausz, Mrinmaya Sachan, Rada Mihalcea, Yejin Choi, Bernhard Schölkopf, Fernando Gonzalez