- The paper shows that Japanese LLMs fine-tuned with culturally specific (JCM) data yield more accurate commonsense morality predictions than those using Western-centric (ETHICS) data.
- The paper employs comparative experiments with the JCM and ETHICS datasets and finds that translated training data can mitigate but not fully bridge cultural gaps.
- The paper underscores the need for diverse, culture-sensitive datasets to align AI systems with a broad range of moral values and better serve non-English-speaking users.
Cross-Cultural Alignment in LLMs and Its Impact on Commonsense Morality
The paper "Does Cross-Cultural Alignment Change the Commonsense Morality of LLMs?" presents a comprehensive investigation into the implications of aligning multilingual LLMs with predominantly English datasets, focusing specifically on the assessment of commonsense morality within Japanese LLMs. The authors methodically explore the extent to which cultural values and norms are represented in LLMs that have been fine-tuned using English-centric resources, thereby potentially marginalizing non-English-speaking users' preferences.
The core experiment evaluates the alignment of Japanese LLMs against two resources: the JCommonsenseMorality (JCM) dataset, which encodes Japanese cultural norms, and the ETHICS dataset, which largely reflects Western norms. The authors fine-tune Japanese LLMs on each dataset and compare the results: models fine-tuned on JCM predict commonsense morality within Japanese cultural contexts more accurately than those aligned with ETHICS.
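To make the setup concrete, here is a minimal sketch of what fine-tuning on a binary morality-judgment dataset might look like. The base model, the inline JCM-style examples, and the hyperparameters are illustrative assumptions, not the paper's actual configuration.

```python
# Minimal sketch: fine-tune a classifier on JCM-style binary morality labels.
# Model choice and examples are placeholders, not the paper's configuration.
import torch
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

MODEL = "bert-base-multilingual-cased"  # placeholder; the paper uses Japanese LLMs

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2)

# Hypothetical JCM-style records: sentence + label (1 = morally wrong, 0 = acceptable).
examples = [
    {"text": "電車の中で大声で電話をした。", "label": 1},  # talked loudly on the phone on a train
    {"text": "落とし物を駅員に届けた。", "label": 0},      # handed lost property to station staff
]

class MoralityDataset(torch.utils.data.Dataset):
    """Wraps tokenized sentences and labels in the format Trainer expects."""
    def __init__(self, records):
        self.enc = tokenizer([r["text"] for r in records],
                             truncation=True, padding=True, return_tensors="pt")
        self.labels = torch.tensor([r["label"] for r in records])
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: v[i] for k, v in self.enc.items()}
        item["labels"] = self.labels[i]
        return item

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="jcm-ft", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=MoralityDataset(examples),
)
trainer.train()
```

The same loop, pointed at ETHICS instead of JCM, yields the comparison model; the interesting variable is the training data, not the training procedure.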
Interestingly, LLMs trained on the JCM dataset translated into English outperform those trained on the ETHICS dataset translated into Japanese, suggesting that a language mismatch is a smaller obstacle to alignment than a cultural mismatch. This underscores the importance of accounting for cultural nuance when training multilingual LLMs.
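Comparing these conditions comes down to held-out accuracy on the culturally native test set. The sketch below continues the fine-tuning sketch above (reusing its `model` and `tokenizer`); the two test records are hypothetical, and the real comparison would repeat this loop for each training condition (JCM, ETHICS, and their translations).

```python
# Continues the previous sketch: held-out accuracy on JCM-style test items.
import torch

def predict(text: str) -> int:
    """Predicted label: 1 = morally wrong, 0 = acceptable."""
    inputs = tokenizer(text, truncation=True, return_tensors="pt")
    with torch.no_grad():
        return model(**inputs).logits.argmax(dim=-1).item()

test_set = [  # hypothetical JCM-style test items
    {"text": "順番待ちの列に割り込んだ。", "label": 1},  # cut into a waiting line
    {"text": "友人の引っ越しを手伝った。", "label": 0},  # helped a friend move
]
accuracy = sum(predict(r["text"]) == r["label"] for r in test_set) / len(test_set)
print(f"accuracy: {accuracy:.2%}")
```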
A salient aspect of the paper is the assessment of models trained on Chatbot Arena Conversations translated into Japanese, with preferences annotated by a multilingual reward model. Despite the dataset's English origins, this approach improves the model's alignment in Japanese, possibly because translation preserves linguistic and cultural context better than expected. These models nonetheless fall short of those trained directly on the culturally native datasets, pointing to an intrinsic difficulty in producing culturally specific moral reasoning from translated resources and generalized annotation frameworks.
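As a rough illustration of the preference-annotation step, the sketch below scores two candidate responses to a translated prompt with an off-the-shelf reward model and keeps the higher-scored one as "chosen". The reward model shown is a stand-in, not necessarily the multilingual reward model the paper used, and the conversation is invented.

```python
# Sketch: annotate translated conversations with a reward model to build
# preference pairs. The reward model id is a stand-in; the prompt and
# responses are invented, not real Chatbot Arena data.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

RM = "OpenAssistant/reward-model-deberta-v3-large-v2"  # stand-in reward model
tokenizer = AutoTokenizer.from_pretrained(RM)
reward_model = AutoModelForSequenceClassification.from_pretrained(RM)

def reward(prompt: str, response: str) -> float:
    """Scalar score for a (prompt, response) pair; higher = preferred."""
    inputs = tokenizer(prompt, response, truncation=True, return_tensors="pt")
    with torch.no_grad():
        return reward_model(**inputs).logits[0].item()

prompt = "東京でおすすめの観光地はどこですか？"  # a translated Arena-style prompt
responses = ["浅草寺や明治神宮が定番です。", "知りません。"]

scored = sorted(responses, key=lambda r: reward(prompt, r), reverse=True)
pair = {"prompt": prompt, "chosen": scored[0], "rejected": scored[-1]}
print(pair)  # such pairs feed preference tuning (e.g., DPO) of the Japanese LLM
```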
The implications of this work are twofold. Practically, it highlights the need for culturally diverse datasets when aligning LLMs across languages, so that AI systems serve diverse user bases without being biased toward any single cultural perspective. Theoretically, it raises questions about whether LLMs can generalize moral reasoning across cultures at all when trained primarily on datasets reflecting a narrow range of cultural values.
Going forward, research can explore how to improve cross-cultural transfer in multilingual models, for example by developing more sophisticated cross-cultural annotation methodologies or by leveraging unsupervised and semi-supervised techniques that natively incorporate cultural diversity. The paper thus makes a crucial contribution to the discourse on linguistic and cultural inclusivity in AI, pointing toward a future in which models are aligned with a truly global set of human values.