Investigating Cultural Alignment of Large Language Models
Abstract: The intricate relationship between language and culture has long been a subject of exploration within linguistic anthropology. Large language models (LLMs), promoted as repositories of collective human knowledge, raise a pivotal question: do these models genuinely encapsulate the diverse knowledge held by different cultures? We find that models demonstrate greater cultural alignment along two dimensions: first, when prompted in the dominant language of a specific culture, and second, when pretrained on a mixture of languages weighted toward those employed by that culture. We quantify cultural alignment by simulating sociological surveys and comparing model responses to those of actual survey participants. Specifically, we replicate a survey conducted in various regions of Egypt and the United States by prompting LLMs with different pretraining data mixtures, in both Arabic and English, using the personas of the real respondents together with the survey questions. Further analysis reveals that misalignment is more pronounced for underrepresented personas and for culturally sensitive topics, such as those probing social values. Finally, we introduce Anthropological Prompting, a novel method that leverages anthropological reasoning to enhance cultural alignment. Our study underscores the need for more balanced multilingual pretraining data to better represent the diversity of human experience and the plurality of cultures, with implications for cross-lingual transfer.
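The survey-simulation idea above can be sketched minimally: for each real respondent, the model is prompted with that respondent's persona and a survey question, and alignment is scored by comparing the model's answer to the respondent's actual answer. The sketch below is illustrative only; the helper names, the linear ordinal similarity, and the toy data are assumptions, not the paper's exact metric.

```python
def ordinal_similarity(model_answer: int, human_answer: int, scale_size: int) -> float:
    """1.0 when the answers match; decreases linearly with distance
    on an ordinal survey scale of scale_size options (e.g. 1-4)."""
    return 1.0 - abs(model_answer - human_answer) / (scale_size - 1)

def cultural_alignment(responses):
    """responses: list of (model_answer, human_answer, scale_size) triples,
    one per (persona, question) pair. Returns mean similarity in [0, 1]."""
    scores = [ordinal_similarity(m, h, k) for (m, h, k) in responses]
    return sum(scores) / len(scores)

# Toy example: three persona/question pairs, each on a 4-point scale.
survey = [(1, 1, 4), (2, 4, 4), (3, 3, 4)]
print(round(cultural_alignment(survey), 3))  # → 0.778
```

An aggregate like this can then be broken down by persona group or question topic, which is how misalignment for underrepresented personas or sensitive topics would surface.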