Analyzing Cultural Perceptions in LLMs Through Prompt-Based Generations
The paper "\dataEmojiCulture-Gen: Revealing Global Cultural Perception in LLMs through Natural Language Prompting" explores the examination of cultural biases and knowledge representation inherent to LLMs. The authors critically evaluate the cultural perceptions exhibited by state-of-the-art (SOTA) models, namely GPT-4, LLaMA2-13B, and Mistral-7B, analyzing their ability to generate knowledge about global cultures using cultural context prompts and unveiling underlying biases.
Methodology and Dataset Construction
The authors present a natural-language-prompting framework, designed to minimize human curation bias, for extracting cultural perceptions from LLMs. The models are prompted to generate culture-specific text for 110 countries and regions across eight cultural topics, such as food, music, and exercise. The resulting generations reveal the symbolic associations each model forms with each culture.
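To make the setup concrete, here is a minimal sketch of culture-conditioned prompting; the template wording and the topic and culture subsets are illustrative assumptions, not the paper's verbatim prompts.

```python
# A minimal sketch of culture-conditioned prompting. The template wording and
# the subsets below are illustrative, not the paper's verbatim prompts.
TOPICS = ["food", "music", "exercise"]          # three of the eight cultural topics
CULTURES = ["Japanese", "Nigerian", "French"]   # three of the 110 countries/regions

def build_prompt(culture: str, topic: str) -> str:
    """Build one culture-conditioned prompt for a (culture, topic) pair."""
    return f"My neighbor is {culture}. When it comes to {topic}, my neighbor likes"

# One prompt per (culture, topic) pair; the full study covers 110 x 8 pairs.
prompts = [build_prompt(c, t) for c in CULTURES for t in TOPICS]
print(prompts[0])
```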
Through these controlled prompts, the researchers built the Culture-Gen dataset. After generation, they applied an unsupervised sentence-probability ranking method to assign cultural symbols, defined as entities in model outputs that are perceived to belong to specific cultures, to the cultures they are most strongly associated with.
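A hedged sketch of what such sentence-probability ranking might look like, using a Hugging Face causal LM (GPT-2 as a small stand-in) and an assumed probe-sentence wording:

```python
# Sketch of unsupervised sentence-probability ranking: score a probe sentence
# linking a symbol to each candidate culture, then rank by log-probability.
# GPT-2 and the probe wording are stand-in assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

@torch.no_grad()
def sentence_log_prob(sentence: str) -> float:
    """Total log-probability of a sentence under the language model."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    out = model(ids, labels=ids)
    # out.loss is the mean negative log-likelihood per predicted token;
    # multiply by the number of predicted tokens to recover the sum.
    return -out.loss.item() * (ids.shape[1] - 1)

def rank_cultures(symbol: str, topic: str, cultures: list[str]) -> list[tuple[str, float]]:
    """Rank candidate cultures by how probable the model finds the association."""
    scores = {
        c: sentence_log_prob(f"{symbol} is a {topic} from {c} culture.")
        for c in cultures
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

print(rank_cultures("sushi", "food", ["Japanese", "Italian", "Mexican"]))
```

Ranking by total log-probability rather than a per-token average is a design choice in this sketch; either normalization works as long as it is applied consistently across candidate cultures.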
Analysis of Cultural Representation
The core of the paper's findings is the significant variability in how these LLMs depict different cultures. Two issues are examined in depth, cultural markedness and diversity of representation:
- Cultural Markedness: The models sometimes preface cultural symbols with linguistic markers such as "traditional", predominantly in generations for Asian, Middle Eastern, and African-Islamic regions, effectively 'othering' these cultures relative to Western defaults.
- Symbol Diversity: A second dimension of cultural perception is the richness of cultural knowledge, measured by the diversity of symbols a model generates. The authors observe clear disparities: Western-centric countries receive a richer variety of symbols than many non-Western regions. A toy sketch of both measurements follows this list.
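The following toy sketch illustrates both measurements; the marker vocabulary and the distinct-count diversity metric are simplifying assumptions:

```python
# Toy versions of the two measurements. The marker vocabulary and the
# distinct-count diversity metric are simplifying assumptions.
MARKERS = {"traditional", "authentic", "exotic"}  # assumed markedness vocabulary

def markedness_rate(generations: list[str]) -> float:
    """Fraction of generations that contain at least one marker word."""
    hits = sum(any(m in g.lower() for m in MARKERS) for g in generations)
    return hits / len(generations)

def symbol_diversity(symbols: list[str]) -> int:
    """Diversity as the number of distinct symbols generated for one culture."""
    return len(set(symbols))

generations = ["my neighbor likes traditional folk music", "my neighbor likes jazz"]
symbols = ["folk music", "jazz", "jazz"]
print(markedness_rate(generations), symbol_diversity(symbols))  # 0.5 2
```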
Evaluating training-data effects, the research finds a strong correlation between the diversity of symbols in a model's outputs and the prevalence of culturally relevant documents in its training corpus. In particular, an analysis of the open RedPajama dataset illustrates how the frequency of cultural mentions in training data shapes the cultural knowledge that models reproduce.
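As an illustration, such a relationship can be tested as a rank correlation between per-culture mention counts and distinct-symbol counts; the numbers below are made up for demonstration:

```python
# Illustrative rank-correlation check between training-corpus mention counts
# (e.g. from RedPajama) and distinct-symbol counts. All numbers are hypothetical.
from scipy.stats import spearmanr

training_mentions = [120_000, 45_000, 9_000, 2_500, 800]  # per-culture mentions
distinct_symbols = [310, 240, 150, 90, 60]                # per-culture diversity

rho, p_value = spearmanr(training_mentions, distinct_symbols)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")
```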
Implications and Future Directions
The paper's insights into these discrepancies underscore the asymmetric cultural representation in several advanced LLMs. For the research community, such findings prompt discussions of cultural parity in AI models, emphasizing the need for broader training data and more sophisticated alignment techniques that account for diverse cultural perspectives.
In practical terms, these findings open avenues for improving AI systems' cultural alignment and representation, informing the design of more equitable model architectures and training datasets. Future work could incorporate more culturally diverse training data, strengthening multicultural proficiency and reducing biases that stem from dominant cultural narratives within AI systems.
In conclusion, the research lays the groundwork for methodologies that address cultural perceptions in LLMs, urging the community to train and evaluate AI with cultural awareness and equity at its core. The community must continue to explore how components such as instruction tuning and model alignment shape cultural generation behavior, paving the way for more inclusively capable AI systems.