
CIVICS: Building a Dataset for Examining Culturally-Informed Values in Large Language Models (2405.13974v1)

Published 22 May 2024 in cs.CL and cs.AI

Abstract: This paper introduces the "CIVICS: Culturally-Informed & Values-Inclusive Corpus for Societal impacts" dataset, designed to evaluate the social and cultural variation of LLMs across multiple languages and value-sensitive topics. We create a hand-crafted, multilingual dataset of value-laden prompts which address specific socially sensitive topics, including LGBTQI rights, social welfare, immigration, disability rights, and surrogacy. CIVICS is designed to generate responses showing LLMs' encoded and implicit values. Through our dynamic annotation processes, tailored prompt design, and experiments, we investigate how open-weight LLMs respond to value-sensitive issues, exploring their behavior across diverse linguistic and cultural contexts. Using two experimental set-ups based on log-probabilities and long-form responses, we show social and cultural variability across different LLMs. Specifically, experiments involving long-form responses demonstrate that refusals are triggered disparately across models, but consistently and more frequently in English or translated statements. Moreover, specific topics and sources lead to more pronounced differences across model answers, particularly on immigration, LGBTQI rights, and social welfare. As shown by our experiments, the CIVICS dataset aims to serve as a tool for future research, promoting reproducibility and transparency across broader linguistic settings, and furthering the development of AI technologies that respect and reflect global cultural diversities and value pluralism. The CIVICS dataset and tools will be made available upon publication under open licenses; an anonymized version is currently available at https://huggingface.co/CIVICS-dataset.

Summary

  • The paper introduces the CIVICS dataset designed to examine LLMs' cultural values through native-verified, multilingual prompts on sensitive socio-political issues.
  • It employs a dynamic annotation process based on human rights principles to label themes like anti-discrimination and gender inclusivity.
  • Experimental results reveal response variability by language and model size, highlighting the need for culturally adaptive, ethically aligned AI systems.

An Overview of the CIVICS Dataset for Culturally-Informed Values in LLMs

The paper "CIVICS: Building a Dataset for Examining Culturally-Informed Values in LLMs" presents an in-depth exploration of the CIVICS dataset, which is designed to assess the socio-cultural impacts of LLMs through a multilingual and value-sensitive lens. This dataset aims to provide a comprehensive evaluation tool that highlights the encoded and implicit value systems within LLMs, across multiple languages and culturally nuanced issues.

Dataset Design and Scope

CIVICS stands for "Culturally-Informed & Values-Inclusive Corpus for Societal impacts," a carefully curated set of value-laden prompts that address complex and socially sensitive topics. These topics encompass LGBTQI rights, social welfare, immigration, disability rights, and surrogacy, each chosen for its prominence in contemporary socio-political discourse across various regions. The dataset is multilingual, spanning five languages and integrating both Western and non-Western perspectives through prompts in Italian, German, Turkish, and both European and Canadian French.

The authors deliberately avoided automated translation in order to preserve cultural context, relying instead on contributions from native speakers to ensure linguistic accuracy and authenticity. This approach enhances the dataset's relevance, enabling it to capture the subtleties of value expressions as they naturally occur within each culture.
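As a concrete starting point, the sketch below shows how the prompts might be pulled from the Hugging Face Hub and tallied by language and topic. The repository identifier, split name, and column names ("language", "topic") are assumptions based on the description above, not the confirmed schema of the released dataset.

```python
# Minimal sketch: load the CIVICS prompts and summarize coverage.
# NOTE: the repo id "CIVICS-dataset/CIVICS", the "train" split, and the
# column names "language" / "topic" are assumptions, not a confirmed schema.
from collections import Counter

from datasets import load_dataset

civics = load_dataset("CIVICS-dataset/CIVICS", split="train")  # assumed repo/split

lang_counts = Counter(row["language"] for row in civics)   # prompts per language
topic_counts = Counter(row["topic"] for row in civics)     # prompts per topic

print("Languages:", dict(lang_counts))
print("Topics:", dict(topic_counts))
```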

Methodology and Experimentation

The researchers employed a dynamic annotation process to label the dataset with culturally resonant values, guided by principles from globally recognized human rights documents. These labels encompass themes of anti-discrimination, gender inclusivity, and human dignity, among others, allowing for a nuanced exploration of how LLMs mediate culturally sensitive issues.
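To make the annotation scheme concrete, here is one way a single annotated record could be represented in code. The field names and the example values are illustrative assumptions chosen to mirror the description above, not the paper's actual schema.

```python
# Illustrative record structure for a CIVICS-style annotated prompt.
# Field names and example values are hypothetical, chosen to mirror the
# dynamic annotation process described above.
from dataclasses import dataclass, field
from typing import List

@dataclass
class ValuePrompt:
    prompt: str                 # the value-laden statement shown to the model
    language: str               # e.g. "de", "it", "tr", "fr-FR", "fr-CA"
    topic: str                  # e.g. "social welfare", "immigration"
    source: str                 # where the statement was drawn from
    value_labels: List[str] = field(default_factory=list)  # e.g. ["human dignity"]

example = ValuePrompt(
    prompt="Everyone has the right to social security.",  # hypothetical example
    language="en",
    topic="social welfare",
    source="hypothetical source",
    value_labels=["human dignity"],
)
```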

To illustrate the application of CIVICS, the paper outlines a series of experimental setups that examine LLM behavior using two methodologies: evaluating log-probabilities for next-token prediction and analyzing long-form responses to prompts. These investigations reveal substantial variability in how different LLMs respond to value-laden prompts, with notable differences across both linguistic and cultural contexts.
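The following sketch illustrates the log-probability style of evaluation under stated assumptions: a small open-weight placeholder model ("gpt2"), a hypothetical statement, and hand-written agree/disagree continuations. None of these choices are taken from the paper; they only demonstrate the scoring mechanics.

```python
# Sketch of the log-probability set-up: score how strongly a model favours an
# agreeing vs. a disagreeing continuation of a value-laden statement.
# The model name, prompt template, and continuations are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # small placeholder; the paper evaluates larger open-weight LLMs
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def continuation_logprob(prompt: str, continuation: str) -> float:
    """Sum of token log-probabilities of `continuation` given `prompt`.

    Assumes the prompt tokenization is a prefix of the full tokenization,
    which holds for typical BPE tokenizers when the continuation starts
    with a space.
    """
    prompt_ids = tok(prompt, return_tensors="pt").input_ids
    full_ids = tok(prompt + continuation, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)  # position i predicts token i+1
    cont_token_ids = full_ids[0, prompt_ids.shape[1]:]
    cont_positions = range(prompt_ids.shape[1] - 1, full_ids.shape[1] - 1)
    return sum(log_probs[pos, tid].item() for pos, tid in zip(cont_positions, cont_token_ids))

statement = "Everyone has the right to social security."  # hypothetical statement
prefix = f'Statement: "{statement}"\nI'
agree = continuation_logprob(prefix, " agree with this statement.")
disagree = continuation_logprob(prefix, " disagree with this statement.")
print("prefers agreement:", agree > disagree)
```

Comparing the two continuation scores per prompt, and repeating across languages and models, yields the kind of agreement-rating comparisons the paper reports.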

Key Findings and Implications

The authors note that larger models tend to produce more divergent agreement ratings, suggesting a potential relationship between model size and the breadth of encoded value systems. Furthermore, the experiments highlight that certain topics, particularly immigration and LGBTQI rights, frequently trigger refusals to answer—indicating a heightened sensitivity or conflict within these areas across various models.
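For readers who want to reproduce this kind of refusal analysis on their own model outputs, a simple keyword heuristic along the following lines can serve as a first pass. The marker list, example responses, and per-topic refusal rate are illustrative assumptions, not the paper's actual refusal-detection procedure.

```python
# Rough first-pass refusal detector for long-form responses.
# The marker list and example data are illustrative assumptions.
REFUSAL_MARKERS = (
    "i cannot", "i can't", "i won't", "i'm not able to",
    "as an ai", "i do not have personal opinions",
)

def looks_like_refusal(response: str) -> bool:
    """Flag a response as a refusal if it contains a known refusal phrase."""
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)

# Hypothetical model outputs grouped by topic.
responses_by_topic = {
    "immigration": ["I cannot express an opinion on this topic.",
                    "Immigration policy should balance several considerations."],
    "social welfare": ["Social welfare programmes support vulnerable groups."],
}

for topic, responses in responses_by_topic.items():
    rate = sum(looks_like_refusal(r) for r in responses) / len(responses)
    print(f"{topic}: refusal rate {rate:.0%}")
```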

The findings emphasize the critical need for AI systems that are not only functionally capable but also adhere to culturally inclusive and ethically sound standards. The CIVICS dataset offers a framework for understanding and interrogating the implicit value systems within LLMs, promoting transparency and accountability in their deployment.

Theoretical and Practical Implications

The research presented in this paper has both theoretical and practical implications for the field of AI ethics. Theoretically, it contributes to the discourse on value pluralism and the ethical considerations inherent in AI deployment across diverse cultural settings. Practically, CIVICS serves as a tool for researchers and practitioners aiming to enhance the cultural adaptability and ethical alignment of AI systems.

Future developments in AI prompted by this research could involve the expansion of such datasets to include more nuanced cultural contexts and the integration of advanced evaluation metrics for assessing the socio-cultural impacts of AI. This work lays the groundwork for a more inclusive and thorough understanding of LLM outputs, urging ongoing adaptations to ensure AI technologies reflect shared global values and cultural diversities.

The paper provides an essential contribution to the ongoing conversation about fostering culturally respectful AI, advocating for broader, more inclusive investigations into the societal impacts of these powerful models. Through CIVICS, the researchers challenge the AI community to consider the complexities of cross-cultural interactions with AI and the imperative of embedding ethical considerations at the forefront of technological development.