- The paper introduces a quantitative framework comparing language model responses with Pew survey data across 60 U.S. demographic groups to assess opinion representativeness.
- The study finds that tuned language models often lean toward liberal, educated perspectives, diverging from the broader population's diverse views.
- The research demonstrates that despite explicit steering, language models remain only partially adjustable, leaving persistent gaps in demographic alignment.
Analyzing Opinion Reflection in LLMs: Alignment with Demographic Perspectives
The paper "Whose Opinions Do LLMs Reflect?" by Santurkar et al. introduces a quantitative framework for evaluating how LLMs (LMs) align with human opinions, particularly those from diverse demographic groups in the U.S. The work is grounded in the understanding that LMs, when deployed in applications like dialogue agents, have a significant influence on users' perceptions through the opinions they reflect. This paper addresses crucial questions regarding the demographic alignment of LMs and delves deep into understanding whether LMs can accurately represent or mimic human opinions.
Framework and Methodology
The research leverages public opinion surveys, specifically Pew Research Center's American Trends Panel (ATP), to create a dataset (OpinionQA) that enables the assessment of LMs' opinion alignment with human responses across a variety of topics, including climate change, automation, and abortion, providing a comprehensive landscape for analysis. The framework compares LM responses to survey questions with responses from 60 U.S. demographic groups, permitting precise measurement of the degree to which LMs' opinions resonate with those of different groups.
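The paper's alignment measure is based on comparing the model's answer distribution with the (survey-weighted) human answer distribution for each multiple-choice question, using a 1-Wasserstein distance over the ordered answer options. The sketch below illustrates such a computation; the helper names and the toy numbers are illustrative, not taken from the paper.

```python
import numpy as np

def wasserstein_1d(p, q, values):
    """1-Wasserstein distance between two discrete distributions over ordered answer options."""
    cdf_p = np.cumsum(p)
    cdf_q = np.cumsum(q)
    gaps = np.diff(values)  # spacing between consecutive ordinal options
    return float(np.sum(np.abs(cdf_p - cdf_q)[:-1] * gaps))

def alignment(model_dist, human_dist, values):
    """Map the Wasserstein distance to a [0, 1] similarity (1 = identical opinion distributions)."""
    max_dist = values[-1] - values[0]  # worst case: all mass on opposite extremes
    return 1.0 - wasserstein_1d(model_dist, human_dist, values) / max_dist

# Toy example: a 4-option ordinal question ("strongly disagree" ... "strongly agree")
values = np.array([1.0, 2.0, 3.0, 4.0])
human = np.array([0.10, 0.20, 0.40, 0.30])  # survey-weighted human answer distribution
model = np.array([0.05, 0.15, 0.30, 0.50])  # LM probabilities renormalized over the options

print(f"alignment = {alignment(model, human, values):.3f}")
```

Averaging this per-question score over all questions, and over the groups of interest, yields the representativeness-style scores discussed in the findings below.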
Key Findings
The authors find notable misalignment between LMs’ opinions and U.S. demographic groups, comparable in scale to the well-documented Democrat-Republican divide on climate change. Misalignment persists even after attempts to steer LMs towards particular demographics, indicating challenges in model adjustability and adaptability. Additionally, the paper confirms prior observations about the left-leaning bias in some LMs but goes further by identifying under-represented groups, such as older and widowed individuals.
- Overall Representativeness: The paper reveals that none of the evaluated LMs perfectly align with the general U.S. populace's opinions. Models trained with human feedback are observed to shift towards more liberal, educated, and wealthy groups, diverging from the broader population’s views.
- Demographic-Specific Alignment: The analysis reveals demographic skews: base LMs align more closely with lower-income, less-educated groups, while models tuned with human feedback shift toward liberal, educated demographics. Groups such as those aged 65+ and widowed individuals are consistently under-represented.
- Steerability: While LMs improve in representing specific groups when explicitly prompted toward them (a sketch of this prompting setup appears after this list), the improvement is limited. None of the models eliminates its representational gaps, indicating that prompt-based steering alone does not fully mitigate alignment issues.
- Consistency Across Topics: Models demonstrate inconsistent alignment across different topics. For instance, while generally aligning with liberal viewpoints, some LMs reflect conservative views on certain issues like religion. This illustrates that LMs do not present uniform opinion biases across all subject areas.
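To make the steerability finding concrete, the sketch below shows one way of conditioning a model on a demographic group before asking a survey question: a persona description is prepended to the multiple-choice prompt, and the model's probabilities over the option letters are then compared with the target group's survey distribution using the same alignment metric as above. The template, persona, question, and `build_steered_prompt` helper are hypothetical illustrations, not the paper's exact prompt formats.

```python
# Hedged sketch of prompt-based demographic steering; template and example are illustrative.
STEER_TEMPLATE = (
    "Answer the survey question below as if you were the person described.\n"
    "Description: {persona}\n\n"
    "Question: {question}\n"
    "Options:\n{options}\n"
    "Answer:"
)

def build_steered_prompt(persona: str, question: str, options: list[str]) -> str:
    """Prepend a demographic persona to a multiple-choice survey question."""
    option_block = "\n".join(f"{chr(65 + i)}. {opt}" for i, opt in enumerate(options))
    return STEER_TEMPLATE.format(persona=persona, question=question, options=option_block)

prompt = build_steered_prompt(
    persona="A 67-year-old widowed woman living in a rural area",
    question="How much of a problem, if at all, is climate change for the country?",
    options=["A very big problem", "A moderately big problem", "A small problem", "Not a problem"],
)
print(prompt)
```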
Implications and Future Directions
This research highlights significant challenges in the quest to make LMs more nuanced and representative of human society's diverse viewpoints. The implications are profound; developers need to consider demographic alignment in model training to prevent skew towards particular opinion sets, especially as LMs become more integrated into decision-influencing roles. Future research should explore more dynamic tuning methods to enhance LMs' adaptability to varying demographic perspectives and address under-representation issues more effectively. Expanding the analysis to include non-U.S. datasets could provide additional insight into global model alignment.
Overall, the paper broadens the understanding of how LMs reflect and propagate human opinions, urging advancements aimed at improving their alignment to ensure they don't disproportionately endorse any demographic-specific narratives. This work is crucial as it directly impacts the role of AI in ethical and unbiased decision-making processes.