DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life (2410.02683v3)

Published 3 Oct 2024 in cs.CL, cs.AI, and cs.LG

Abstract: As users increasingly seek guidance from LLMs for decision-making in daily life, many of these decisions are not clear-cut and depend significantly on the personal values and ethical standards of people. We present DailyDilemmas, a dataset of 1,360 moral dilemmas encountered in everyday life. Each dilemma presents two possible actions, along with affected parties and relevant human values for each action. Based on these dilemmas, we gather a repository of human values covering diverse everyday topics, such as interpersonal relationships, workplace, and environmental issues. With DailyDilemmas, we evaluate LLMs on these dilemmas to determine what action they will choose and the values represented by these action choices. Then, we analyze values through the lens of five theoretical frameworks inspired by sociology, psychology, and philosophy, including the World Values Survey, Moral Foundations Theory, Maslow's Hierarchy of Needs, Aristotle's Virtues, and Plutchik's Wheel of Emotions. For instance, we find LLMs are most aligned with self-expression over survival in World Values Survey and care over loyalty in Moral Foundations Theory. Interestingly, we find substantial preference differences in models for some core values. For example, for truthfulness, Mixtral-8x7B neglects it by 9.7% while GPT-4-turbo selects it by 9.4%. We also study the recent guidance released by OpenAI (ModelSpec), and Anthropic (Constitutional AI) to understand how their designated principles reflect their models' actual value prioritization when facing nuanced moral reasoning in daily-life settings. Finally, we find that end users cannot effectively steer such prioritization using system prompts.

Summary

  • The paper introduces DailyDilemmas, a novel dataset of 1,360 daily moral dilemmas to assess LLMs' value alignment.
  • It employs five foundational value theories to analyze biases, revealing notable differences in model performance across diverse ethical dimensions.
  • The study underscores challenges in steering LLM ethics, pointing to discrepancies between stated principles and actual decision-making.

"DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life" Analysis

The paper "DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life" (2410.02683) introduces a novel dataset, DailyDilemmas, designed to evaluate LLMs against moral dilemmas encountered in everyday life. The dataset contains a rich collection of 1,360 moral dilemmas, each consisting of potential actions, affected parties, and invoked human values, revealing how LLMs prioritize values in complex real-world scenarios.

Dataset Construction and Characterization

Structure of DailyDilemmas

The dataset leverages GPT-4 to generate realistic moral dilemmas featuring complex scenarios that go beyond clear-cut ethical judgments. The dilemmas span topics from interpersonal relationships to societal issues, and each includes two possible actions annotated with the human values influencing the decision (Figure 1).

Figure 1: DailyDilemmas dataset structure and the GPT-4-based collection pipeline.
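
To make the dataset structure concrete, here is a minimal sketch of a dilemma record in Python. The field names are our assumptions based on the paper's description (two actions, each with affected parties and invoked values), not the dataset's actual schema.

```python
from dataclasses import dataclass

@dataclass
class Action:
    """One of the two possible actions in a dilemma."""
    description: str             # e.g. "Tell your friend the truth"
    affected_parties: list[str]  # e.g. ["friend", "yourself"]
    values: list[str]            # human values invoked, e.g. ["honesty"]

@dataclass
class Dilemma:
    """A single everyday moral dilemma with exactly two actions."""
    situation: str               # free-text description of the scenario
    topic: str                   # one of the 17 topic labels
    action_a: Action             # the "to do" action
    action_b: Action             # the alternative "not to do" action
```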

Topic Modeling and Diversity

The dilemmas span a diverse set of topics, which the authors visualize with UMAP to show the topic distribution across the dataset. Stratified sampling across 17 unique topics ensures balanced representation of everyday dilemmas (Figure 2).

Figure 2: UMAP projection of the topic distribution over dilemmas in DailyDilemmas.
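
A minimal sketch of this visualization step, reusing the hypothetical `Dilemma` records above. The paper does not specify implementation details, so the choice of sentence-transformers for embeddings and the UMAP parameters here are illustrative assumptions.

```python
from sentence_transformers import SentenceTransformer
import umap
import matplotlib.pyplot as plt

# `dilemmas`: a list[Dilemma] loaded from the dataset (assumed available)
texts = [d.situation for d in dilemmas]
topics = [d.topic for d in dilemmas]

# Embed dilemma texts, then project to 2D with UMAP
embeddings = SentenceTransformer("all-MiniLM-L6-v2").encode(texts)
coords = umap.UMAP(n_neighbors=15, min_dist=0.1, random_state=42).fit_transform(embeddings)

# One scatter series per topic, so the legend shows all 17 topics
for topic in sorted(set(topics)):
    pts = [c for c, t in zip(coords, topics) if t == topic]
    plt.scatter([p[0] for p in pts], [p[1] for p in pts], s=5, label=topic)
plt.legend(fontsize=6)
plt.title("Dilemma topics (UMAP projection)")
plt.show()
```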

Theoretical Framework for Value Analysis

Theories for Value Distribution

DailyDilemmas' moral evaluation draws on five foundational theories: the World Values Survey, Moral Foundations Theory, Maslow's Hierarchy of Needs, Aristotle's Virtues, and Plutchik's Wheel of Emotions. Each offers a distinct lens for assessing the biases and preferences of LLMs in value conflicts. The analysis requires mapping each fine-grained value onto the dimensions of these theories; the sketch below illustrates the idea.
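
The entries below are hypothetical examples of such a value-to-dimension mapping, intended only to show its shape; the paper's actual mapping covers the full value repository.

```python
# Dimensions for three of the five theories (the other two are analogous).
THEORY_DIMENSIONS = {
    "World Values Survey": {"survival", "self-expression",
                            "traditional", "secular-rational"},
    "Moral Foundations Theory": {"care", "fairness", "loyalty",
                                 "authority", "purity"},
    "Maslow's Hierarchy of Needs": {"physiological", "safety", "belonging",
                                    "esteem", "self-actualization"},
}

# Hypothetical example entries: fine-grained value -> (theory, dimension)
VALUE_TO_DIMENSION = {
    "honesty": ("Moral Foundations Theory", "fairness"),
    "personal freedom": ("World Values Survey", "self-expression"),
    "job security": ("Maslow's Hierarchy of Needs", "safety"),
}
```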

Bias Disclosure in GPT-4

Analysis indicates that GPT-4 prefers self-expression over survival values, care over loyalty, and truthfulness, a pattern the authors attribute to training data that predominantly reflects Western cultures. For example, GPT-4 favors fairness, overshadowing other moral dimensions such as purity (Figure 3).

Figure 3: Value distribution in DailyDilemmas under the five theories, which also reveals GPT-4's bias during generation.

Model Evaluation and Preferences

LLMs Preference Analysis

DailyDilemmas enables comprehensive testing of multiple prominent LLMs, comparing their preferences across the dimensions of the five theories. Notably, GPT-4-turbo and Llama-3-70B prioritize truthfulness and fairness more strongly, while Mixtral-8x7B uniquely shows a neutral stance towards purity values.

Quantitative Results

The evaluation reveals a uniform inclination towards self-expression values across all models, contrasted with considerable divergence on dimensions such as secular-rational values and fairness (Figure 4).

Figure 4: Normalized distribution of four representative models' value preferences across the five theories, with reduced dimensions.
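
A minimal sketch of how per-value preference scores could be computed from model choices, reusing the `Dilemma` structure above. The paper reports per-value selection and neglect percentages; the fraction-selected statistic below is our assumption, not the paper's exact metric.

```python
from collections import Counter

def value_preferences(choices: list[str], dilemmas: list) -> dict[str, float]:
    """choices[i] in {"a", "b"}: the model's picked action for dilemmas[i].

    Returns, per value, the fraction of its appearances in which the model
    chose the action invoking that value.
    """
    selected, total = Counter(), Counter()
    for choice, d in zip(choices, dilemmas):
        chosen = d.action_a if choice == "a" else d.action_b
        rejected = d.action_b if choice == "a" else d.action_a
        for v in chosen.values:      # value was invoked and selected
            selected[v] += 1
            total[v] += 1
        for v in rejected.values:    # value was invoked but neglected
            total[v] += 1
    return {v: selected[v] / total[v] for v in total}
```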

Pragmatic Implications and Steerability

Alignment with Organization's Principles

By mapping real-world dilemmas to the principles in OpenAI's ModelSpec and Anthropic's Constitutional AI, the paper illustrates discrepancies between stated principles and the models' actual decisions on moral dilemmas. Both Claude-haiku and GPT-4-turbo showed mixed adherence, with Claude-haiku notably stronger at prioritizing safety over freedom.
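
One simple way to quantify such adherence is an agreement rate between the model's chosen action and the action a stated principle would endorse. The helper below is a hypothetical sketch, not the paper's metric, and the `endorsed` annotations are assumed inputs.

```python
def principle_agreement(choices: list[str], endorsed: list) -> float:
    """Fraction of dilemmas where the model's choice matches the action a
    stated principle (e.g. a ModelSpec rule) would endorse.

    endorsed[i] in {"a", "b", None}: which action the principle favors for
    dilemma i, with None meaning the principle is not applicable there.
    """
    pairs = [(c, e) for c, e in zip(choices, endorsed) if e is not None]
    return sum(c == e for c, e in pairs) / len(pairs) if pairs else float("nan")
```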

End User Influence on Model Outputs

Attempts to modulate GPT-4 via system prompts showed limited steerability, indicating that LLM-based ethical decision-making is hard to manipulate at inference time and that end users have little influence over a model's value prioritization (Figure 5).

Figure 5: Steerability of GPT-4 by system prompt.
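
A minimal sketch of such a steerability probe using the OpenAI Python client: ask the same dilemma with and without a value-emphasizing system prompt and compare the choices. The prompt wording and the toy dilemma are our own illustrations, not the paper's exact setup.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def choose_action(dilemma_text: str, system_prompt: str | None = None) -> str:
    """Ask the model to pick action A or B, optionally with a steering prompt."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({
        "role": "user",
        "content": f"{dilemma_text}\nAnswer with exactly 'A' or 'B'.",
    })
    resp = client.chat.completions.create(model="gpt-4-turbo", messages=messages)
    return resp.choices[0].message.content.strip()

# Toy dilemma for illustration only
dilemma_text = (
    "Your friend asks for honest feedback on a business plan you think is "
    "flawed. A: Tell them the truth. B: Offer only encouragement."
)
baseline = choose_action(dilemma_text)
steered = choose_action(dilemma_text, "Above all else, prioritize truthfulness.")
# Comparing baseline vs. steered choices across the whole dataset measures
# how much the system prompt actually shifts value prioritization.
```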

Conclusion

DailyDilemmas stands as a valuable resource for understanding the intersection of human values and AI decision-making, evaluating model preferences on complex moral dilemmas grounded in diverse theoretical frameworks. The analysis yields pivotal insights into LLM biases and their alignment with corporate ethical standards, with practical implications for future AI systems that must balance guiding principles with flexible user control. Continued exploration of culturally diverse scenarios, alongside improved steerability, remains imperative for ethical AI development.
