Analyzing Rationality Bias in LLMs: A Study on Human Decision-Making
The paper "LLMs Assume People are More Rational than We Really are" explores a critical analysis of how LLMs perceive human decision-making processes, particularly through the lens of rationality. The research interrogates the implicit assumptions encoded in state-of-the-art LLMs such as GPT-4, Llama, and Claude, challenging the notion that these models can accurately simulate or predict human behavior. This investigation engages with foundational theories in psychology and cognitive science to draw conclusions about LLMs' alignment with human rationality expectations versus actual behavior.
The research outlines two primary types of modeling tasks to evaluate LLMs: forward modeling and inverse modeling. In forward modeling, the objective is to predict which of two gambles a person will choose. The paper uses a well-documented dataset, choices13k, which consists of over 13,000 decision-making problems in which human choices between probabilistic gambles were recorded. The LLMs were prompted to predict these choices under two conditions, zero-shot and chain-of-thought prompting, with the latter eliciting a structured reasoning process from the models.
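To make the two prompting conditions concrete, the sketch below builds a zero-shot and a chain-of-thought query for a choices13k-style problem. The prompt wording and the `query_llm` helper are illustrative assumptions, not the paper's exact materials.

```python
# Illustrative sketch (not the paper's exact prompts): constructing zero-shot and
# chain-of-thought queries for a choices13k-style gamble problem. `query_llm` is a
# hypothetical helper standing in for whichever API client is used.

def format_gamble(name, outcomes):
    """Render a gamble as text, e.g. 'Gamble A: 90% chance of $10, 10% chance of $0'."""
    parts = ", ".join(f"{p:.0%} chance of ${x}" for x, p in outcomes)
    return f"Gamble {name}: {parts}"

def build_prompt(gamble_a, gamble_b, chain_of_thought=False):
    """Ask the model to predict which gamble most people would choose."""
    problem = (
        format_gamble("A", gamble_a) + "\n" +
        format_gamble("B", gamble_b) + "\n" +
        "Which gamble would most people choose?"
    )
    if chain_of_thought:
        # Chain-of-thought condition: ask the model to reason step by step first.
        return problem + "\nThink step by step, then answer with 'A' or 'B'."
    # Zero-shot condition: ask for the answer directly.
    return problem + "\nAnswer with 'A' or 'B' only."

# Example problem: a risky gamble versus a sure thing.
risky = [(10, 0.9), (0, 0.1)]   # $10 with probability 0.9, else $0
safe = [(8, 1.0)]               # $8 for sure

zero_shot_prompt = build_prompt(risky, safe, chain_of_thought=False)
cot_prompt = build_prompt(risky, safe, chain_of_thought=True)
# prediction = query_llm(cot_prompt)   # hypothetical API call
```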
The results indicate a discrepancy between human decision-making and LLM predictions. Notably, when LLMs are given chain-of-thought prompts, their predictions align more closely with rational choice theory (specifically, expected value theory) than with human behavior. For example, GPT-4's predictions correlate at over 0.93 with the rational model but only 0.64 with actual human decisions. This finding suggests a bias within LLMs, likely shaped by training data that leans toward rational exposition and often omits human errors and fallacies.
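For reference, expected value theory says a rational agent simply picks the gamble with the higher probability-weighted payoff. The sketch below computes that benchmark for a toy problem and shows the kind of correlation comparison behind the reported figures; the numbers are made-up placeholders, and Pearson correlation is shown only as an illustrative choice of measure.

```python
import numpy as np
from scipy.stats import pearsonr

def expected_value(outcomes):
    """Probability-weighted payoff of a gamble given as (value, probability) pairs."""
    return sum(value * prob for value, prob in outcomes)

# Toy problem: expected value favors the risky gamble (9.0 > 8.0),
# even though many people prefer the sure thing.
risky = [(10, 0.9), (0, 0.1)]
safe = [(8, 1.0)]
print(expected_value(risky), expected_value(safe))  # 9.0 8.0

# Comparing predicted choice rates against observed human rates across problems
# is the kind of analysis behind the 0.93 (rational model) vs. 0.64 (humans)
# correlations reported in the paper. These arrays are placeholders, not real data.
llm_pred = np.array([0.95, 0.10, 0.80, 0.30])    # P(choose gamble A) predicted by the LLM
human_rate = np.array([0.60, 0.35, 0.55, 0.45])  # observed proportion choosing gamble A
r, _ = pearsonr(llm_pred, human_rate)
print(f"correlation with human choice rates: {r:.2f}")
```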
The second experimental paradigm, inverse modeling, explores how LLMs infer human preferences from observed decisions. This task aligns with cognitive models in which observers infer the intentions behind others' choices. Using a task set from Jern et al.'s work, in which observers judge how strongly a choice reveals the chooser's preferences, the paper assessed the correlation between LLM inferences and human inferences. Here the findings are more concordant: LLMs tend to make inferences that echo human expectations of rationality. Intriguingly, stronger models like GPT-4 correlate highly with both human expectations and rational theories (a Spearman correlation of up to 0.97 with human inference patterns).
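A minimal sketch of the inverse-modeling comparison: given strength-of-preference ratings from the LLM and from human observers over the same scenarios, compute their Spearman rank correlation. The ratings below are hypothetical placeholders, not data from Jern et al.

```python
from scipy.stats import spearmanr

# Hypothetical strength-of-preference ratings for the same set of observed choices,
# one rating per scenario (higher = stronger inferred preference).
llm_ratings = [7, 3, 9, 5, 2, 8]
human_ratings = [6, 4, 9, 5, 1, 7]

rho, pvalue = spearmanr(llm_ratings, human_ratings)
print(f"Spearman rho = {rho:.2f}")  # rank agreement, analogous to the reported 0.97
```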
These experiments carry important implications for AI alignment research and for using LLMs to simulate human behavior. The gap between LLM predictions in the forward task and actual human decisions highlights the need for models trained to align with human behavior, not just idealized rationality. The paper suggests that alignment strategies may need to bifurcate: one strand aimed at meeting human expectations (which assume rationality in others), and another at capturing actual human behavior, including its irrational nuances.
The insights from this research prompt a reconsideration of how well LLMs can serve in human-interaction tasks beyond language comprehension. The implicit rationality bias embedded in these models can lead to fundamental misalignments in real-world applications, from simulating human participants in social experiments to designing human-AI interaction protocols. This calls for a more nuanced approach to training AI systems, incorporating methodologies from cognitive science that account for the inherently imperfect and sometimes irrational nature of human decision-making.
The paper's discussion of broader impacts urges a reflective discourse on AI deployment in socially and ethically sensitive contexts, promoting a vision of AI systems that model not just a rational archetype but the full spectrum of human cognition and behavior. As AI continues to evolve, integrating insights from psychology and cognitive science will be crucial to shaping technology that truly understands and complements human capacities.