
Turning large language models into cognitive models (2306.03917v1)

Published 6 Jun 2023 in cs.CL, cs.AI, and cs.LG

Abstract: LLMs are powerful systems that excel at many tasks, ranging from translation to mathematical reasoning. Yet, at the same time, these models often show unhuman-like characteristics. In the present paper, we address this gap and ask whether LLMs can be turned into cognitive models. We find that -- after finetuning them on data from psychological experiments -- these models offer accurate representations of human behavior, even outperforming traditional cognitive models in two decision-making domains. In addition, we show that their representations contain the information necessary to model behavior on the level of individual subjects. Finally, we demonstrate that finetuning on multiple tasks enables LLMs to predict human behavior in a previously unseen task. Taken together, these results suggest that large, pre-trained models can be adapted to become generalist cognitive models, thereby opening up new research directions that could transform cognitive psychology and the behavioral sciences as a whole.

Overview of Research

The paper explores the potential of using LLMs as a basis for cognitive models, addressing the gap that LLMs do not consistently exhibit human-like behavior. The core hypothesis is that fine-tuning pre-trained LLMs on data from behavioral experiments lets them replicate human decision-making more accurately.

Methodology

The authors selected tasks capturing essential aspects of human decision-making and extracted embeddings from Large Language Model Meta AI (LLaMA), a collection of foundation models available to the research community. Keeping LLaMA frozen, they fit a logistic regression on these embeddings to predict human choices, focusing on two behavioral paradigms: decisions from descriptions and decisions from experience. This led to the creation of CENTaUR, a model amalgamating traits of both LLMs and cognitive modeling. The authors compared CENTaUR's performance against baseline cognitive models on tasks involving decision-making under uncertainty.
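A minimal sketch of this kind of setup, using synthetic data in place of real LLaMA embeddings and human choices (the array sizes, the random data, and the variable names are all illustrative, not taken from the paper):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Stand-in for frozen-LLM embeddings: in the paper, each choice problem is
# rendered as text and embedded once by the (frozen) language model.
rng = np.random.default_rng(0)
n_trials, emb_dim = 200, 64          # illustrative sizes only
embeddings = rng.normal(size=(n_trials, emb_dim))

# Synthetic binary "human" choices (0 = option A, 1 = option B) generated
# from a hidden linear rule plus noise, purely for demonstration.
hidden_weights = rng.normal(size=emb_dim)
choices = (embeddings @ hidden_weights + rng.normal(size=n_trials) > 0).astype(int)

# CENTaUR-style probe: a logistic regression fit on top of the frozen
# embeddings, mapping representations to choice probabilities.
probe = LogisticRegression(max_iter=1000).fit(embeddings, choices)
accuracy = probe.score(embeddings, choices)
```

Because the underlying LLM stays frozen, only the small linear readout is trained, which keeps the approach cheap to fit and easy to compare against classical cognitive models on the same likelihood-based footing.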

Simulation and Analysis of Human-Like Behavior

The authors ran extensive simulations to verify that CENTaUR reproduces human-like behavioral characteristics. Performance metrics showed that CENTaUR closely matched human regret levels and displayed nuanced exploratory behavior consistent with human decision-making strategies. Further analysis examined the model's ability to capture individual differences, revealing that incorporating random effects substantially improved fit and confirming that CENTaUR can model behavior at the level of individual participants.
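Regret, one of the metrics mentioned above, can be illustrated with a small sketch: the per-trial regret is the expected reward of the best available option minus that of the option actually chosen (the bandit values and the simulated choices below are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n_trials = 100

# Illustrative two-armed task: each row holds the expected reward of the
# two options on that trial.
expected_rewards = rng.uniform(0.0, 1.0, size=(n_trials, 2))

# Hypothetical choices (e.g., sampled from a model's predicted probabilities).
choices = rng.integers(0, 2, size=n_trials)

# Per-trial regret: best achievable expected reward minus the chosen one.
chosen_rewards = expected_rewards[np.arange(n_trials), choices]
regret = expected_rewards.max(axis=1) - chosen_rewards
mean_regret = regret.mean()
```

Comparing a model's mean regret curve to the human one is a simple way to check that the model is not just predicting choices trial-by-trial but also reproducing the overall quality of human decisions.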

Generalization to Hold-Out Tasks

To stress-test the model, the authors examined CENTaUR's generalization by applying it to a third, previously unseen task. The finetuned CENTaUR not only generalized to the new task but also showed qualitative alignment with human decision-making biases, underscoring its potential to predict human behavior across varied scenarios.

Implications and Future Directions

The results highlight the potential of LLMs, once adapted through fine-tuning on behavioral data, to become models that generalize across human cognition. The authors advocate scaling the approach to additional tasks from the psychological literature, working toward a unified, domain-general model of human cognition. More broadly, the paper opens avenues for cognitive and behavioral science research, with implications for rapid experiment design, behavioral policy, and a deeper understanding of human behavior through model explainability techniques. The success of such models hinges on the richness of LLM embeddings and their capacity to encode human cognitive processes.

Authors (2)
  1. Marcel Binz (30 papers)
  2. Eric Schulz (33 papers)
Citations (43)