Centaur: A Foundation Model of Human Cognition
The paper introduces Centaur, a computational model positioned as a pioneering step toward a unified model of human cognition. The motivation behind Centaur arises from psychology's long-standing aspiration to build a single model that captures human cognition across many domains. While task-specific cognitive models have demonstrated their utility, Centaur distinguishes itself by aiming for domain-general applicability across varied experimental scenarios.
Centaur's design leverages a state-of-the-art large language model (LLM), Llama 3.1 70B, fine-tuned on a large dataset called Psych-101. Psych-101 comprises more than 10 million trial-by-trial records from 160 psychological experiments, collected from over 60,000 participants. The dataset spans tasks from domains such as decision-making, multi-armed bandits, and memory, providing a broad foundation for training Centaur. This breadth is key to Centaur's ability to generalize across experiments.
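To make the notion of a trial-by-trial record concrete, the sketch below transcribes a hypothetical two-armed bandit session into natural language, in the spirit of a dataset of this kind; the wording, field names, and prompt structure are illustrative assumptions, not the actual Psych-101 schema.

```python
# Minimal sketch: transcribing a hypothetical two-armed bandit session into a
# natural-language, trial-by-trial record. The wording and structure here are
# illustrative assumptions, not the exact Psych-101 format.
trials = [
    {"choice": "B", "reward": 42},
    {"choice": "A", "reward": 17},
    {"choice": "B", "reward": 55},
]

lines = ["You will repeatedly choose between two slot machines, A and B."]
for t in trials:
    lines.append(f"You chose machine {t['choice']} and won {t['reward']} points.")

transcript = "\n".join(lines)
print(transcript)
```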
The fine-tuning of Llama uses Quantized Low-Rank Adaptation (QLoRA), which adds a small set of low-rank adapter parameters while keeping the quantized base weights frozen. This fine-tuning aligns the model's predictions, and to some extent its internal representations, with human behavioral and neural patterns. Tailoring the parameters to human experimental data in this way improves the model's fit beyond that of domain-specific cognitive models: the paper reports that Centaur predicts human behavior with higher pseudo-R² values than existing cognitive models in almost all examined experiments.
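As a rough illustration of this setup, the sketch below loads a 4-bit-quantized Llama base model and attaches LoRA adapters with the Hugging Face transformers and peft libraries; the model identifier, adapter rank, and target modules are illustrative assumptions, not the paper's exact configuration.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit quantization of the frozen base model (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-70B",  # assumed identifier for the base model
    quantization_config=bnb_config,
)

# Low-rank adapters on the attention projections; the base weights stay frozen.
# Rank, alpha, and target modules are illustrative, not the paper's values.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

For evaluation, a pseudo-R² in the McFadden sense, one minus the ratio of the fitted model's log-likelihood to that of a random-guessing baseline, is a common way to compare such models against domain-specific alternatives, though the paper's exact definition may differ in detail.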
Moreover, Centaur exhibits strong generalization, outperforming baseline models in out-of-distribution evaluations involving novel cover stories, modified task structures, and entirely new domains. This robustness attests to Centaur's potential as a versatile model, capable of simulating and predicting diverse human cognitive processes.
Critical to the paper's claims is the alignment of Centaur's internal representations with human neural activity. Evaluations show that the model predicts brain activity recorded via fMRI better than the non-finetuned base model, suggesting that training on large-scale behavioral datasets can also improve neural alignment.
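As a sketch of how representation-to-brain alignment is commonly quantified, the snippet below fits a cross-validated ridge regression from a model's hidden states to the fMRI response of a single region; the data shapes, the random stand-in values, and the use of scikit-learn are assumptions about the general approach, not the paper's analysis pipeline.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Hypothetical data: one row per trial.
#   X: the model's hidden representation for that trial (random stand-ins here)
#   y: the measured fMRI response of one region on that trial
n_trials, n_features = 200, 64
X = rng.standard_normal((n_trials, n_features))
y = rng.standard_normal(n_trials)

# Regularized linear readout from representations to brain activity;
# higher cross-validated R^2 indicates better representation-to-brain alignment.
model = RidgeCV(alphas=np.logspace(-3, 3, 13))
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print(f"mean cross-validated R^2: {scores.mean():.3f}")
```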
The implications of Centaur span practical applications and theoretical questions in cognitive science. Practically, it offers a tool for simulating human-like behavior across scenarios, aiding experimental prototyping and the automation of parts of cognitive science. Theoretically, it prompts reconsideration of traditional, domain-specific cognitive theories, suggesting that data-driven discovery of cognitive models may offer a richer understanding of the human mind.
Future work could expand the Psych-101 dataset to cover additional cognitive domains such as social psychology or psycholinguistics, broadening its representational scope. Additionally, training models from scratch on this dataset could further elucidate the architectural principles underlying human cognition.
In conclusion, Centaur emerges as a compelling candidate for a unified cognitive model, making significant strides toward domain-general applicability. By leveraging advances in LLMs and large-scale behavioral datasets, it offers a promising pathway toward long-standing goals in cognitive science. As research progresses, Centaur's framework could catalyze the synthesis of a unified theory of cognition, contributing to a comprehensive understanding of human cognitive processes.