auto-psych: Automating the science of mind using agent-driven theory discovery and experimentation

Published 24 Jun 2026 in cs.AI | (2606.26460v1)

Abstract: AI-based scientific automation is increasingly possible by using agents to generate hypotheses, design experiments, and analyze data. Data collection is a major bottleneck in this pipeline, however. Psychology, and computational cognitive science in particular, is well-positioned to benefit from AI experimentation because theories are often represented as code and crowdsourcing platforms enable programmatic human data collection at scale. Here, we apply automated discovery techniques to the project of generating theories in computational cognitive science, with an agent-based system collecting human data independently through crowdsourced survey experiments. As a testbed, we use a classic case study from cognitive psychology: judging which sequences of coin flips seem subjectively more random. Our system, auto-psych, uses nested agent-based discovery loops to generate explanatory theories of human behavior. The inner loop conjectures, fits, and critiques probabilistic cognitive models; the outer loop designs experiments to test these models, launches them online, and analyzes the data. This system can quickly and reliably recover ground-truth theories from synthetic data via systematic experimentation, but the nested structure is critical to model performance. Further, in three independent sequences of human experiments, the system finds theories that fit the data better than theories generated from the scientific literature. This work thus demonstrates the feasibility of automated data collection and theory discovery in computational cognitive science.

Authors (7)

Summary

The paper introduces an automated agent-driven framework that integrates theory discovery, experimental design, data collection, and model critique in computational cognitive science.
It employs nested agent loops powered by LLMs interfacing with PyMC and jsPsych to robustly recover ground-truth models and outperform classical cognitive theories.
Empirical results from human experiments show discovered models explaining up to 83% of the explainable variance, a better fit than the seed theories drawn from the literature.

Automated Theory Discovery and Experimentation in Computational Cognitive Science: A Review of "auto-psych" (2606.26460)

Overview

"auto-psych" presents an automated, agent-driven framework for theory discovery, experimentation, and model critique in computational cognitive science. The authors address the longstanding bottleneck of human data collection in psychology with agent-based loops that generate hypotheses, design experiments, deploy online studies, collect human data, and refine models with minimal human intervention. The testbed is the classic problem of subjective randomness in binary sequences: judging which sequences of coin flips seem more random. On this task, AUTO-PSYCH recovers ground-truth models from synthetic data and finds models that fit human data better than canonical cognitive theories from the literature.

System Architecture

AUTO-PSYCH orchestrates two nested agent loops: an outer loop that seeds candidate models, designs and deploys experiments, and acquires and analyzes data, and an inner loop that conjectures, fits, critiques, and refines probabilistic cognitive models. Both loops are powered by LLM-driven agents that use the probabilistic programming library PyMC to instantiate models and jsPsych to implement experiments. Experiments are launched and monitored on Prolific via API calls, so human data collection is fully programmatic.

Outer loop: Seeds the process with literature-informed models, scores candidate stimuli by expected information gain (EIG) over competing models, deploys experiments, and updates the model registry.
Inner loop ("Box's loop"): Fits models with MCMC, generates statistically targeted posterior-predictive critiques, and prompts theorist agents for single-mechanism model refinements.
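The EIG scoring step in the outer loop can be illustrated with a minimal sketch. It assumes each competing model supplies a point prediction for the probability of a binary response to each stimulus, and that there is a current probability over models; neither assumption is confirmed by the summary, and the function and variable names are hypothetical. Under those assumptions, the EIG of a stimulus is the mutual information between the model identity and the response.

```python
import numpy as np

def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) response, clipped for numerical safety."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def expected_information_gain(pred, prior):
    """EIG about which model is true from one binary response per stimulus.

    pred  : (n_models, n_stimuli), pred[m, s] = P(response = 1 | model m, stimulus s)
    prior : (n_models,), current probability of each model (assumed to sum to 1)
    """
    pred = np.asarray(pred, dtype=float)
    prior = np.asarray(prior, dtype=float)
    marginal = prior @ pred  # model-averaged response probability per stimulus
    # Mutual information: entropy of the mixture minus the expected per-model entropy.
    return binary_entropy(marginal) - prior @ binary_entropy(pred)

# Hypothetical predictions of two models for three candidate stimuli.
pred = np.array([[0.9, 0.5, 0.2],
                 [0.1, 0.5, 0.3]])
prior = np.array([0.5, 0.5])

eig = expected_information_gain(pred, prior)
best_stimulus = int(np.argmax(eig))  # stimulus on which the models disagree most
```

Stimuli on which the models agree (the second column here) carry no information about which model is correct, so a design step of this kind favors stimuli where predictions diverge.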

This architecture enables rapid, iterative theory testing and experimental adjudication, forming a closed loop for scientific inquiry with minimal human intervention.
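The control flow of the two nested loops can be sketched as follows. This is only a schematic under stated assumptions: the callables `design`, `collect`, `fit`, `critique`, and `refine` are hypothetical stand-ins for the agent steps (stimulus selection, online data collection, MCMC fitting, posterior-predictive critique, and theorist refinement), and the toy stubs at the bottom exist only to make the sketch runnable.

```python
def nested_discovery(seed_models, n_rounds, n_inner_steps,
                     design, collect, fit, critique, refine):
    """Schematic nested loop; every callable is a hypothetical placeholder."""
    models = list(seed_models)  # model registry, seeded from the literature
    data = []
    for _ in range(n_rounds):                # outer loop: one experiment per round
        stimuli = design(models)             # e.g. choose informative stimuli
        data.extend(collect(stimuli))        # e.g. run a crowdsourced study
        for _ in range(n_inner_steps):       # inner loop: fit, critique, refine
            losses = [fit(m, data) for m in models]  # lower loss = better fit
            best = min(range(len(models)), key=lambda i: losses[i])
            feedback = critique(models[best], data)
            models.append(refine(models[best], feedback))
    return models, data

# Toy stubs: a "model" is a single number guessing the mean of the data.
def mean(xs):
    return sum(xs) / len(xs)

models, data = nested_discovery(
    seed_models=[0.0, 4.0],
    n_rounds=2,
    n_inner_steps=2,
    design=lambda models: [None] * 3,               # three placeholder stimuli
    collect=lambda stimuli: [1.0 for _ in stimuli], # every response equals 1.0
    fit=lambda m, d: abs(m - mean(d)),
    critique=lambda m, d: mean(d) - m,              # signed misfit
    refine=lambda m, feedback: m + 0.5 * feedback,  # move halfway to the data
)
```

In the toy run each refinement halves the remaining misfit, so the registry grows by one model per inner step and the newest entry moves toward the data mean.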

Empirical Evaluation and Numerical Results

Ground-Truth Model Recovery

AUTO-PSYCH was evaluated on synthetic data generated from two kinds of ground truth: held-out seed models and psychologically implausible ("alien") models. Across five replicates per condition, the system reliably converged on candidate models whose RMSE was markedly lower than that of any seed model, indicating that recovery is consistent across replicates. Notably, recovery of alien models was even more robust, likely because of their structural simplicity.
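A comparison of this kind can be sketched as ranking models by RMSE against the ground-truth predictions. The summary does not say which quantities the RMSE is computed over; the sketch assumes per-stimulus response probabilities, and all numbers and model names below are made up for illustration.

```python
import numpy as np

def rmse(predicted, target):
    """Root-mean-square error between two equal-length prediction vectors."""
    predicted = np.asarray(predicted, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.sqrt(np.mean((predicted - target) ** 2)))

# Hypothetical ground-truth response probabilities for four stimuli.
ground_truth = np.array([0.2, 0.5, 0.8, 0.6])

# Hypothetical predictions from a seed model and a discovered model.
candidates = {
    "seed_model": np.array([0.4, 0.4, 0.6, 0.7]),
    "discovered_model": np.array([0.25, 0.5, 0.75, 0.6]),
}

scores = {name: rmse(pred, ground_truth) for name, pred in candidates.items()}
best = min(scores, key=scores.get)  # model with the lowest RMSE
```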

Ablation studies showed substantially worse performance when the inner loop was removed, indicating that agent-based critique and guided model proposal are critical to the system's performance.

Human Experimentation

Three independent experimental runs, each starting from the same seed models, collected novel behavioral data from 120 participants (three rounds, 40 per round). Unique stimulus pools were generated (>90% uniqueness), and the system discovered models that fit better than all seed theories. Across all experiments, the best-fitting discovered models explained up to 83% of the explainable variance ($R^2 = 0.664$ for "Minkowski typicality" against a noise ceiling of $R^2 = 0.80$, and $0.664 / 0.80 = 0.83$). Generalization across runs was checked by fitting models on held-out datasets; performance was stable, which reduces concerns about overfitting.
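The 83% figure is consistent with dividing the model's $R^2$ by the noise ceiling. A minimal sketch of that normalization, assuming this is how the fraction of explainable variance was computed (the function name is hypothetical):

```python
def fraction_of_explainable_variance(r2_model, r2_ceiling):
    """Model R^2 as a fraction of the noise-ceiling R^2."""
    if not 0 < r2_ceiling <= 1:
        raise ValueError("noise ceiling must lie in (0, 1]")
    return r2_model / r2_ceiling

# Values reported in the text: R^2 = 0.664, noise ceiling R^2 = 0.80.
fraction = fraction_of_explainable_variance(0.664, 0.80)
print(f"{fraction:.2f}")  # prints 0.83
```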

Discovered models included:

"Minkowski typicality": prototype-based, penalizing deviations in heads proportion and alternation rate, with a penalty exponent that interpolates between the Manhattan ($p = 1$) and Euclidean ($p = 2$) metrics.
"Evidence accumulation": each outcome/run accumulates evidence discounted by deviation from prototype statistics.
"Bayesian diagnosticity + balance": incorporates fairness-based likelihood with additional penalty for balanced sequences.

These models exhibited conceptual divergence from classical literature while yielding similar behavioral predictions, exposing the non-identifiability of underlying cognitive mechanisms from aggregate response data.
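To make the "Minkowski typicality" idea concrete, here is a minimal sketch. It assumes the model scores a sequence by the negative Minkowski distance between its features (heads proportion, alternation rate) and a prototype; the prototype values, the default exponent, and the function names are illustrative assumptions, not the fitted model from the paper.

```python
def sequence_features(seq):
    """Return (proportion of heads, alternation rate) for a string of 'H'/'T'.

    Assumes the sequence has at least two flips.
    """
    n = len(seq)
    heads = seq.count("H") / n
    alternations = sum(a != b for a, b in zip(seq, seq[1:])) / (n - 1)
    return heads, alternations

def minkowski_typicality(seq, prototype=(0.5, 0.5), p=1.5):
    """Negative Minkowski distance to a prototype; higher means 'more random'.

    prototype and p are hypothetical defaults: p = 1 gives the Manhattan
    metric and p = 2 the Euclidean metric.
    """
    h, a = sequence_features(seq)
    dist = (abs(h - prototype[0]) ** p + abs(a - prototype[1]) ** p) ** (1 / p)
    return -dist

all_heads = minkowski_typicality("HHHHHHHH")
mixed = minkowski_typicality("HHTHTTHT")
```

Under these assumptions the mixed sequence scores higher than the all-heads sequence, because both of its features lie closer to the prototype.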

Theoretical and Practical Implications

AUTO-PSYCH is presented as a first demonstration of end-to-end agent-driven scientific automation in psychology, moving beyond previous efforts limited to model selection on existing datasets. The system's ability to outperform established cognitive theories through chains of autonomous experiments and model refinements reflects both the power and the tractability of agent-based "lab-in-the-loop" workflows when combined with scalable programmatic data collection.

From a theoretical perspective, agent-discovered models challenge traditionally held assumptions about randomness perception and prototype-based judgments, enriching the landscape of cognitive modeling. However, most additions represent conservative refinements rather than paradigm shifts.

Practically, AUTO-PSYCH paves the way for scalable exploration of large model and experiment spaces in psychology and other behavioral sciences. Such systems could pre-screen cognitive hypotheses efficiently before resource-intensive neural or physiological studies, speeding up the research cycle.

Limitations and Ethical Considerations

The proof of concept is currently limited by the bounded stimulus space and schematic nature of the subjective randomness domain. The methodology is general in principle, but applying it to more complex paradigms will require handling much richer theory and design spaces. Additionally, agent-generated models may fit the data well numerically yet fail to satisfy the deeper explanatory desiderata prized by human psychologists.

Automation raises risks of theory homogenization, peer-review overload, and scientific deskilling. To mitigate these, the authors advocate for judicious integration of agent systems with traditional scientific workflows, framing them as tools for augmenting, not replacing, human discovery and interpretation.

Future Directions

Potential developments include extending AUTO-PSYCH to domains with richer stimuli, developing agents capable of integrative multi-phenomena modeling, and exploring meta-scientific properties such as model explainability and interpretability. Integrating automated discovery systems into larger infrastructures could substantially accelerate hypothesis testing cycles across cognitive science and allied fields.

Conclusion

AUTO-PSYCH demonstrates robust agent-driven theory discovery, experimentation, and data-driven model refinement in computational cognitive science, establishing a new benchmark for scalable, automated scientific inquiry. Its empirical results and methodological rigor highlight both the feasibility and necessity of integrating AI agents for accelerating theory development in domains where experimentation is programmatically accessible. The framework invites broader application and critical scrutiny as the field advances toward increasing automation in scientific discovery.