- The paper introduces an automated agent-driven framework that integrates theory discovery, experimental design, data collection, and model critique in computational cognitive science.
- It employs nested agent loops powered by LLMs interfacing with PyMC and jsPsych to robustly recover ground-truth models and outperform classical cognitive theories.
- Empirical results from human experiments show discovered models explaining up to 83% of the explainable variance and fitting better than every seed theory.
Automated Theory Discovery and Experimentation in Computational Cognitive Science: A Review of "auto-psych" (2606.26460)
Overview
"auto-psych" presents a fully automated agent-driven framework for theory discovery, experimentation, and model critique in computational cognitive science. The authors address the longstanding bottleneck of human data collection in psychology by integrating agent-based loops that generate hypotheses, design experiments, deploy online studies, collect data from humans, and refine models without human intervention. The testbed is the classic problem of subjective randomness perception in binary sequences, where AUTO-PSYCH recovers ground-truth models and outperforms canonical cognitive theories from the literature.
System Architecture
AUTO-PSYCH orchestrates two nested agent loops: an outer loop responsible for model proposal, experiment design, deployment, and data acquisition, and an inner loop centered on model critique and refinement. Both loops are powered by LLM-driven agents interfacing with probabilistic programming libraries (PyMC) for model instantiation and jsPsych for experimental implementation. Experiments are automatically launched and monitored on Prolific via API calls, enabling fully programmatic human data collection.
- Outer loop: Seeds the process with literature-informed models, scores candidate stimuli by expected information gain (EIG) over competing models, deploys experiments, and updates the model registry.
- Inner loop ("Box's loop"): Fits models with MCMC, generates statistically targeted posterior-predictive critiques, and prompts theorist agents for single-mechanism model refinements.
This architecture enables rapid, iterative theory testing and experimental adjudication, forming an operational closed loop for scientific inquiry with minimal human intervention.
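To make the outer loop's stimulus scoring concrete, here is a minimal sketch of expected information gain over competing models. It is not the authors' implementation. It assumes each model predicts a probability that a stimulus is judged "random" (a binary response, which is an assumption of this sketch), and it computes EIG as the mutual information between the model identity and the response. The function names, the prediction matrix, and the model weights are all hypothetical.

```python
import numpy as np

def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) response, elementwise."""
    p = np.clip(p, 1e-12, 1 - 1e-12)  # avoid log(0)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def expected_information_gain(pred, weights):
    """EIG per stimulus about which model is correct.

    pred:    (n_models, n_stimuli) predicted P(response = 1) per model
    weights: (n_models,) current belief over models (normalized here)
    """
    pred = np.asarray(pred, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    marginal = weights @ pred  # model-averaged prediction
    # Mutual information: H(marginal) - E_model[H(response | model)]
    return binary_entropy(marginal) - weights @ binary_entropy(pred)

# Hypothetical predictions from two models for three stimuli.
pred = np.array([[0.9, 0.5, 0.20],
                 [0.1, 0.5, 0.25]])
eig = expected_information_gain(pred, [0.5, 0.5])
best = int(np.argmax(eig))  # index of the most diagnostic stimulus
```

Stimuli on which the models disagree sharply score highest, while stimuli on which all models make the same prediction score zero, which is why such a criterion tends to select diagnostic sequences.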
Empirical Evaluation and Numerical Results
Ground-Truth Model Recovery
AUTO-PSYCH was evaluated against both held-out seed models and psychologically implausible ("alien") models under synthetic data regimes. Across five replicates per condition, the system reliably converged on candidate models with RMSE markedly lower than any seed model alternative, demonstrating estimator-like consistency. Notably, recovery for alien models was even more robust, likely due to their structural simplicity.
Ablation studies revealed significant performance degradation when the inner loop was removed, establishing the critical role of agent-based critique and guided model proposal.
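As a rough illustration of the kind of critique the inner loop produces, the sketch below runs a per-stimulus posterior predictive check. It assumes binomial response counts per stimulus. It also substitutes synthetic Beta draws for the posterior samples that MCMC (for example, via PyMC) would supply in the real system. The function name, the data, and the 0.05 threshold are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def posterior_predictive_pvalues(theta_draws, y_obs, n_trials, rng):
    """Two-sided posterior predictive p-value per stimulus.

    theta_draws: (n_draws, n_stimuli) posterior draws of P(judged random)
    y_obs:       (n_stimuli,) observed counts of "random" judgments
    n_trials:    number of judgments collected per stimulus
    """
    # Simulate one replicated dataset per posterior draw.
    y_rep = rng.binomial(n_trials, theta_draws)
    upper = (y_rep >= y_obs).mean(axis=0)
    lower = (y_rep <= y_obs).mean(axis=0)
    return np.minimum(1.0, 2 * np.minimum(upper, lower))

rng = np.random.default_rng(0)
# Stand-in posterior: the model believes both stimuli sit near 0.5.
theta_draws = rng.beta(20, 20, size=(2000, 2))
y_obs = np.array([20, 38])  # hypothetical counts out of 40 judgments
pvals = posterior_predictive_pvalues(theta_draws, y_obs, 40, rng)
flagged = np.flatnonzero(pvals < 0.05)  # stimuli the model misfits
```

In a loop of this kind, the flagged stimuli and the direction of the misfit would be the material handed to a theorist agent as a targeted critique.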
Human Experimentation
Three independent experimental runs, each starting from the same seed models, collected novel behavioral data from 120 participants (three rounds, 40 per round). Unique stimulus pools were generated (>90% uniqueness), and the system discovered models that fit better than all seed theories. Across all experiments, the best-fitting discovered models explained up to 83% of the explainable variance (R² = 0.664 for "Minkowski typicality" against a noise ceiling of R² = 0.80). Generalization across runs was checked by fitting on held-out datasets; performance remained stable, which reduces overfitting concerns.
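The 83% figure is consistent with dividing the model's R² by the noise-ceiling R². The short check below assumes that this ratio is the normalization used, which the reported numbers suggest but this review does not state outright. The function name is hypothetical.

```python
def fraction_of_explainable_variance(r2_model, r2_ceiling):
    """Share of the noise-ceiling variance that a model captures."""
    if not 0 < r2_ceiling <= 1:
        raise ValueError("noise ceiling must lie in (0, 1]")
    return r2_model / r2_ceiling

# Values reported for the "Minkowski typicality" model.
frac = fraction_of_explainable_variance(0.664, 0.80)
print(f"{frac:.0%} of explainable variance")
```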
Discovered models included:
- "Minkowski typicality": prototype-based, penalizing deviations in heads proportion and alternation rate, with a penalty exponent that interpolates between the Manhattan and Euclidean metrics.
- "Evidence accumulation": each outcome/run accumulates evidence discounted by deviation from prototype statistics.
- "Bayesian diagnosticity + balance": incorporates fairness-based likelihood with additional penalty for balanced sequences.
These models exhibited conceptual divergence from classical literature while yielding similar behavioral predictions, exposing the non-identifiability of underlying cognitive mechanisms from aggregate response data.
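To give a sense of what a model such as "Minkowski typicality" might look like, the sketch below scores a binary sequence by its Minkowski distance from a prototype in (heads proportion, alternation rate) space. The prototype values, the exponential link, the scale parameter, and the function names are all illustrative assumptions. The paper's actual parameterization is not reproduced here.

```python
import numpy as np

def sequence_features(seq):
    """Heads proportion and alternation rate of a 0/1 sequence."""
    seq = np.asarray(seq, dtype=int)
    heads = seq.mean()
    alternation = np.mean(seq[1:] != seq[:-1])
    return heads, alternation

def minkowski_typicality(seq, proto=(0.5, 0.6), p=1.5, scale=4.0):
    """Typicality in (0, 1]; higher means closer to the prototype.

    p = 1 gives the Manhattan metric and p = 2 the Euclidean metric.
    proto and scale are hypothetical values chosen for illustration.
    """
    heads, alternation = sequence_features(seq)
    dist = (abs(heads - proto[0]) ** p
            + abs(alternation - proto[1]) ** p) ** (1.0 / p)
    return float(np.exp(-scale * dist))

all_heads = [1] * 8
mixed = [1, 0, 0, 1, 0, 1, 1, 0]
t_all_heads = minkowski_typicality(all_heads)
t_mixed = minkowski_typicality(mixed)
```

Under these assumed settings the mixed sequence scores as far more typical of a random process than the all-heads sequence. Varying `p` changes how deviations on the two features trade off against each other.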
Theoretical and Practical Implications
AUTO-PSYCH represents the first demonstration of end-to-end agent-driven scientific automation in psychology, moving beyond previous efforts constrained to model selection on existing datasets. The system's ability to outperform established cognitive theories through chains of autonomous experiments and model refinements reflects both the power and tractability of agent-based "lab-in-the-loop" workflows when combined with scalable programmatic data collection.
From a theoretical perspective, agent-discovered models challenge traditionally held assumptions about randomness perception and prototype-based judgments, enriching the landscape of cognitive modeling. However, most additions represent conservative refinements rather than paradigm shifts.
Practically, AUTO-PSYCH paves the way for scalable exploration of vast model and experiment spaces in psychology and other behavioral sciences. Such systems could enable efficient pre-screening of cognitive hypotheses before resource-intensive neural or physiological studies, increasing the pace of research and letting researchers work at a higher level of abstraction.
Limitations and Ethical Considerations
AUTO-PSYCH's proof of concept is currently restricted by the bounded stimulus space and schematic nature of the subjective randomness domain. The methodology is generalizable in principle, but adoption in more complex paradigms will require handling much larger theory and design spaces. Additionally, the agent-driven models may exhibit strong numerical fits yet fail to satisfy deeper explanatory desiderata prized by human psychologists.
Automation raises risks of theory homogenization, peer-review overload, and scientific deskilling. To mitigate these, the authors advocate for judicious integration of agent systems with traditional scientific workflows, framing them as tools for augmenting, not replacing, human discovery and interpretation.
Future Directions
Potential developments include extending AUTO-PSYCH to domains with richer stimuli, developing agents capable of integrative multi-phenomena modeling, and exploring meta-scientific properties such as model explainability and interpretability. Integrating automated discovery systems into larger infrastructures could substantially accelerate hypothesis testing cycles across cognitive science and allied fields.
Conclusion
AUTO-PSYCH demonstrates robust agent-driven theory discovery, experimentation, and data-driven model refinement in computational cognitive science, establishing a new benchmark for scalable, automated scientific inquiry. Its empirical results and methodological rigor highlight both the feasibility and necessity of integrating AI agents for accelerating theory development in domains where experimentation is programmatically accessible. The framework invites broader application and critical scrutiny as the field advances toward increasing automation in scientific discovery.