
The brain-AI convergence: Predictive and generative world models for general-purpose computation (2512.02419v1)

Published 2 Dec 2025 in q-bio.NC, cs.AI, cs.CL, and cs.NE

Abstract: Recent advances in general-purpose AI systems with attention-based transformers offer a potential window into how the neocortex and cerebellum, despite their relatively uniform circuit architectures, give rise to diverse functions and, ultimately, to human intelligence. This Perspective provides a cross-domain comparison between the brain and AI that goes beyond the traditional focus on visual processing, adopting the emerging perspective of world-model-based computation. Here, we identify shared computational mechanisms in the attention-based neocortex and the non-attentional cerebellum: both predict future world events from past inputs and construct internal world models through prediction-error learning. These predictive world models are repurposed for seemingly distinct functions -- understanding in sensory processing and generation in motor processing -- enabling the brain to achieve multi-domain capabilities and human-like adaptive intelligence. Notably, attention-based AI has independently converged on a similar learning paradigm and world-model-based computation. We conclude that these shared mechanisms in both biological and artificial systems constitute a core computational foundation for realizing diverse functions including high-level intelligence, despite their relatively uniform circuit structures. Our theoretical insights bridge neuroscience and AI, advancing our understanding of the computational essence of intelligence.

Summary

  • The paper posits that world-model-based computation unifies biological and artificial intelligence through prediction-error learning.
  • The paper rigorously maps neocortex and cerebellum architectures to transformer and RNN systems using circuit, I/O, and learning analyses.
  • The paper highlights that scaling, hierarchical attention, and mixture-of-experts mechanisms are essential for emergent in-context learning and flexible AI.

World-Model-Based Computation: Brain-AI Convergence in Predictive and Generative Processing

Conceptual Framework and Rationale

This paper advances a structured theoretical framework for understanding the computational convergence between biological intelligence (neocortex and cerebellum) and general-purpose artificial intelligence built on transformer architectures. The authors posit that, moving beyond previously narrow comparisons centered on visual processing, both biological and artificial circuits rely on world models to underpin prediction, comprehension, and generation across diverse functional domains. World models are internal simulators constructed from experience through prediction-error learning; critically, in both systems they abstract and generalize representations, enabling flexible adaptation and general-purpose computation.

The tripartite analysis—circuit architecture, input/output transformation, and learning algorithms—enables a rigorous mapping between the neocortex’s hierarchical attention-based recurrent architectures and the cerebellum’s large-scale three-layer RNN structure, juxtaposed against transformer and RNN-based AI systems. A foundational theoretical claim is that predictive coding, previously formulated for the neocortex, and internal model theory for the cerebellum both instantiate unsupervised prediction-error learning paradigms, contradicting prior dichotomizations (e.g., neocortex as unsupervised, cerebellum as supervised).

Sensory and Cognitive Processing: Predictive Coding and World Models

Neocortical World Models

The neocortex harnesses hierarchical predictive coding to build representations via unsupervised prediction-error minimization, compressing sensory input into progressively abstract features. Intermediate layers perform information compression, analogous to autoencoders and CNNs, while recurrent circuits enable temporal integration, capturing dynamic sensory sequences beyond static image processing. Experimental and modeling evidence supports deep RNN architectures for such predictive processing.
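The next-step-prediction idea can be made concrete with a minimal sketch, assuming nothing about the authors' actual models: a small recurrent network is trained to predict its next sensory frame, so the prediction error itself is the only learning signal. The architecture, dimensions, and synthetic data below are illustrative choices.

```python
# Minimal sketch (not the paper's model): a recurrent network trained to
# predict its next sensory input, so the prediction error is the only
# learning signal -- the core idea behind predictive-coding accounts.
import torch
import torch.nn as nn

class NextStepPredictor(nn.Module):
    def __init__(self, input_dim=32, hidden_dim=128):
        super().__init__()
        self.rnn = nn.GRU(input_dim, hidden_dim, batch_first=True)
        self.readout = nn.Linear(hidden_dim, input_dim)  # predicts the next frame

    def forward(self, x):
        h, _ = self.rnn(x)       # temporal integration over the sequence
        return self.readout(h)   # prediction of x[t+1] from x[<=t]

model = NextStepPredictor()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.randn(8, 100, 32)                    # synthetic sensory sequences
pred = model(x[:, :-1])                        # predict each next frame
loss = nn.functional.mse_loss(pred, x[:, 1:])  # unsupervised prediction error
loss.backward()
optimizer.step()
```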

Transformer-Based Language AI

Large-scale transformer models, notably BERT and GPT, show substantive architectural and functional alignment with neocortical processing. Transformer blocks performing multi-head attention and masked-word prediction mimic attentional and predictive mechanisms found at multiple levels of the neocortical hierarchy. GPT, trained on next-word prediction, achieves sentence generation by repurposing its predictive system with minimal supervised fine-tuning. Empirically, GPT reaches top-tier accuracy on language tasks, and, among the models compared, transformer signals during language comprehension most closely resemble human neocortical activity [116].
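For comparison with the recurrent sketch above, here is a minimal GPT-style illustration of the two ingredients just named: causal multi-head self-attention and a next-token cross-entropy objective. Vocabulary size, dimensions, and data are illustrative placeholders, not details from BERT or GPT.

```python
# Illustrative sketch of GPT-style training: causal multi-head self-attention
# plus a next-token prediction loss. All sizes are arbitrary toy values.
import torch
import torch.nn as nn

vocab, d_model, seq_len = 1000, 64, 16
embed = nn.Embedding(vocab, d_model)
attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
lm_head = nn.Linear(d_model, vocab)

tokens = torch.randint(0, vocab, (2, seq_len))   # toy token sequences
x = embed(tokens)
causal = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
h, _ = attn(x, x, x, attn_mask=causal)           # each position attends only to its past
logits = lm_head(h)

# Next-word prediction: logits at position t are scored against token t+1.
loss = nn.functional.cross_entropy(
    logits[:, :-1].reshape(-1, vocab), tokens[:, 1:].reshape(-1)
)
```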

Cerebellar World Models

The cerebellum’s contribution extends beyond motor planning to include sensory and language processing, employing prediction-error learning (historically labeled supervised learning, but, on the paper’s account, fundamentally the same unsupervised prediction-error paradigm). A three-layer RNN model trained for next-word prediction spontaneously reproduces both word prediction and syntactic parsing, indicating that high-level cognitive capability can emerge within simple architectures.
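The three-layer next-word-prediction setup can be sketched roughly as follows; the layer sizes, the choice of a vanilla RNN cell, and the toy data are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch of a cerebellum-like three-layer recurrent next-word
# predictor (input layer -> large recurrent expansion -> output layer).
# Sizes and details are illustrative assumptions.
import torch
import torch.nn as nn

class ThreeLayerRNNLM(nn.Module):
    def __init__(self, vocab=1000, embed_dim=64, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab, embed_dim)                   # input layer
        self.rnn = nn.RNN(embed_dim, hidden_dim, batch_first=True)    # expansion layer
        self.out = nn.Linear(hidden_dim, vocab)                       # output layer

    def forward(self, tokens):
        h, _ = self.rnn(self.embed(tokens))
        return self.out(h)                                            # next-word logits

model = ThreeLayerRNNLM()
tokens = torch.randint(0, 1000, (4, 20))
logits = model(tokens[:, :-1])
loss = nn.functional.cross_entropy(
    logits.reshape(-1, 1000), tokens[:, 1:].reshape(-1)
)
```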

Motor Processing, Language Generation, and Unified World-Model Utilization

Integration of Sensory and Motor Computation

Empirical data show overlapping neural substrates for comprehension and production in the neocortex (Broca’s and Wernicke’s areas) and cerebellum (right lateral regions), paralleling the unified predictive and generative processing of transformer architectures, in which a single model such as GPT serves both comprehension and generation. The “mirror neuron system,” traditionally framed in sensorimotor terms, is reconceptualized here as a world-model system: trained via prediction-error learning, it is repurposed both for predicting and understanding others’ actions and for planning and generating one’s own.

Practical AI: Language and Action Planning

Transformer-based and generative AI systems (UniSim, PaLM-SayCan, Stable Diffusion) demonstrate direct repurposing of world models for robotic action planning, strategy generation, and generalized adaptive task execution, paralleling mirror-neuron functionality. The practical implication is that generalist agents grounded in prediction-error-trained world models can both interpret and generate complex, goal-directed behaviors.
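As a toy illustration of repurposing a predictive world model for action generation, the sketch below plans by random-shooting search over candidate action sequences rolled through a stand-in learned dynamics model. The model, reward function, and dimensions are hypothetical and do not reflect UniSim, PaLM-SayCan, or Stable Diffusion.

```python
# Toy sketch: a world model trained to predict the next state is reused to
# evaluate imagined action sequences, and the best first action is executed.
import torch
import torch.nn as nn

state_dim, action_dim, horizon, n_candidates = 8, 2, 10, 256

# Stand-in for a world model learned by prediction-error training.
world_model = nn.Sequential(nn.Linear(state_dim + action_dim, 64),
                            nn.Tanh(), nn.Linear(64, state_dim))

def reward(state):                 # hypothetical task reward
    return -state.pow(2).sum(-1)   # e.g., drive the state toward zero

def plan(state):
    actions = torch.randn(n_candidates, horizon, action_dim)  # candidate plans
    s = state.expand(n_candidates, state_dim)
    total = torch.zeros(n_candidates)
    for t in range(horizon):
        s = world_model(torch.cat([s, actions[:, t]], dim=-1))  # imagined rollout
        total += reward(s)
    return actions[total.argmax(), 0]   # first action of the best imagined plan

first_action = plan(torch.randn(1, state_dim))
```

The point of the sketch is only that the same predictive model serves both "understanding" (rolling the world forward) and "generation" (selecting actions), mirroring the dual use described above.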

In-Context Learning and Mixture-of-Experts: Human-Like Adaptability

GPT-3 and its successors manifest “in-context learning,” in which contextual information derived from the prompt dynamically modulates the input-output transformation without any further synaptic (weight) adjustment, paralleling human cognitive flexibility. Empirical results demonstrate that scaling (model and dataset size), attentional computation, and modular specialization are critical for emergent in-context learning. Mixture-of-Experts (MoE) architectures further increase efficiency and modularity, functionally analogous to flexible expert recruitment in lateral prefrontal cortex.
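A minimal top-1 Mixture-of-Experts layer illustrates the modular expert selection described above; the gating rule, expert count, and layer sizes are illustrative choices rather than any specific production design.

```python
# Minimal top-1 mixture-of-experts layer: a learned gate routes each token
# to one expert MLP, so only a small, input-dependent subset of the network
# is active for any given input. Sizes are illustrative.
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_experts=4):
        super().__init__()
        self.gate = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                    # x: (tokens, d_model)
        scores = self.gate(x).softmax(-1)
        expert_idx = scores.argmax(-1)       # top-1 routing decision per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = expert_idx == i
            if mask.any():
                out[mask] = expert(x[mask]) * scores[mask, i:i+1]
        return out

tokens = torch.randn(32, 64)
routed = TinyMoE()(tokens)
```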

Contradictory and Strong Claims

The paper asserts, with comprehensive cross-domain evidence, that the neocortex and cerebellum operate according to fundamentally uniform computational principles—unsupervised prediction-error learning and world-model construction—even as their circuit architectures diverge. This directly contradicts longstanding claims regarding distinct neural learning paradigms and functional specializations. Moreover, the assertion that hierarchical attention is the core principle for neocortical computation de-emphasizes “ad hoc” modularity, suggesting a unification that reframes classical neuroscientific distinctions.

Implications for Intelligence and Future AI

Theoretical Advances

The theory positions world-model-based computation as a necessary and sufficient basis for general-purpose intelligence, challenging modular and task-specific orthodoxy. Intelligence, in both brain and AI, is argued to arise from scalable, attention-driven architectures rather than from intricate, purpose-built circuit design. This insight connects AI's empirical scaling laws (Kaplan et al.), the “lottery ticket hypothesis,” and synaptic pruning dynamics in biological systems: large-scale circuits provide the substrate for rapid adaptation and emergent flexibility, with only a small fraction of connections needed for high-level function after learning.
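A back-of-the-envelope sketch of the two quantitative ideas invoked here: a Kaplan-style power law relating loss to parameter count, and magnitude pruning that retains only a small fraction of weights after training. The constants are only roughly in the ballpark of published fits and are used purely for illustration.

```python
# Illustrative only: a power-law loss-vs-size curve and a simple magnitude
# pruning step. Constants are approximate, not fitted values.
import numpy as np

def scaling_law_loss(n_params, n_c=8.8e13, alpha=0.076):
    """L(N) = (N_c / N)**alpha: loss falls as a power law in model size."""
    return (n_c / n_params) ** alpha

for n in [1e6, 1e8, 1e10]:
    print(f"N = {n:.0e}  ->  L(N) ~ {scaling_law_loss(n):.3f}")

# Magnitude pruning: after training, keep only the largest 10% of weights,
# loosely analogous to synaptic pruning leaving a small functional sub-circuit.
weights = np.random.randn(1000, 1000)
threshold = np.quantile(np.abs(weights), 0.90)
mask = np.abs(weights) >= threshold
print(f"fraction of weights kept: {mask.mean():.2f}")
```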

Practical Directions

This world-model-centric paradigm has actionable consequences for AI: incorporating brain-like attention, prediction-error learning, scaling, and modular expert-selection mechanisms is expected to yield more efficient, flexible, and general-purpose systems. Simulating probabilistic impairment of learning in AI circuits may elucidate mechanisms underlying neurodevelopmental disorders, enabling translational advances in computational neurology.
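One simple way to operationalize "probabilistic impairment of learning" in an AI training loop, offered as our illustrative reading rather than a procedure specified in the paper, is to randomly zero a fraction of each gradient update before it is applied.

```python
# Illustrative reading of "probabilistic impairment of learning": randomly
# drop a fraction of each gradient update before the optimizer step, so some
# "synapses" intermittently fail to learn. Not a procedure from the paper.
import torch
import torch.nn as nn

model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
impairment_rate = 0.3                      # probability a given weight's update is lost

x, y = torch.randn(64, 10), torch.randn(64, 1)
for step in range(100):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    with torch.no_grad():
        for p in model.parameters():
            drop = torch.rand_like(p.grad) < impairment_rate
            p.grad[drop] = 0.0             # impaired connections skip this update
    optimizer.step()
```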

Future Research

Bridging neuroscience and AI through reciprocal refinement of computational principles is anticipated to extend beyond neocortex and cerebellum, with reinforcement learning, subcortical processing, and complex cognitive capacities standing as next frontiers. Novel architectures inspired by biological convergence (hierarchical attention, MoE) are likely to realize intelligence surpassing task-specific constraints.

Conclusion

This paper provides a comprehensive formulation of world-model-based predictive and generative computation as the computational substrate of intelligence in both brain and AI. Hierarchical attention, large-scale architecture, prediction-error learning, and expert modularity comprise unifying principles enabling prediction, comprehension, and generation. These findings reframe conventional dichotomies, establish actionable guidelines for AI system design, and open new theoretical and translational avenues towards understanding and emulating adaptive intelligence (2512.02419).
