A Unified Geometric Space Bridging AI Models and the Human Brain

Published 28 Oct 2025 in cs.AI (arXiv:2510.24342v1)

Abstract: For decades, neuroscientists and computer scientists have pursued a shared ambition: to understand intelligence and build it. Modern artificial neural networks now rival humans in language, perception, and reasoning, yet it is still largely unknown whether these artificial systems organize information as the brain does. Existing brain-AI alignment studies have shown the striking correspondence between the two systems, but such comparisons remain bound to specific inputs and tasks, offering no common ground for comparing how AI models with different kinds of modalities (vision, language, or multimodal) are intrinsically organized. Here we introduce a groundbreaking concept of Brain-like Space: a unified geometric space in which every AI model can be precisely situated and compared by mapping its intrinsic spatial attention topological organization onto canonical human functional brain networks, regardless of input modality, task, or sensory domain. Our extensive analysis of 151 Transformer-based models spanning state-of-the-art large vision models, LLMs, and large multimodal models uncovers a continuous arc-shaped geometry within this space, reflecting a gradual increase of brain-likeness; different models exhibit distinct distribution patterns within this geometry associated with different degrees of brain-likeness, shaped not merely by their modality but by whether the pretraining paradigm emphasizes global semantic abstraction and whether the positional encoding scheme facilitates deep fusion across different modalities. Moreover, the degree of brain-likeness for a model and its downstream task performance are not "identical twins". The Brain-like Space provides the first unified framework for situating, quantifying, and comparing intelligence across domains, revealing the deep organizational principles that bridge machines and the brain.

Summary

  • The paper introduces Brain-like Space and quantitatively maps AI models' attention patterns to human brain networks, revealing an arc-shaped gradient of brain-likeness.
  • The study employs graph theoretic metrics and PCA clustering to compare 151 Transformer models, showing that pretraining paradigms and positional encodings significantly impact brain-likeness.
  • The work highlights that brain-likeness, while reflecting high-level network alignment, is decoupled from downstream task performance, guiding future model design and evaluation.

Introduction

This paper introduces the concept of "Brain-like Space," a unified geometric framework for situating and comparing the intrinsic organizational topology of AI models with canonical human functional brain networks. The approach leverages graph-theoretic similarity measures to map the spatial attention patterns of Transformer-based models onto seven functional brain networks derived from resting-state fMRI. The study encompasses 151 Transformer-based models, including large vision models (LVMs), LLMs, and large multimodal models (LMMs), and reveals a continuous arc-shaped geometry within this space, reflecting a gradient of brain-likeness. The analysis demonstrates that brain-likeness is not solely determined by model modality but is systematically influenced by pretraining paradigms and positional encoding schemes.

Construction of Brain-like Space

The methodology involves several key steps:

  1. Functional Brain Network Extraction: A group-level functional connectivity matrix is computed from rs-fMRI data of 1042 subjects, parcellated into seven canonical networks: limbic (LIM), visual (VIS), somatomotor (SMN), dorsal attention (DAN), ventral attention (VAN), frontoparietal (FPN), and default mode (DMN).
  2. Spatial Attention Graphs in AI Models: For each attention head in a Transformer model, a spatial attention graph is constructed, with nodes representing spatial patches and edge weights derived from attention scores.
  3. Graph-Theoretic Metrics: Five metrics—average clustering coefficient, modularity, degree standard deviation, average shortest path length, and global efficiency—are extracted for both brain and model graphs.
  4. Similarity Computation: Cosine similarity between the feature vectors of model attention heads and brain networks yields a seven-dimensional representation for each attention head.
  5. Dimensionality Reduction and Clustering: PCA projects the seven-dimensional space to two dimensions, explaining 96.07% of the variance. K-means clustering (k=4) segments attention heads into clusters reflecting increasing brain-likeness.
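The five-step construction above can be sketched end to end. This is a minimal illustration with toy random graphs standing in for the brain networks and attention graphs (the paper derives these from rs-fMRI parcellations and attention scores, respectively); the graph sizes, seeds, and use of `networkx`/`scikit-learn` are assumptions for the sketch, not the authors' implementation.

```python
import numpy as np
import networkx as nx
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def graph_features(G):
    """The five graph-theoretic metrics used to characterize a graph."""
    degrees = np.array([d for _, d in G.degree()])
    communities = nx.community.greedy_modularity_communities(G)
    return np.array([
        nx.average_clustering(G),                 # average clustering coefficient
        nx.community.modularity(G, communities),  # modularity
        degrees.std(),                            # degree standard deviation
        nx.average_shortest_path_length(G),       # average shortest path length
        nx.global_efficiency(G),                  # global efficiency
    ])

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy stand-ins: 7 "brain network" graphs and 40 "attention head" graphs.
rng = np.random.default_rng(0)
brain_graphs = [nx.connected_watts_strogatz_graph(30, 6, 0.1, seed=s) for s in range(7)]
head_graphs = [nx.connected_watts_strogatz_graph(30, 4, p, seed=s)
               for s, p in enumerate(rng.uniform(0.05, 0.8, 40))]

brain_feats = [graph_features(G) for G in brain_graphs]
head_feats = [graph_features(G) for G in head_graphs]

# Step 4: cosine similarity against each of the 7 networks
# gives a 7-dimensional representation per attention head.
reps = np.array([[cosine(h, b) for b in brain_feats] for h in head_feats])

# Step 5: project to 2-D with PCA, then cluster heads into k=4 groups.
coords = PCA(n_components=2).fit_transform(reps)
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(coords)
```

With real data, each row of `reps` locates one attention head in the seven-dimensional Brain-like Space, and the cluster labels correspond to the paper's C1-C4 gradient of brain-likeness.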

Key Findings

Structured Distribution and Model Categories

  • Language-dominant models (LLM, LLM-RoPE, LMM-language, LMM-language-RoPE) are highly concentrated in the most brain-like cluster (C4), with up to 89.2% of attention heads assigned to C4.
  • Vision-dominant models show greater heterogeneity; those emphasizing global semantic abstraction (e.g., ViT-Variants-global-semantic) are more brain-like, while local reconstructive models (e.g., DeiT3, MAE) are less so.
  • Multimodal models exhibit distinct patterns; RoPE-based positional encoding in LMMs facilitates deep fusion and increases brain-likeness in both language and vision components.

Influence of Pretraining Paradigms

  • Data Augmentation: Strategies like AugReg (Mixup, RandAugment) promote global disruption, driving models toward invariance to local distortions and enhancing matches with higher-order cognitive networks. In contrast, 3-Augment (DeiT3) focuses on local stability, limiting brain-likeness.
  • Training Objectives: Semantic abstraction objectives (DINO, DINOv3, BEiT, BEiTv2) yield high brain-likeness and strong matches with FPN and DMN. Detail reconstruction objectives (MAE, DINOv2) bias models toward VIS, reducing brain-likeness.
  • Distillation: CNN-based teacher distillation (DeiT) suppresses global attention mechanisms, shifting models toward local inductive bias and reducing brain-likeness, especially as model scale increases.

Positional Encoding Schemes

  • RoPE-based LLMs and LMMs: RoPE enables a unified geometric prior, facilitating deep cross-modal fusion and increasing brain-likeness, particularly in vision components of multimodal models.
  • Learnable Positional Encoding: Models like CLIP and BLIP exhibit functional localization, with vision components less brain-like and language components highly brain-like, reflecting a division of labor.

Brain-likeness vs. Downstream Task Performance

  • A positive but non-significant correlation (Pearson's r = 0.266, p = 0.1555) exists between brain-likeness scores and ImageNet-1k Top-1 accuracy across 30 vision models. Models optimized for engineering efficiency or robustness may diverge from brain-like organization.
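The statistical test behind this claim is a standard Pearson correlation with a two-sided p-value. A minimal sketch, using synthetic scores in place of the paper's actual brain-likeness and ImageNet measurements (the data below are fabricated for illustration only):

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical stand-ins for 30 vision models: brain-likeness scores
# and ImageNet-1k Top-1 accuracies, weakly and noisily related.
rng = np.random.default_rng(1)
brain_likeness = rng.uniform(0.3, 0.9, 30)
top1_acc = 0.70 + 0.05 * brain_likeness + rng.normal(0, 0.04, 30)

r, p_val = pearsonr(brain_likeness, top1_acc)
# A modest r with p > 0.05, as reported in the paper (r = 0.266,
# p = 0.1555), fails to reject the hypothesis of no linear relation.
```

The point of the paper's result is that a weak, non-significant r of this kind supports treating brain-likeness as a metric decoupled from benchmark accuracy.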

Layer-wise analysis reveals a hierarchical matching pattern: shallow layers align with LIM and VIS, while deeper layers match DAN, VAN, and DMN, mirroring the principal gradient of cortical organization. Larger models develop partial analogs of high-level cognitive processes, but only when scaling is compatible with the training objective.

Implications for Model Design and Evaluation

  • Pretraining Paradigm as Meta-Regulator: Global and semantic-level modeling in pretraining enhances brain-likeness and organizational similarity with higher-order networks. Local detail-focused objectives suppress brain-like structures.
  • Model Scale: Scaling acts as a catalyst for brain-likeness only when paired with compatible pretraining; otherwise, increased parameters may dilute brain-like efficiency.
  • Brain-likeness as an Independent Metric: Brain-likeness should be incorporated as an organizational-interpretive metric, independent of downstream task performance, to enhance model explainability and guide architecture design.

Graph-based Approach and Generalizability

The graph-based framework enables direct, modality-agnostic comparison of AI models and brain networks, facilitating cost-effective and generalizable brain-likeness assessment. The seven-dimensional Brain-like Space and derived brain-likeness score provide operational tools for model evaluation and architectural optimization.

Limitations and Future Directions

  • The current approach focuses on spatial attention; incorporating feature channel interactions could yield a more comprehensive assessment.
  • Finer-grained brain network atlases may improve correspondence.
  • Extension to non-Transformer architectures (MLPs, CNNs) is needed.
  • Dynamic analysis of brain-likeness evolution during reasoning remains an open direction.

Conclusion

This study establishes Brain-like Space as a unified geometric framework for quantifying and comparing the intrinsic organization of AI models and the human brain. The findings demonstrate that brain-likeness emerges from the interplay of architecture, pretraining paradigm, and positional encoding, and is not inherently coupled with downstream task performance. The proposed framework offers a principled approach for advancing a unified science of intelligence, bridging the gap between artificial and biological systems.