Brain Foundation Models
- BFMs are large-scale neural architectures pre-trained via self-supervised learning on diverse brain data to create universal, transferable representations.
- They integrate innovations such as masked signal modeling, contrastive learning, and dynamic task adaptation to excel in clinical, cognitive, and neuroimaging applications.
- Empirical results show state-of-the-art performance in cross-subject BCIs, neuroimaging segmentation, and cognitive decoding while addressing modality challenges.
Brain Foundation Models (BFMs) are large-scale, pre-trained neural architectures designed for universal representation and transfer learning across diverse brain data modalities, tasks, and application domains. Originating from the “foundation model” paradigm, BFMs leverage massive unlabeled or weakly labeled neural datasets—including EEG, fMRI, MEG, and behavioral recordings—which they process using self-supervised objectives to produce generic, transferable feature embeddings. These embeddings enable rapid, data-efficient adaptation to downstream tasks such as clinical diagnosis, cognitive decoding, neuroimaging analysis, brain-computer interfaces (BCIs), and mechanistic brain science. BFMs incorporate architectural innovations to address the challenges of heterogeneous, high-dimensional, and artifact-prone brain data. Recent advances include robust handling of missing modalities, dynamic task adaptation, neurophysiological interpretability, and principled data-governance approaches. Empirical evaluations demonstrate state-of-the-art generalization on cross-subject BCIs, multimodal neuroimaging, pathology, affective decoding, and simulation of biological neural systems.
1. Core Definitions and Foundational Principles
A Brain Foundation Model is a parameterized encoder (or encoder–decoder) f_θ: x ↦ z, where x ∈ ℝ^{C×T} denotes multichannel time series (e.g., EEG, fMRI, or spikes) and z is a latent representation. BFMs are typically pre-trained on large-scale, unlabeled datasets via self-supervised learning (SSL) objectives such as masked signal modeling, contrastive learning, or generative pretext tasks (Zhou et al., 1 Mar 2025, Shen et al., 12 Feb 2026). The universal representations enable few-shot and zero-shot transfer, supporting rapid adaptation to diverse downstream tasks and modalities.
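As a toy illustration of the mapping f_θ (the shapes and the linear/mean-pooling "encoder" here are illustrative assumptions, not any published BFM architecture):

```python
import numpy as np

def encode(x: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Toy encoder f_theta: maps a (channels, time) signal to a d-dim embedding.

    Real BFMs use deep transformer/CNN stacks; a single linear projection plus
    temporal mean-pooling stands in for the whole pipeline here.
    """
    h = W @ x                 # (d, time): per-timestep features
    return h.mean(axis=1)     # (d,): pooled, length-invariant embedding

rng = np.random.default_rng(0)
x = rng.standard_normal((64, 256))   # e.g. 64-channel EEG, 256 samples
W = rng.standard_normal((128, 64))   # d = 128 latent dimensions
z = encode(x, W)                     # z.shape == (128,)
```

The mean-pooling step is one simple way to obtain the input-length invariance discussed below; attention pooling or CLS tokens serve the same role in transformer BFMs.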
Key Objectives:
- Encode canonical spatial, temporal, spectral, and cross-modal patterns of brain signals in device- and subject-agnostic latent spaces (Vainshtein et al., 28 Mar 2025, Ghamizi et al., 16 Jun 2025, Hanley et al., 23 Jan 2026).
- Achieve robust transfer—across tasks, subjects, acquisition protocols, and modalities—by leveraging SSL objectives that do not depend on specific downstream labels (Zhou et al., 1 Mar 2025, Altaheri et al., 19 Jun 2025, Shen et al., 12 Feb 2026).
- Form the substrate for parameter-efficient adaptation approaches, including prompt-tuning, adapter injection, and linear probing, with minimal labeled data (Zhou et al., 1 Mar 2025, Enda et al., 19 Jan 2025, Wu et al., 14 Jul 2025).
Foundational Features:
- Pre-training with diverse input types: EEG, iEEG, fMRI/BOLD, structural MRI, intracranial signals, text, behavioral data (Ghamizi et al., 16 Jun 2025, Dong et al., 29 Sep 2025, Shen et al., 12 Feb 2026).
- Architectural invariance to input length, channel layout, sampling rate, and subject characteristics (Luu et al., 4 Nov 2025, Liu et al., 30 Aug 2025).
- Compatibility with a spectrum of adaptation strategies, from full-model fine-tuning to lightweight task-specific heads (Vainshtein et al., 28 Mar 2025, Enda et al., 19 Jan 2025, Wu et al., 14 Jul 2025).
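The lightweight end of that adaptation spectrum, a linear probe on frozen embeddings, can be sketched as follows (the ridge-regression head and the synthetic task are illustrative assumptions):

```python
import numpy as np

def linear_probe_fit(Z, Y, l2=1e-2):
    """Fit a ridge-regularized linear head on frozen embeddings Z (n, d).

    Y is one-hot (n, k). The backbone stays frozen; only this head is
    trained -- the cheapest point on the adaptation spectrum.
    """
    d = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + l2 * np.eye(d), Z.T @ Y)  # (d, k)

def linear_probe_predict(Z, Wh):
    return (Z @ Wh).argmax(axis=1)

rng = np.random.default_rng(1)
Z = rng.standard_normal((100, 16))        # pretend frozen BFM embeddings
labels = (Z[:, 0] > 0).astype(int)        # toy linearly decodable task
Y = np.eye(2)[labels]
Wh = linear_probe_fit(Z, Y)
acc = (linear_probe_predict(Z, Wh) == labels).mean()
```

If the pretrained representation already encodes the task-relevant structure, such a probe typically recovers most of the achievable accuracy at a tiny fraction of the tuning cost.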
2. Model Architectures and Pretraining Objectives
BFM architectures are built on a variety of deep learning backbones, unified by their support for scalable self-supervised training and multi-task transfer (Ghamizi et al., 16 Jun 2025, Zhou et al., 1 Mar 2025, Shen et al., 12 Feb 2026).
Representative Architecture Classes:
- Patch/Token-based Transformers: Temporal, spatial, or spectrotemporal patches are linearly embedded and processed via multi-head self-attention. Models such as LaBraM, CBraMod, and BrainHarmonix employ variants of ViT-like or convolutional-transformer hybrids (Shama et al., 29 Jan 2026, Dong et al., 29 Sep 2025, Shen et al., 12 Feb 2026).
- 3D U-Nets and CNNs: Used for volumetric brain imaging (e.g., MRI/CT), these architectures capture spatial continuity and are extended for multi-task learning (Liu et al., 30 Aug 2025, Luu et al., 4 Nov 2025).
- Graph Neural Networks: Integrate anatomical or electrode topology for robustness to permutation and channel missingness (Shen et al., 12 Feb 2026).
- Autoencoders with Masking: Masked autoencoding, both in time and space, enforces context-aware reconstruction, e.g., MAEEG, BrainMAE, BrainFM-MRI (Luu et al., 4 Nov 2025, Dong et al., 29 Sep 2025).
- Contrastive and Generative SSL: InfoNCE-form contrastive objectives, masked reconstruction, and hybrid schemes (dual SSL) dominate the pretraining landscape (Altaheri et al., 19 Jun 2025, Shen et al., 12 Feb 2026, Wu et al., 14 Jul 2025).
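The patch/token construction used by the transformer class above can be sketched as follows (patch length, channel-wise patching, and the embedding matrix are illustrative choices, not the exact tokenization of LaBraM or CBraMod):

```python
import numpy as np

def patchify_embed(x, patch_len, E):
    """Split a (channels, time) signal into non-overlapping temporal patches
    and linearly embed each one, as in patch/token-based BFM transformers.

    Returns (n_tokens, d): one token per (channel, patch) pair, ready for
    multi-head self-attention.
    """
    C, T = x.shape
    n = T // patch_len
    patches = x[:, :n * patch_len].reshape(C, n, patch_len)  # (C, n, patch_len)
    tokens = patches.reshape(C * n, patch_len) @ E           # (C*n, d)
    return tokens

rng = np.random.default_rng(2)
x = rng.standard_normal((8, 200))   # 8 channels, 200 samples
E = rng.standard_normal((25, 32))   # patch_len=25 -> d=32 embedding
tokens = patchify_embed(x, 25, E)   # 8 channels x 8 patches = 64 tokens
```

Because tokens are produced per (channel, patch) pair, adding positional and channel embeddings at this stage is what lets such models tolerate varying channel layouts and recording lengths.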
Loss Functions and Schema:
- Masked-Signal Reconstruction: a reconstruction loss (typically MSE) targets masked segments or patches (Wu et al., 14 Jul 2025).
- Contrastive Loss: InfoNCE objective aligns augmented samples via batch negatives.
- Variance–Covariance Regularization: Encourages feature decorrelation and non-collapse in high-dimensional SSL (Luu et al., 4 Nov 2025).
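Two of these objectives can be sketched in a few lines (batch size, temperature, and the hinge threshold are illustrative; this is a simplification of the cited training schemes):

```python
import numpy as np

def info_nce(za, zb, tau=0.1):
    """InfoNCE over a batch: row i of za should match row i of zb;
    all other rows act as batch negatives."""
    za = za / np.linalg.norm(za, axis=1, keepdims=True)
    zb = zb / np.linalg.norm(zb, axis=1, keepdims=True)
    logits = za @ zb.T / tau                                   # (n, n)
    logp = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(logp))

def variance_penalty(z, gamma=1.0):
    """Hinge on per-dimension std (VICReg-style): penalizes dimensions
    whose std falls below gamma, discouraging representational collapse."""
    std = z.std(axis=0)
    return np.mean(np.maximum(0.0, gamma - std))

rng = np.random.default_rng(3)
z = rng.standard_normal((32, 16))
aligned = info_nce(z, z)                    # matched positives -> low loss
mismatched = info_nce(z, rng.permutation(z))  # broken pairing -> high loss
collapsed = variance_penalty(np.zeros((32, 16)))  # full collapse -> 1.0
```

The variance term is what prevents the degenerate solution where the encoder maps every input to the same point, a failure mode that pure reconstruction or contrastive losses do not always rule out.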
3. Transfer Protocols, Benchmarks, and Empirical Outcomes
BFMs are benchmarked on a variety of downstream protocols: cross-subject, multi-subject, few-shot, and zero-shot transfer on both neurophysiological signals and neuroimaging (Zhou et al., 1 Mar 2025, Shen et al., 12 Feb 2026, Wu et al., 14 Jul 2025).
Standardized Evaluation Frameworks:
- AdaBrain-Bench and Brain4FMs: Integrate 15+ representative models, 18+ standardized datasets (spanning EEG, iEEG, MRI, BCI, emotion, disease diagnosis, and cognitive tasks), and unified metrics: balanced accuracy, macro F1, AUROC, and transfer score (Wu et al., 14 Jul 2025, Shen et al., 12 Feb 2026).
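Balanced accuracy, the headline metric in these benchmarks, is simply the mean per-class recall; a minimal implementation (the toy labels are illustrative):

```python
import numpy as np

def balanced_accuracy(y_true, y_pred):
    """Mean per-class recall: robust to the class imbalance common in
    clinical EEG, where plain accuracy rewards majority-class bias."""
    classes = np.unique(y_true)
    recalls = [(y_pred[y_true == c] == c).mean() for c in classes]
    return float(np.mean(recalls))

y_true = np.array([0, 0, 0, 0, 1, 1])
y_pred = np.array([0, 0, 0, 0, 1, 0])
score = balanced_accuracy(y_true, y_pred)
# plain accuracy is 5/6, but balanced accuracy is (1.0 + 0.5) / 2 = 0.75
```

On imbalanced seizure-detection or diagnosis datasets, this is why balanced accuracy and macro F1 are reported in place of raw accuracy.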
Empirical Observations:
- EEG and BCI: Large transformer-based BFMs (LaBraM, CBraMod, BIOT) consistently outperform traditional and "from scratch" baselines in cross-subject and few-shot settings. For cross-subject adaptation, LaBraM reaches up to 64.61% balanced accuracy (averaged over 13 datasets) and CBraMod 62.66%, versus 58.12% for the best traditional baseline (Wu et al., 14 Jul 2025).
- Neuroimaging: Modality-agnostic and dynamic-modality models (BrainFM, BrainFM-MRI, BrainHarmonix) achieve robust segmentation and synthesis performance across unseen MRI/CT contrasts and are resilient to missing input modalities (Luu et al., 4 Nov 2025, Dong et al., 29 Sep 2025, Liu et al., 30 Aug 2025).
- Pathology: Frozen foundation encoders with linear probing (ViT-based UNI, Prov-GigaPath) reach macro-recall >0.88 using as few as 10–25 histopathology patches per case in brain tumor classification. Full fine-tuning is frequently suboptimal due to catastrophic forgetting (Enda et al., 19 Jan 2025).
- Cognitive State and Mental Workload: Freezing the backbone and training small adaptation heads enables near real-time cognitive load estimation, with Pearson-correlation performance exceeding CNN/LSTM baselines, and supports rapid personalization (Shama et al., 29 Jan 2026).
4. Application Domains and Biological Relevance
BFMs enable a wide spectrum of neuroscience, clinical, and translational applications:
1. Brain–Computer Interfaces and Cognitive State Decoding:
- Robust cross-task and cross-user adaptation for motor imagery, emotion recognition, sleep staging, and workload monitoring (Zhou et al., 1 Mar 2025, Shama et al., 29 Jan 2026, Wu et al., 28 Jul 2025).
- Formal domain adaptation protocols leveraging BFM latent similarity metrics (e.g., Cauchy-Schwarz divergence) for efficient cross-subject adaptation and source selection (Wu et al., 28 Jul 2025).
2. Clinical Diagnostics and Neuroimaging:
- Universal encoders for brain MRI/CT that are robust to contrast, protocol, and missingness, outperforming single-task or calibration-specific U-Nets and CNNs (Luu et al., 4 Nov 2025, Liu et al., 30 Aug 2025, Ghamizi et al., 16 Jun 2025).
- Downstream linear probes or lightweight adapters suffice for disease discrimination and anatomical segmentation (e.g., tumor, MS lesion, Alzheimer’s) (Enda et al., 19 Jan 2025, Luu et al., 4 Nov 2025).
3. Cognitive and Neurobiological Insights:
- Multimodal BFMs can simulate brain-like response patterns and predict fMRI activation; models such as BrainHarmonix and multimodal contrastive transformers exhibit biologically aligned latent spaces and outperform unimodal counterparts in region-wise encoding analyses (Dong et al., 29 Sep 2025, Lu et al., 2022).
- Manifold analysis of BFM internal representations reveals modular transformation from retina-like to cortex-like dynamics, mapping to biological stages (feed-forward, recurrent, readout) (Bertram et al., 26 Nov 2025).
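The latent-similarity source selection mentioned under item 1 can be sketched with a kernel-based Cauchy–Schwarz divergence estimator (the Gaussian kernel, bandwidth, and sample sizes are illustrative assumptions, not the exact estimator of the cited protocol):

```python
import numpy as np

def _kernel_mean(X, Y, sigma):
    """Mean Gaussian-kernel value over all cross pairs: an RKHS inner
    product between the empirical mean embeddings of X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma**2)).mean()

def cs_divergence(X, Y, sigma=1.0):
    """Cauchy-Schwarz divergence estimate between two sets of latent
    embeddings: 0 iff the estimated densities coincide, larger for more
    dissimilar source/target subjects."""
    cross = _kernel_mean(X, Y, sigma)
    return -np.log(cross**2 /
                   (_kernel_mean(X, X, sigma) * _kernel_mean(Y, Y, sigma)))

rng = np.random.default_rng(4)
A = rng.standard_normal((200, 8))         # target subject's embeddings
B = rng.standard_normal((200, 8))         # source with matching distribution
C = rng.standard_normal((200, 8)) + 3.0   # source with shifted distribution
near, far = cs_divergence(A, B), cs_divergence(A, C)
```

Ranking candidate source subjects by this divergence against the target's embeddings gives a label-free criterion for picking which source data to adapt from.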
5. Advanced Adaptation, Interpretability, and Prompting
Task-Specific Tokens and Modular Adaptation:
- Methods such as Task Tokens introduce learnable encoders to modulate frozen BFMs for new control tasks, balancing between prompt engineering and dense reward learning (Vainshtein et al., 28 Mar 2025).
- Fine-tuning only small adaptation heads delivers parameter efficiency and preserves human-likeness and generality (Vainshtein et al., 28 Mar 2025).
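The frozen-backbone pattern behind Task Tokens can be sketched as follows (the concatenation scheme, shapes, and head are illustrative assumptions, not the method's exact design):

```python
import numpy as np

def adapt_with_task_head(z, task_token, Wh):
    """Frozen-backbone adaptation: the pretrained embedding z is left
    untouched; a learnable per-task token is concatenated and only the
    small head Wh is trained for the new task."""
    return np.concatenate([z, task_token]) @ Wh   # task-specific output

rng = np.random.default_rng(5)
z = rng.standard_normal(128)           # frozen BFM embedding (not updated)
task_token = rng.standard_normal(16)   # learnable, one per downstream task
Wh = rng.standard_normal((144, 4))     # small head: 4 task outputs
out = adapt_with_task_head(z, task_token, Wh)   # out.shape == (4,)
```

Because only the token and head carry gradients, each new task adds a few thousand parameters while the pretrained generality of the backbone is preserved.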
Interpretability:
- BFM-based cognitive load pipelines employ Partition SHAP to attribute channel/region importance, revealing neurophysiologically plausible relevance patterns (e.g., dorsolateral prefrontal cortex in workload tasks) (Shama et al., 29 Jan 2026).
- Visualization and clustering of task/subject representations via t-SNE, PCA, and diffusion maps support biological interpretability and transfer diagnostics (Bertram et al., 26 Nov 2025, Shama et al., 29 Jan 2026, Zhou et al., 1 Mar 2025).
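A PCA projection of the kind used for such transfer diagnostics can be computed directly from the embedding matrix via SVD (the two synthetic "subjects" are illustrative):

```python
import numpy as np

def pca_project(Z, k=2):
    """Project embeddings Z (n, d) onto their top-k principal components,
    the kind of 2-D view used to inspect task/subject clustering."""
    Zc = Z - Z.mean(axis=0)                        # center features
    _, _, Vt = np.linalg.svd(Zc, full_matrices=False)
    return Zc @ Vt[:k].T                           # (n, k) coordinates

rng = np.random.default_rng(6)
subj_a = rng.standard_normal((50, 32))         # one subject's embeddings
subj_b = rng.standard_normal((50, 32)) + 2.0   # offset cluster: second subject
coords = pca_project(np.vstack([subj_a, subj_b]))
```

Well-separated subject clusters in such plots signal residual subject-specific structure in the latent space, a warning sign for cross-subject transfer.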
Multimodal and Prompt Tuning:
- Prompting and conditioning leverage text, joint targets, or user-specified priors, enabling multi-modal guidance and downstream task alignment without extensive retraining (Vainshtein et al., 28 Mar 2025).
6. Limitations, Data Governance, and Future Directions
Critical Limitations:
- Performance saturates with cohort and data size; zero-shot generalization, particularly for out-of-distribution and rare-class settings, remains limited (Shen et al., 12 Feb 2026).
- Interpretability and uncertainty quantification lag behind deployment requirements, especially in clinical and high-stakes applications (Ghamizi et al., 16 Jun 2025, Hanley et al., 23 Jan 2026).
- Current models depend on training data coverage; domain and task gaps in public datasets may propagate representational bias and demographic skew (Hanley et al., 23 Jan 2026).
Ethics, Privacy, and Governance:
- Neural data demands strict data governance, encompassing consent, privacy, bias audit, and procedural fairness. Membership inference, cross-context leakage, and disproportionate benefit accrual challenge the deployment of open BFMs (Hanley et al., 23 Jan 2026).
- Emerging safeguards include provenance tracking, controlled weight/API release, documentation standards, and benefit-sharing via data trusts.
Prospective Research Directions:
- Federated learning, privacy-preserving model update, and continual learning pipelines for multi-site, longitudinal, and population-scale neural data (Hanley et al., 23 Jan 2026, Altaheri et al., 19 Jun 2025).
- Instruction tuning and LLM interfaces for model–user communication, facilitating explainable and interactive neuro-AI (Shen et al., 12 Feb 2026).
- Deep integration of multi-modal neural, behavioral, and environmental data streams, extending beyond unimodal EEG/fMRI (Dong et al., 29 Sep 2025).
- Systematic neuroscientific benchmarking for structural, functional, and cognitive alignment—e.g., using neuroimaging-derived guidance in pretraining and evaluation (Donoso, 17 Jan 2026, Dong et al., 29 Sep 2025).
7. Summary Table: Architectural, Data, and Adaptation Taxonomy in Representative BFMs
| Model/Benchmark | Input Modalities | SSL Paradigm | Downstream Tasks | Notable Features |
|---|---|---|---|---|
| LaBraM/CBraMod | EEG (≥64ch), freq & time | Masked reconstruction | Workload, MI, emotion, sleep | Region pooling, flexible head tuning |
| BrainHarmonix | MRI 3D, fMRI time series | Masked AE + JEPA | Diagnosis, cognition | Multimodal 1D fusion, TR-adaptive |
| BrainFM-MRI | MRI (multi-sequence) | Masked AE + VICReg | Segmentation, classification | Dynamic modality integration, CLN |
| BrainFM (UNet) | MRI, CT (multi-contrast) | Multi-task, mild-severe synth | Synthesis, segmentation, reg | Robust to contrast, artifact, OOD |
| AdaBrain-Bench | EEG (non-invasive) | Masked/contrastive | 7 BCI domains | Cross/few-shot eval, transfer score |
| Brain4FMs | EEG, iEEG (clinical/HC) | Masked/contrastive | Diagnosis, cognitive, sleep | Plug-and-play API, spatial modeling |
| Multimodal CLIP | fMRI, image, text | Cross-modal contrast | Encoding alignment | ROI-level biological relevance |
All architectural and evaluation details are traceable to the cited sources.
References: (Zhou et al., 1 Mar 2025, Vainshtein et al., 28 Mar 2025, Ghamizi et al., 16 Jun 2025, Wu et al., 14 Jul 2025, Wu et al., 28 Jul 2025, Liu et al., 30 Aug 2025, Dong et al., 29 Sep 2025, Luu et al., 4 Nov 2025, Bertram et al., 26 Nov 2025, Donoso, 17 Jan 2026, Shama et al., 29 Jan 2026, Hanley et al., 23 Jan 2026, Shen et al., 12 Feb 2026, Lu et al., 2022, Cetin et al., 2024, Enda et al., 19 Jan 2025, Bobrin et al., 19 May 2025, Altaheri et al., 19 Jun 2025)