
Brain-Grounded Axes: Neural Representations

Updated 24 December 2025
  • Brain-grounded axes are interpretable dimensions derived from intrinsic neural activity that provide an internal coordinate system for analyzing neural representations.
  • They are extracted using methods such as PCA, ICA, and topological models, and are validated for neurophysiological plausibility and reproducibility across data modalities.
  • These axes have practical applications in LLM interpretability, spatial cognition, and neural modeling, offering actionable insights for controllable representation learning.

Brain-grounded axes are dimensions within neural, cognitive, or artificial representational spaces that are derived from brain activity itself, rather than imposed from external or behavioral supervision. These axes offer an intrinsic coordinate system rooted in the organization of neural states, and provide interpretable handles for analysis and manipulation of both biological and artificial neural representations. Their extraction, validation, and application span multiple domains, from LLM interpretability to spatial cognition and topological models of cortical activity.

1. Fundamental Concepts and Definitions

A brain-grounded axis is an interpretable direction in a representational state space, such that the axis is computed or validated using features, latent variables, or algebraic structures emerging directly from neural data rather than externally defined labels. For example, Tozzi & Peters define axes on a 3-sphere using resting-state fMRI activity mapped via stereographic projection to ℝ⁴, with axes corresponding to orthogonal basis vectors in ℝ⁴ and associated with pairs of large-scale brain networks or functional domains (Tozzi et al., 2015). In cognitive computational models, axes may correspond to geometric or topological dimensions defined by coactivity of neural assemblies, as in the orientation space model of place and head-direction cells (Dabaghian, 2021). In artificial intelligence, "brain-grounded axes" have recently been operationalized by constructing coordinate systems from magnetoencephalography (MEG) data, aligning LLM hidden states with brain-derived axes to achieve interpretability and steerability (Andric, 22 Dec 2025).

The defining feature is that these axes have neurophysiological validity, being discoverable or characterizable purely from the dynamics or structure of neural signals, and often correspond to functionally salient cognitive or perceptual variables.
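The hyperspherical construction of (Tozzi et al., 2015) can be illustrated with a short sketch. This is a simplified assumption about the mapping, not the authors' exact pipeline: 3-D activity coordinates are lifted onto the unit 3-sphere in ℝ⁴ via inverse stereographic projection, after which the four orthogonal basis vectors of ℝ⁴ serve as candidate axes.

```python
import numpy as np

def inverse_stereographic(x):
    """Map points in R^3 onto the unit 3-sphere in R^4.

    Inverse stereographic projection from the pole (0, 0, 0, 1):
    a point x with squared norm r2 maps to (2x, r2 - 1) / (r2 + 1),
    which always has unit norm, so it lies on S^3.
    """
    x = np.asarray(x, dtype=float)
    r2 = np.sum(x**2, axis=-1, keepdims=True)
    return np.concatenate([2 * x, r2 - 1], axis=-1) / (r2 + 1)

# Hypothetical example: three activity "centers of mass" in R^3.
pts = np.array([[0.0, 0.0, 0.0],
                [1.0, 2.0, -0.5],
                [-3.0, 0.1, 4.0]])
sphere_pts = inverse_stereographic(pts)

# Every image point lies on S^3, so its coordinates along the four
# orthogonal basis axes of R^4 are well defined.
print(np.linalg.norm(sphere_pts, axis=1))  # all ≈ 1.0
```

Once activity is embedded on S³, "axes" are simply the four coordinate directions of the ambient ℝ⁴, which the cited work associates with pairs of large-scale brain networks.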

2. Extraction and Construction Methodologies

The construction of brain-grounded axes typically involves a pipeline combining dimensionality reduction, latent variable extraction, and neurophysiological alignment:

  1. Data acquisition and pre-processing: High-dimensional neural data are acquired (e.g., fMRI, MEG, EEG), preprocessed (band-pass filtering, artifact rejection), and co-registered with relevant events (e.g., word onsets for language, video frames for perception).
  2. Feature construction: Neural state descriptors are computed, such as phase-locking value (PLV) matrices in MEG (128-dimensional via PCA) (Andric, 22 Dec 2025), coactivity complexes for place/head-direction assemblies (Dabaghian, 2021), or BOLD center-of-mass coordinates in hyperspherical models (Tozzi et al., 2015).
  3. Latent axis extraction: Dimensionality reduction and source separation techniques such as independent component analysis (ICA) are applied to the neural fingerprints, yielding a set of statistically independent, interpretable axes. Each axis corresponds to a direction in neural space where variation is maximally non-Gaussian and functionally meaningful (e.g., lexical frequency, function/content, animacy) (Andric, 22 Dec 2025).
  4. Algebraic/structural definitions: In topological and algebraic models, axes can be defined as equivalence classes of neural coactivity (parallel "lines" correspond to persistent activation under fixed head-direction assemblies) (Dabaghian, 2021) or as commutative, invertible transformation modules in latent space (for separable qualia spaces) (Ohmura et al., 2023).

An example of the construction workflow for brain-derived axes in LLM interpretability is summarized below:

| Step | Method | Output |
|------|--------|--------|
| MEG collection | 12 subjects, 60 naturalistic stories, 306 sensor time series, band-limited | Sensor × time MEG matrices |
| PLV computation | Hilbert transform → sliding window → PLV_{ij} = \|(1/T) ∑_{t=1}^{T} e^{i(φ_i(t) − φ_j(t))}\| | Windowed PLV matrices |
| Atlas construction | Ridge regression predicting PLV from linguistic features (e.g., GPT embedding-change, log-frequency, POS); averaging across windows/runs yields a 128-dim "brain fingerprint" | Word × 128 matrix |
| Axis extraction | ICA (FastICA, 20 components) on the stack of fingerprints | 20 brain-derived axes (directions in PLV-PCA space) |

This method produces a stable, cross-subject generalizable set of latent axes that can be mapped to categorical, lexical, or semantic features (Andric, 22 Dec 2025).
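The PLV step of this workflow can be sketched in a few lines of NumPy. The phase data and dimensions below are synthetic stand-ins, not the actual MEG pipeline of (Andric, 22 Dec 2025); the fingerprint vectorization is shown, while the PCA to 128 dimensions is only indicated in a comment.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sensors, T = 8, 2000

# Synthetic instantaneous phases (in practice obtained via a Hilbert
# transform of band-limited sensor data): sensors 0 and 1 share a
# common phase up to a constant offset; the rest are independent.
common = np.cumsum(rng.normal(0.0, 0.1, T))
phases = rng.uniform(0.0, 2.0 * np.pi, (n_sensors, T))
phases[0] = common
phases[1] = common + 0.4

def plv_matrix(phases):
    """PLV_ij = |(1/T) * sum_t exp(i * (phi_i(t) - phi_j(t)))|."""
    z = np.exp(1j * phases)                  # (n_sensors, T)
    return np.abs(z @ z.conj().T) / phases.shape[1]

plv = plv_matrix(phases)
# Phase-locked pair -> PLV near 1; independent pairs -> near 0.
print(round(plv[0, 1], 3), round(plv[0, 2], 3))

# Fingerprint: vectorized upper triangle, one row per window/word;
# a PCA to ~128 dims would follow in the full pipeline.
iu = np.triu_indices(n_sensors, k=1)
fingerprint = plv[iu]
print(fingerprint.shape)
```

Stacking one such fingerprint per word and running FastICA on the stack yields the latent axes described above.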

3. Validation and Functional Interpretation

A central requirement for brain-grounded axes is external validation: the axes must correlate with independently characterized features, and demonstrate robustness across data splits, controls, and experimental variations.

  • Lexical and semantic validation: ICA axes extracted from MEG-derived word fingerprints in (Andric, 22 Dec 2025) were validated against independent lexica: axis 15 correlates r=0.51 with log-frequency, axis 2 indexes animacy (Cohen's d=0.53 after covariate correction), and other axes track valence/arousal, function/content, or concreteness. Out-of-fold stability of axes was |r|=0.82–0.97.
  • Control for circularity: Atlases rebuilt without embedding-change or using non-GPT embeddings (word2vec) yielded matched axes with |r|=0.64–0.95, indicating that axes are not driven by idiosyncrasies of linguistic model features alone.
  • Spatial axis specificity: In EEG-based 6D pose decoding (Dai et al., 16 Jul 2025), gradient-based channel attribution identifies spatial axes in which position correlates with central-parietal activity and orientation with lateral occipito-temporal electrodes, aligning the emergent axes with distinct spatial computational roles.
  • Topological/geometric formalization: Finite affine geometry models in hippocampal mapping (Dabaghian, 2021) show how "lines" (axes) follow from persistent activation under fixed head-direction assemblies, satisfying axioms of parallelism, collinearity, and incidence, with "pose-simplices" explicitly encoding orientation.
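The validation statistics used above (axis–lexicon correlation, Cohen's d for category effects, split-half stability) are standard and easy to reproduce on any set of axis scores. A minimal sketch on synthetic data follows; all names and effect sizes here are illustrative, not values from the cited studies.

```python
import numpy as np

rng = np.random.default_rng(1)
n_words = 500

# Hypothetical axis scores built to correlate with a "log-frequency"
# lexicon and to separate an "animate" category -- stand-ins for the
# independent validators used in the cited work.
log_freq = rng.normal(0.0, 1.0, n_words)
axis_score = 0.6 * log_freq + rng.normal(0.0, 1.0, n_words)
animate = rng.random(n_words) < 0.5
cat_score = rng.normal(0.0, 1.0, n_words) + 0.5 * animate

def pearson_r(a, b):
    a, b = a - a.mean(), b - b.mean()
    return (a @ b) / np.sqrt((a @ a) * (b @ b))

def cohens_d(x, y):
    """Standardized mean difference with pooled standard deviation."""
    nx, ny = len(x), len(y)
    pooled = np.sqrt(((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1))
                     / (nx + ny - 2))
    return (x.mean() - y.mean()) / pooled

r = pearson_r(axis_score, log_freq)                     # lexical validation
d = cohens_d(cat_score[animate], cat_score[~animate])   # category effect

# Split-half stability: correlate axis scores re-estimated on two
# halves (perturbed copies stand in for odd/even data splits).
half_a = axis_score + rng.normal(0.0, 0.3, n_words)
half_b = axis_score + rng.normal(0.0, 0.3, n_words)
stability = abs(pearson_r(half_a, half_b))
print(round(r, 2), round(d, 2), round(stability, 2))
```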

The functional interpretation assigns cognitive or semantic content to each axis: e.g., frequency axes modulate lexical accessibility, function/content axes reflect syntactic choice, spatial axes encode egocentric 6D pose, and hyperspherical axes index macro-scale network antagonism or temporal flow (Tozzi et al., 2015).

4. Applications in Artificial and Biological Systems

Brain-grounded axes have been directly leveraged for both reading and steering neural state representations, as well as for theoretical models of neural coding:

  • LLM interpretability and control: By training lightweight ridge-regression adapters to predict brain-derived axis values from LLM hidden states (without model fine-tuning), it becomes possible to map the internal state of an LLM (e.g., TinyLlama, Qwen2-0.5B, GPT-2) onto neurophysiologically validated coordinate axes. Steering generation by adding scaled vectors along axis k (h'_t = h_t + αv_k) in mid or high LLM layers yields controllable lexical/semantic outcomes. Manipulation along the frequency axis systematically increases mean log-frequency and improves perplexity relative to random or text-only steering controls. Function/content axes affect text-level part-of-speech statistics across models and layers (Andric, 22 Dec 2025).
  • Model robustness and axis structure: In ANN-based brain models, McNeal & Ratan Murty demonstrate that adversarial probes reveal distinct local coding axes, with robustified models exhibiting stable, semantically meaningful and transferable axes, unlike the fragile, architecture-specific axes of non-robust models (McNeal et al., 27 Sep 2025). These directions are identified via the local Jacobian of voxel-wise response predictions and are quantitatively evaluated through principal angle cosine and transfer energy.
  • Spatial cognition and geometry: Discrete geometry models show how place and head-direction cell coactivity defines a finite set of "axis directions" supporting path planning, vector navigation, and learning in topological/affine spaces (Dabaghian, 2021). EEG and fMRI approaches recover spontaneous brain-grounded spatial axes that align with position and orientation (Dai et al., 16 Jul 2025).
  • Qualia discrimination and subspace factorization: Algebraic independence of neural transformation modules in latent space enables autonomous splitting of representations into irreducible metric subspaces associated with distinct qualia types (e.g., color, shape), operationalizing brain-grounded axes as commutative, invertible latent actions (Ohmura et al., 2023).
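The read-and-steer loop described for LLMs above can be sketched generically: fit a closed-form ridge readout from hidden states to axis values, then nudge a hidden state along one axis direction with h'_t = h_t + αv_k. The data below are synthetic (the cited work uses real MEG-derived axes and actual LLM hidden states), so this is a sketch of the mechanism, not the published adapter.

```python
import numpy as np

rng = np.random.default_rng(2)
n_tokens, d_hidden, n_axes = 400, 64, 20

# Synthetic "hidden states" and ground-truth unit axis directions v_k.
V = rng.normal(size=(n_axes, d_hidden))
V /= np.linalg.norm(V, axis=1, keepdims=True)
H = rng.normal(size=(n_tokens, d_hidden))
Y = H @ V.T + 0.1 * rng.normal(size=(n_tokens, n_axes))  # axis values

# Ridge adapter, closed form: W = (H^T H + lam*I)^{-1} H^T Y.
lam = 1.0
W = np.linalg.solve(H.T @ H + lam * np.eye(d_hidden), H.T @ Y)

def read_axes(h):
    """Predict brain-axis coordinates from a hidden state."""
    return h @ W

def steer(h, k, alpha):
    """h'_t = h_t + alpha * v_k: push the state along axis k."""
    return h + alpha * V[k]

h = H[0]
k, alpha = 15, 3.0            # e.g. a frequency-linked axis
before = read_axes(h)[k]
after = read_axes(steer(h, k, alpha))[k]
print(f"axis {k}: {before:.2f} -> {after:.2f}")
```

Because the adapter is linear and the model is frozen, steering shifts the axis readout by roughly α per unit vector, which is the property exploited for controllable generation.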

5. Stability, Robustness, and Cross-Validation

A key technical property of brain-grounded axes is their reproducibility and independence from data-specific or model-specific features:

  • Stability to feature sets: Axes constructed with or without particular linguistic features, or from alternative embedding spaces, remain highly correlated (e.g., |r| up to 0.95 vs. base atlas), with only explicit POS-driven axes collapsing when relevant features are dropped (Andric, 22 Dec 2025).
  • Cross-subject generalizability: Replication of axes across subject splits, with 12/20 axes showing |r|>0.05 between odd and even groups, supports the external validity of the axes and their population-level relevance.
  • Robustness to leakage and artifacts: Use of controls such as wPLI (weighted phase-lag index, leakage-insensitive) demonstrates that some axes (e.g., frequency-linked) persist beyond linear artifacts, whereas others attenuate, indicating the specificity of certain brain-grounded axes to underlying neural mechanisms.
  • Physiological anchoring: Exploratory fMRI analyses connect brain-derived axes with BOLD-level correlates (e.g., embedding-change and log-frequency predictors), but with reliability only at the population level, highlighting the need for further multimodal validation.
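The cross-subject stability checks above reduce to matching two independently estimated axis sets up to permutation and sign, since ICA returns components in arbitrary order and with arbitrary sign. A minimal greedy matcher on synthetic axis scores (illustrative only, not the cited analysis code):

```python
import numpy as np

rng = np.random.default_rng(3)
n_axes, n_words = 6, 300

# Two "halves": the same underlying axis scores re-estimated with
# noise, a random permutation, and random sign flips -- mimicking
# ICA's order/sign ambiguity across subject splits.
true_scores = rng.normal(size=(n_axes, n_words))
perm = rng.permutation(n_axes)
signs = rng.choice([-1.0, 1.0], size=n_axes)
half_odd = true_scores + 0.3 * rng.normal(size=(n_axes, n_words))
half_even = (signs[:, None] * true_scores[perm]
             + 0.3 * rng.normal(size=(n_axes, n_words)))

def match_axes(A, B):
    """Greedy one-to-one matching by |Pearson r| between axis rows."""
    Az = (A - A.mean(1, keepdims=True)) / A.std(1, keepdims=True)
    Bz = (B - B.mean(1, keepdims=True)) / B.std(1, keepdims=True)
    C = np.abs(Az @ Bz.T) / A.shape[1]      # |r| matrix
    pairs, used = [], set()
    for i in np.argsort(-C.max(axis=1)):    # strongest matches first
        j = max((j for j in range(B.shape[0]) if j not in used),
                key=lambda j: C[i, j])
        used.add(j)
        pairs.append((i, j, C[i, j]))
    return pairs

pairs = match_axes(half_odd, half_even)
n_stable = sum(r > 0.8 for _, _, r in pairs)
print(n_stable, "of", n_axes, "axes replicate with |r| > 0.8")
```

Taking absolute correlations handles the sign ambiguity; the greedy assignment handles the permutation.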

6. Impact and Future Directions

The extraction and use of brain-grounded axes advances representation learning, neural modeling, and interpretable AI by providing:

  • A neurophysiologically justified, externally anchored coordinate system for the analysis and control of both biological and artificial agents.
  • New methodologies for mapping representational manifolds in LLMs or ANN-based models, circumventing the limitations of purely text-derived, label-driven, or behaviorally anchored supervision.
  • Quantitative metrics for the alignment of artificial models with biological systems, particularly via the geometric and topological structure of emergent coding axes.
  • Theoretical frameworks (e.g., algebraic independence, finite affine geometry, hyperspherical embedding) for autonomous development of modular, discriminative representations matching cognitive modules or qualia types.

Open questions center on the extensibility of brain-grounded axis frameworks across cognitive domains, improved multimodal anchoring (combining MEG, fMRI, and EEG), and scaling to unsupervised or online settings. The approach provides a foundation for interpretable, controllable, and brain-consistent representation learning in next-generation neuroscience and AI systems (Andric, 22 Dec 2025, McNeal et al., 27 Sep 2025, Dai et al., 16 Jul 2025, Dabaghian, 2021, Ohmura et al., 2023, Tozzi et al., 2015).
