NeuroSSM: Bridging Brain Dynamics and ML

Updated 10 January 2026
  • NeuroSSM is a state-space modeling framework that bridges biological dynamics and machine learning for brain signal analysis.
  • It employs end-to-end differentiable multiscale modules and diagonal kernels to capture emergent neural phenomena such as time cells and ramping activity.
  • Its architecture demonstrates superior performance in fMRI and neural circuit analysis, offering both computational efficiency and biological interpretability.

NeuroSSM refers to a suite of state-space modeling frameworks explicitly developed to bridge the gap between biologically motivated dynamical systems and advanced machine learning models for brain signal analysis and cognitive representation. Recent research, with particular focus on end-to-end differentiable approaches for fMRI and neural circuit analysis, leverages diagonal, content-adaptive, and multiscale state-space modules to deliver efficient, interpretable, and biologically grounded models. The following sections explicate the mathematical foundations, architectural principles, emergent phenomena, task performance, comparative advantages, and broader implications of NeuroSSM frameworks, drawing on the latest arXiv literature (Genç et al., 3 Jan 2026, Lu et al., 18 Jul 2025, Park et al., 7 Feb 2025).

1. Mathematical and Biophysical Foundations

NeuroSSM models are dynamical systems governed by first-order (or higher-order) ordinary differential equations (ODEs) in continuous or discrete time, where network state evolution parallels the biophysical processes of neural tissue. The canonical form—used both in cognitive timing tasks and fMRI analysis—is:

$$\begin{aligned} h'(t) &= A h(t) + B x(t) \\ y(t) &= C h(t) + D x(t) \end{aligned}$$

Here, $h(t) \in \mathbb{R}^n$ denotes the vector of hidden states (interpretable as membrane potentials or subcellular gating variables), $x(t) \in \mathbb{R}^m$ is the input (e.g., external stimuli or BOLD signals), and the matrices $A, B, C, D$ are learned parameters. The transition matrix $A$ is diagonalized (e.g., via HiPPO procedures) to promote interpretable, decoupled dynamical modes. The real parts of the eigenvalues govern decay (“leak” or dissipation), while the imaginary parts introduce oscillatory behavior, directly mirroring neuron-intrinsic currents.
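
Concretely, writing each eigenvalue as $\lambda_i = \alpha_i + i\omega_i$, the homogeneous solution of a diagonal mode factors into a decay envelope times a rotation:

$$e^{\lambda_i t} = e^{\alpha_i t}\left(\cos(\omega_i t) + i\,\sin(\omega_i t)\right),$$

so $\alpha_i < 0$ yields a leaky integrator and $\omega_i \neq 0$ an intrinsic oscillation. This is the standard decomposition of a linear mode rather than anything specific to the cited papers.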

For multiscale and discrete-time modeling, inputs are aggregated over varying timescales $\tau_k$, and the core SSM is discretized as:

$$h_{t+1} = \Phi(\Delta t)\, h_t + B_d\, x_{t+1}, \qquad \Phi(\Delta t) = \exp(A \Delta t)$$
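
The source leaves $B_d$ implicit; under the standard zero-order-hold assumption (input held constant over each step) and invertible $A$, it has the closed form

$$B_d = A^{-1}\left(\Phi(\Delta t) - I\right)B,$$

which for diagonal $A$ reduces to scaling the rows of $B$ by $(e^{\lambda_i \Delta t} - 1)/\lambda_i$.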

Diagonal and content-adaptive SSM kernels allow for scalable parallelization and efficient gradient flow over long temporal sequences (Genç et al., 3 Jan 2026, Lu et al., 18 Jul 2025).
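
As a concrete illustration of these recurrences, the following NumPy sketch simulates a diagonal SSM with the zero-order-hold discretization above. It is a minimal example under the stated equations, not code from the cited papers; $D$ is taken to be zero and the dense $B$, $C$, and all parameter values are placeholders.

```python
import numpy as np

def run_diagonal_ssm(lam, B, C, x, dt):
    """Simulate h'(t) = A h(t) + B x(t), y = C h, with diagonal A = diag(lam).

    lam: (n,) complex eigenvalues; B: (n, m); C: (p, n); x: (T, m) inputs.
    Returns y: (T, p). Illustrative sketch only (D = 0).
    """
    phi = np.exp(lam * dt)                    # Phi(dt) = exp(A dt), elementwise
    Bd = ((phi - 1.0) / lam)[:, None] * B     # ZOH input matrix for diagonal A
    h = np.zeros(lam.shape[0], dtype=complex)
    ys = []
    for xt in x:                              # sequential scan (parallel scans
        h = phi * h + Bd @ xt                 # are used in practice)
        ys.append((C @ h).real)               # project back to the real axis
    return np.stack(ys)

# Leaky oscillators: negative real part = decay, imaginary part = rotation.
rng = np.random.default_rng(0)
lam = -0.1 + 1j * np.linspace(0.5, 4.0, 8)    # 8 decaying oscillatory modes
B = rng.standard_normal((8, 1))
C = rng.standard_normal((1, 8))
x = np.zeros((100, 1)); x[0] = 1.0            # impulse input
y = run_diagonal_ssm(lam, B, C, x, dt=0.1)    # (100, 1) impulse response
```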

2. Emergence of Neural and Cognitive Phenomena

NeuroSSM frameworks produce a spectrum of emergent computational motifs long studied in systems neuroscience:

  • Time cells: When $|\exp(\lambda_i \Delta t)| \approx 1$, purely imaginary eigenvalues induce rotational dynamics in the complex plane, $h_i(t+1) = e^{i\theta_i} h_i(t)$, leading to sequentially activated states that tile delay intervals. Projected onto the real axis, these are formal analogues of hippocampal "time cells" (see the sketch after this list).
  • Ramping and traveling waves: For phases $\theta_i \approx \pi$, monotonic ramping arises; for $\theta_i$ corresponding to multiple periods, multi-peaked, oscillatory, or traveling-wave activity emerges in the hidden state. All such behaviors result from the same diagonal oscillator principle in the SSM (Lu et al., 18 Jul 2025).
  • Scale invariance: Adjusting rotation speeds (phases) in response to delay periods yields stretching or compression of neural fields, replicating scale-invariant timing phenomena in cortical and hippocampal circuits (Lu et al., 18 Jul 2025).
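
To make the time-cell mechanism concrete, the toy script below rolls purely rotational modes forward from a common impulse and builds one readout per target time; each readout then peaks at its own moment, tiling the delay interval. This is a hedged illustration of the diagonal-oscillator principle only: the mode frequencies, the phase-aligned readout construction, and all names are my assumptions, not the papers' code.

```python
import numpy as np

# Purely imaginary eigenvalues give unit-modulus rotations
# h_i(t+1) = e^{i theta_i} h_i(t): no decay, only phase advance.
dt, T, n_modes = 0.05, 200, 32
w = 2 * np.pi * np.arange(1, n_modes + 1) / 10.0   # distinct rotation speeds
phi = np.exp(1j * w * dt)                          # |phi_i| == 1: pure rotation

h = np.ones(n_modes, dtype=complex)                # common impulse at delay onset
H = np.empty((T, n_modes), dtype=complex)
for t in range(T):
    H[t] = h
    h = phi * h

# One "time cell" per target time t*: align the readout with the modes'
# phases at t*, so y(t) = Re sum_k h_k(t) conj(h_k(t*)) peaks at t = t*.
targets = [30, 80, 130, 180]
cells = np.stack([(H @ np.conj(H[t_star])).real for t_star in targets], axis=1)
assert list(cells.argmax(axis=0)) == targets       # sequential peaks tile the delay
```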

The architectural mapping between hidden-state dimensions and neuronal variables, such as identifying the real part of $h_i$ with membrane potential, enables direct correspondence between mechanistic neural models and abstract SSM dynamics.

3. NeuroSSM Architectures for Brain Signal Analysis

Recent advances showcase NeuroSSM models configured as end-to-end pipelines for neuroimaging and neural data, with both multiscale and differencing mechanisms:

  • Multiscale State-Space Backbone: Parallel SSMs operate at different timescales $\tau_k$, permitting modeling of both rapid transients and slower baseline fluctuations in signals such as BOLD fMRI (Genç et al., 3 Jan 2026). Each scale processes either raw or differenced input segments, with learned diagonal transition and input matrices dynamically modulated via small multilayer perceptrons conditioned on the input (a minimal sketch of the multiscale fusion follows this list).
  • Parallel Differencing Branch: By explicitly feeding first-order temporal differences of the input representations at each scale into SSM modules, sensitivity to abrupt changes or transient events is enhanced. Fusion of outputs from the original and differenced paths yields context-aware representations while preserving linear-time complexity.
  • Across-Scale Fusion and Residual Normalization: The outputs from different timescales are summed and post-processed with layer normalization and nonlinearity (e.g., GeLU), before being pooled and passed to linear classification heads for task-specific predictions (Genç et al., 3 Jan 2026).
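
The sketch below assembles these three pieces: raw and first-difference branches, one diagonal SSM scan per timescale, and summation followed by normalization, GeLU, and pooling. It is a simplified reading of the description above, not the reference implementation; the function names, the one-effective-$\Delta t$-per-scale choice, and the identity input matrix are assumptions, and the content-adaptive MLP modulation is omitted for brevity.

```python
import numpy as np

def diag_ssm_scan(x, lam, dt):
    """Scan a diagonal SSM over x: (T, d); lam: (d,) complex. Returns Re(h)."""
    phi = np.exp(lam * dt)                 # Phi(dt) = exp(A dt) for diagonal A
    bd = (phi - 1.0) / lam                 # ZOH input gain (B = I for simplicity)
    h = np.zeros_like(lam)
    out = np.empty(x.shape)
    for t in range(x.shape[0]):
        h = phi * h + bd * x[t]
        out[t] = h.real
    return out

def gelu(z):
    return 0.5 * z * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (z + 0.044715 * z**3)))

def multiscale_forward(x, lam, scales=(1.0, 2.0, 4.0)):
    """Sum raw and first-difference SSM branches across timescales, then
    layer-normalize, apply GeLU, and pool, mirroring the fusion described above."""
    dx = np.zeros_like(x)
    dx[1:] = x[1:] - x[:-1]                # differencing branch: transient events
    fused = np.zeros_like(x)
    for tau in scales:                     # one effective step size per timescale
        fused += diag_ssm_scan(x, lam, dt=tau)
        fused += diag_ssm_scan(dx, lam, dt=tau)
    mu = fused.mean(axis=-1, keepdims=True)
    sd = fused.std(axis=-1, keepdims=True)
    z = gelu((fused - mu) / (sd + 1e-5))   # layer norm + GeLU
    return z.mean(axis=0)                  # pooled features for a linear head

# Example: 16 ROI time series of length 120 -> 16 pooled features.
x = np.random.default_rng(1).standard_normal((120, 16))
lam = -0.2 + 1j * np.linspace(0.3, 3.0, 16)
features = multiscale_forward(x, lam)
```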

In reinforcement learning applications, SSM outputs pass through a biologically motivated spiking nonlinearity (thresholding $\mathrm{Re}\,h_i$) and feed actor-critic modules for policy and value estimation, optimized via A3C protocols (Lu et al., 18 Jul 2025), as sketched below.
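
A minimal sketch of that readout path follows. Only the thresholding of $\mathrm{Re}\,h_i$ and the A3C actor-critic setup come from the source; the threshold `theta`, the weight shapes, and the softmax policy head are illustrative assumptions.

```python
import numpy as np

def spiking_readout(h, theta=0.0):
    """Spiking nonlinearity: a unit emits output only when the real part
    of its complex state crosses the threshold theta."""
    v = h.real                              # Re(h_i) as membrane potential
    return np.where(v > theta, v, 0.0)      # thresholded, rectified activity

def actor_critic(h, W_pi, w_v):
    """Hypothetical actor-critic heads over the spiking features (A3C-style)."""
    s = spiking_readout(h)
    logits = W_pi @ s
    policy = np.exp(logits - logits.max())
    policy /= policy.sum()                  # softmax over actions (actor)
    value = float(w_v @ s)                  # scalar state-value estimate (critic)
    return policy, value
```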

4. Task Performance and Computational Efficiency

NeuroSSM demonstrates superior or state-of-the-art performance across a diverse range of brain-signal decoding and cognitive benchmarking tasks:

| Dataset | Task | NeuroSSM Acc. / F1 / AUC (%) | Best baseline Acc. / F1 / AUC (%) |
|---|---|---|---|
| HCP-Rest | Gender classification | 81.8 ± 4.0 / 74.7 ± 8.6 / 89.8 ± 2.0 | BolT: 78.3 / 70.2 / 87.5 |
| HCP-Task | Cognitive tasks (7-class) | 87.8 ± 2.7 / 87.2 ± 2.5 / 99.1 ± 0.3 | BolT: 83.5 / 82.3 / 98.9 |
| PPMI | Parkinson’s vs. control | 86.4 ± 3.6 / 92.6 ± 2.2 / 67.5 ± 17.8 | LSTM: 82.8 / 89.9 / 61.7 |

On both clinical and non-clinical fMRI datasets, NeuroSSM architectures outperform LSTM, CNN, sparse-attention transformer, and alternative state-space baselines in accuracy, F1, and area-under-curve metrics (Genç et al., 3 Jan 2026). Ablation studies show that the multiscale and differencing branches each contribute independently to predictive performance.

Computationally, NeuroSSM maintains $O(NT)$ scaling in both time and memory (with constants dependent on the number of scales and the state dimension), enabling efficient deployment on high-volume datasets. Empirical throughput on modern GPUs surpasses windowed transformer models at similar model capacity (Genç et al., 3 Jan 2026).

5. Theoretical Significance and Biological Interpretability

NeuroSSM structures serve as a mechanistic bridge between biophysically detailed ODE models of neurons and abstract representations necessary for high-level cognitive inference. Specifically:

  • The core SSM differential operators echo Hodgkin–Huxley–type dynamics for membrane potentials and gating.
  • Diagonalization enables unit-wise, interpretable oscillatory and leaky integrator motifs, aligning with observed time/ramping/oscillatory cells across multiple experimental paradigms (Lu et al., 18 Jul 2025).
  • The ability to reproduce time cells, ramping cells, traveling waves, and scale-invariant fields from a single mathematical principle unifies disparate phenomenology from experimental neuroscience within a computationally tractable and scalable architecture.

A plausible implication is that NeuroSSM offers a formalism for capturing both microcircuit-level principles and system-level cognitive outcomes, enabling exploration of temporal coding, working memory, and event abstraction within a single framework.

6. Related Frameworks and Limitations

While foundational models such as stochastic optimal control-based SSMs (Park et al., 7 Feb 2025) exploit similar latent dynamical systems and amortized inference, they focus on variational control and transfer-learning properties rather than on directly modeling emergent cell phenomena. Other SSM architectures, such as SHaRe-SSM for event-based long-sequence modeling and Cortical-SSM for EEG/ECoG decoding, share the diagonal/oscillatory state-space principle and have demonstrated advantages in energy efficiency, parallel computation ($O(\log L)$ scan depth), and interpretability across modalities (Agrawal et al., 16 Oct 2025, Suzuki et al., 17 Oct 2025).

A key limitation of current NeuroSSM instantiations is the reliance on ROI-level parcellation and layer normalization in fMRI pipelines, together with the lack of explicit subject- or session-invariance mechanisms for broader deployment. Continued work is required to address signal-to-noise heterogeneity, nonstationarity, and domain adaptation across recording platforms.

7. Impact and Outlook

NeuroSSM frameworks have validated the utility of multiscale, diagonal-kernel state-space architectures for context-aware and interpretable modeling of complex neural and behavioral phenomena. The demonstrated efficiency and alignment with biophysical processes suggest wide applicability in both systems neuroscience and clinical neuroimaging. Ongoing research aims to integrate more realistic neuronal nonlinearities, scale to finer spatial/temporal resolutions, and generalize across recording modalities and clinical populations (Genç et al., 3 Jan 2026, Lu et al., 18 Jul 2025, Park et al., 7 Feb 2025).

A plausible implication is that these developments will facilitate unified, mechanistically grounded models for understanding temporally-extended cognition and neural population coding, while enabling practical advances in clinical diagnostics and brain–machine interfaces.
