State-Space Models (SSMs) Overview

Updated 28 November 2025
  • State-Space Models (SSMs) are mathematical frameworks that represent dynamical systems using latent states, noise, and system dynamics, essential in control theory, time series, and machine learning.
  • Recent advances include deep structured SSMs with principled parameterizations, selective gating mechanisms, and compute-in-memory neuromorphic architectures for efficient long-sequence processing.
  • Robust inference methods, from Kalman filtering in linear-Gaussian cases to particle and variational approaches in nonlinear models, enable accurate state estimation and dynamic prediction.

State-space models (SSMs) are a general mathematical formalism for representing, analyzing, and learning dynamical systems observed through time, in which a latent (unobserved) state evolves under known or learned dynamics and produces observable outputs, often with both process and observation noise. SSMs unify a diverse range of models across control theory, statistics, time series analysis, and machine learning. Recent advances include deep structured SSMs for long-sequence processing, efficient algorithmic parameterizations supporting stable training, probabilistic SSMs leveraging deep inference schemes, and compute-in-memory neuromorphic architectures (Zhang et al., 17 Nov 2025, Hargreaves et al., 29 May 2025, Massai et al., 31 Mar 2025, Zubić et al., 17 Dec 2024).

1. Mathematical Foundations of State-Space Models

The standard continuous-time linear SSM is formulated as

$$\frac{dx(t)}{dt} = A x(t) + B u(t), \qquad y(t) = C x(t) + D u(t),$$

with $x(t) \in \mathbb{R}^N$ (state), $u(t) \in \mathbb{R}^M$ (input), $y(t) \in \mathbb{R}^P$ (output), and system matrices $A, B, C, D$ of appropriate size (Zhang et al., 17 Nov 2025, Lv et al., 14 Mar 2025). Discretization with interval $\Delta$ yields

$$x_{k+1} = \bar{A} x_k + \bar{B} u_k, \qquad y_k = C x_k + D u_k,$$

where, e.g., for zero-order hold,

$$\bar{A} = \exp(A\Delta), \qquad \bar{B} = A^{-1}\left(\exp(A\Delta) - I\right) B.$$
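
A minimal NumPy/SciPy sketch of this discretization (the name `discretize_zoh` is ours; the closed form above requires $A$ to be invertible):

```python
import numpy as np
from scipy.linalg import expm

def discretize_zoh(A, B, dt):
    """Zero-order-hold discretization of dx/dt = A x + B u.

    Returns A_bar = exp(A*dt) and B_bar = A^{-1} (exp(A*dt) - I) B,
    assuming A is invertible (a linear solve replaces the explicit inverse).
    """
    A_bar = expm(A * dt)
    B_bar = np.linalg.solve(A, A_bar - np.eye(A.shape[0])) @ B
    return A_bar, B_bar
```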

The SSM formalism extends beyond linear-Gaussian cases to nonlinear, non-Gaussian transitions and emissions:

$$x_t \sim p_\theta(x_t \mid x_{t-1}), \qquad y_t \sim p_\theta(y_t \mid x_t)$$

(Hargreaves et al., 29 May 2025, Lin et al., 15 Dec 2024).

Unrolling the discrete recurrence from $x_0 = 0$ yields the convolutional form

$$y_k = \sum_{n=0}^{k} C \bar{A}^n \bar{B}\, u_{k-n},$$

which is central to modern SSM layers in deep learning (Lv et al., 14 Mar 2025).
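
As a minimal illustration (single-input, single-output case; function names are ours, and practical layers avoid the naive power iteration used here to materialize the kernel):

```python
import numpy as np

def ssm_kernel(A_bar, B_bar, C, L):
    """Materialize K[n] = C @ A_bar^n @ B_bar for n = 0..L-1 (naive power iteration)."""
    K = np.empty(L)
    x = B_bar.copy()
    for n in range(L):
        K[n] = float(C @ x)
        x = A_bar @ x
    return K

def ssm_apply(K, u):
    """Causal convolution y_k = sum_n K[n] u[k-n] via FFT; zero-padding to 2L
    avoids circular wraparound and gives O(L log L) cost."""
    L = len(u)
    Y = np.fft.rfft(K, 2 * L) * np.fft.rfft(u, 2 * L)
    return np.fft.irfft(Y, 2 * L)[:L]
```

The FFT-based application is the source of the favorable sequence-length scaling discussed in Section 5.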

2. Parametric and Structural Advances

Structured and Selective SSMs: SSMs have been extended well beyond classical forms via principled initialization and structured parameterizations.

  • HiPPO and S4 Family: The HiPPO framework chooses $A, B$ so that the state projects the recent input history onto an orthogonal basis (e.g., scaled Legendre polynomials), as in S4; a construction sketch follows this list. This enables uniform memory coverage over long sequences and stable gradient propagation (Gu et al., 2022, Babaei et al., 13 May 2025).
  • SaFARi: Further generalizes the basis used in the state projection to arbitrary frames/bases, retaining tractable linear ODEs for the coefficients (Babaei et al., 13 May 2025).
  • Diagonal-Plus-Low-Rank (DPLR) Parameterizations: SSMs with $\bar{A} = \Lambda - P Q^T$ combine a diagonal structure for fast scanning with low-rank corrections for expressivity (Lv et al., 14 Mar 2025).
  • Selective SSMs (e.g., Mamba): Make $A$ and related matrices input-dependent via gating, allowing the state update to integrate multiplicative, content-dependent interactions and enhancing expressive power (Cirone et al., 29 Feb 2024, Lv et al., 14 Mar 2025).
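
As referenced in the HiPPO bullet above, the LegS transition matrix has a simple closed form; a minimal sketch (the name `make_hippo` is ours):

```python
import numpy as np

def make_hippo(N):
    """HiPPO-LegS transition matrix: A[n, k] = -sqrt(2n+1)*sqrt(2k+1) for n > k,
    -(n + 1) on the diagonal, and 0 above the diagonal."""
    p = np.sqrt(1 + 2 * np.arange(N))      # sqrt(2n + 1)
    A = np.tril(p[:, None] * p[None, :])   # lower triangle; diagonal entries are 2n+1
    A -= np.diag(np.arange(N))             # diagonal becomes (2n+1) - n = n + 1
    return -A
```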

Stability and Robustness Guarantees: Recent work provides free parametrizations ensuring prescribed $\mathcal{L}_2$ input-output gain bounds, enabling unconstrained optimization with guaranteed stability (the L2RU block) (Massai et al., 31 Mar 2025).
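
The pattern behind such free parametrizations can be shown with a toy example; this is not the L2RU construction itself, only an illustration of mapping unconstrained parameters onto a guaranteed-stable set:

```python
import numpy as np

def stable_diag_transition(theta):
    """Map unconstrained theta in R^N to diagonal transition entries in (-1, 1),
    so x_{k+1} = diag(a) x_k + ... is stable for every theta and gradient-based
    optimization can never leave the stable set."""
    return np.tanh(theta)
```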

3. Inference and Learning Algorithms

Linear–Gaussian Cases:

Kalman filtering and smoothing provide closed-form, recursive solutions for filtering, prediction, and likelihood evaluation in linear-Gaussian SSMs, and these recursions underpin efficient maximum-likelihood estimation and state inference in practice (Hargreaves et al., 29 May 2025, Elvira et al., 2022, Auger-Méthé et al., 2020).
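
A textbook predict/update sketch for the linear-Gaussian model (not the implementation of any cited package; noise covariances $Q$, $R$ and the prior $(m_0, P_0)$ are assumed given):

```python
import numpy as np

def kalman_filter(A, C, Q, R, m0, P0, ys):
    """Recursive filtering for x_{k+1} = A x_k + w_k, y_k = C x_k + v_k,
    with w_k ~ N(0, Q) and v_k ~ N(0, R)."""
    m, P = m0, P0
    filtered = []
    for y in ys:
        # Predict: push the previous posterior through the dynamics.
        m, P = A @ m, A @ P @ A.T + Q
        # Update: condition on the new observation y.
        S = C @ P @ C.T + R                  # innovation covariance
        K = P @ C.T @ np.linalg.inv(S)       # Kalman gain
        m = m + K @ (y - C @ m)
        P = (np.eye(len(m)) - K @ C) @ P
        filtered.append((m, P))
    return filtered
```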

Nonlinear/Non-Gaussian and Deep SSMs:

  • Sequential Monte Carlo (Particle Filtering): SSMProblems.jl's GeneralisedFilters.jl and other frameworks support particle filters, bootstrap filters, and hybrid Rao-Blackwellized algorithms for approximate filtering/smoothing when models are nonlinear or non-Gaussian; a minimal bootstrap-filter sketch follows this list (Hargreaves et al., 29 May 2025, Ryder et al., 2018).
  • Variational Inference: Probabilistic Recurrent SSMs (PR-SSM) and VAE-based neural SSMs use variational methods (ELBO maximization, black-box autoregressive flows, amortized recognition models) for scalable learning in high-dimensional or deep latent SSMs (Doerr et al., 2018, Lin et al., 15 Dec 2024).
  • Maximum Approximate Likelihood: For continuous-time SSMs with nonlinear/non-Gaussian structure, fine discretization converts the state process to a hidden Markov model, enabling likelihood maximization and state decoding via adapted HMM algorithms (Mews et al., 2020).
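
A minimal bootstrap particle filter, as referenced in the first bullet above (a generic sketch, not any package's API; the toy model in the usage example is invented for illustration):

```python
import numpy as np

def bootstrap_pf(ys, transition_sample, obs_loglik, init_sample, n, rng):
    """Bootstrap particle filter: propagate particles through the transition,
    weight by the observation likelihood, resample, and accumulate the
    log-likelihood estimate (log-sum-exp for numerical stability)."""
    x = init_sample(n, rng)
    loglik = 0.0
    for y in ys:
        x = transition_sample(x, rng)            # propagate through p(x_t | x_{t-1})
        logw = obs_loglik(y, x)                  # per-particle log p(y_t | x_t)
        m = logw.max()
        w = np.exp(logw - m)
        loglik += m + np.log(w.mean())
        x = x[rng.choice(n, size=n, p=w / w.sum())]  # multinomial resampling
    return loglik

# Toy nonlinear model (invented): x_t = 0.9 x_{t-1} + N(0, 0.5), y_t = x_t^2/5 + N(0, 1).
rng = np.random.default_rng(0)
ys = rng.normal(size=100)
ll = bootstrap_pf(
    ys,
    transition_sample=lambda x, r: 0.9 * x + r.normal(0.0, 0.5, x.shape),
    obs_loglik=lambda y, x: -0.5 * (y - x**2 / 5.0) ** 2,  # Gaussian log-density up to a constant
    init_sample=lambda n, r: r.normal(0.0, 1.0, n),
    n=512, rng=rng,
)
```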

Software Ecosystem: SSMProblems.jl and GeneralisedFilters.jl (part of the Turing.jl ecosystem) enable highly modular, GPU-accelerated experimentation with arbitrary SSMs through consistent interfaces that span exact and approximate inference (Hargreaves et al., 29 May 2025).

4. Memory, Expressivity, and Computational Constraints

Memory Compression and Expressivity:

Selective SSMs adopt gating mechanisms that perform dynamic, element-wise filtering of the state update. Theoretical results link memory efficiency and information retention via mutual-information and rate-distortion frameworks, with formalized bounds on achievable compression at a given distortion (Bhat, 4 Oct 2024). Contraction-mapping results guarantee stability and ergodicity under suitable Lipschitz and spectral-norm conditions.
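
A sequential reference recurrence for an input-dependent (selective) diagonal SSM, in the spirit of Mamba's selective scan; function names are ours, the diagonal of $A$ is assumed nonzero, and production implementations replace this loop with a parallel scan:

```python
import numpy as np

def selective_scan(us, A_diag, delta_fn, B_fn, C_fn):
    """Sequential reference for x_t = exp(dt_t * A) x_{t-1} + B_bar_t u_t, y_t = C_t x_t,
    where dt_t, B_t, C_t all depend on the current input u_t (the 'selection')."""
    x = np.zeros_like(A_diag)
    ys = []
    for u in us:
        dt = delta_fn(u)                   # input-dependent step size (the gate)
        a = np.exp(dt * A_diag)            # diagonal ZOH: exp(dt * A)
        b = (a - 1.0) / A_diag * B_fn(u)   # diagonal ZOH for B (A_diag nonzero)
        x = a * x + b * u
        ys.append(float(C_fn(u) @ x))
    return np.array(ys)
```

In Mamba, $\Delta_t$, $B_t$, and $C_t$ are produced by (softplus-wrapped) linear projections of the input, which slot directly into `delta_fn`, `B_fn`, and `C_fn` here.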

Expressivity Limitations:

While SSMs "look" recurrent, theoretical circuit-complexity analyses reveal that standard (fixed-depth, diagonal or constant) SSM layers are restricted to $\mathsf{TC}^0$ expressivity, thus incapable of modeling $\mathsf{NC}^1$-hard state-tracking tasks (e.g., general permutation composition) without input-dependent transitions or deep stacking (Merrill et al., 12 Apr 2024). Selective SSMs (e.g., Mamba) can provably project inputs onto high-order path signatures, capturing nonlinear interactions, when their gates are input-dependent, as formalized via Linear Controlled Differential Equations (CDEs) and rough path theory (Cirone et al., 29 Feb 2024).
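
The canonical hard instance is concrete: maintain the running composition of a stream of permutations of five elements ($S_5$), which a true input-dependent recurrence tracks trivially. A sketch of the task itself (composition convention and names are ours):

```python
from itertools import permutations
import random

def compose_stream(stream):
    """Running composition of permutations of {0,...,4}: new[i] = state[p[i]]."""
    state = tuple(range(5))
    out = []
    for p in stream:
        state = tuple(state[i] for i in p)
        out.append(state)
    return out

# Example: a random stream of S_5 elements and its running composition.
perms = list(permutations(range(5)))
stream = [random.choice(perms) for _ in range(8)]
print(compose_stream(stream))
```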

5. Hardware Implementations and Efficiency

Compute-in-Memory (CIM) Realization:

Recent developments achieve fully asynchronous, real-time SSM inference directly in memristive CIM hardware. Here, passive device dynamics instantiate the state decay, and hardware crossbars implement the input injection, realizing the SSM recurrence "in physics." Device-level calibration is required, but the design attains high energy efficiency and ultra-low-latency operation for streaming event-based data, with empirical accuracy competitive with or superior to SNNs and CNNs, as summarized below (Zhang et al., 17 Nov 2025).

Benchmark         Accuracy (SSM CIM)   Parameters (M)   Energy (mW)
SHD (audio)       95.7%                0.3
SSC (audio)       84.7%                0.6
DVS128 Gesture    97.3%                5.0              ≈34
DVS128 Lips       63.5%                5.7

Further reductions in hardware complexity are achieved by sharing a single decay constant per SSM block; the co-design approach allows both high accuracy and aggressive energy optimization (Zhang et al., 17 Nov 2025).

Computational Complexity:

Modern SSM implementations using diagonal or structured parameterizations and FFT-based convolutional recurrences achieve per-layer complexity $O(N(L + \log L))$, rivaling or beating attention-based architectures in scaling and memory, particularly for long sequence lengths (Lv et al., 14 Mar 2025, Zhang et al., 2023).

6. Applications and Performance in Diverse Domains

SSMs and their deep generalizations have established state-of-the-art results in sequence modeling tasks across domains:

  • Time series forecasting and classification: SpaceTime’s SSM layers outperform prior methods in long-horizon forecasting and match or exceed predictive accuracy for ECG, audio, and Informer benchmarks, demonstrating architectural expressivity and scalable training/inference (Zhang et al., 2023).
  • Computer vision and multimodal data: Event-driven SSMs, including CIM hardware realizations, process asynchronous event-camera data and degrade substantially less than transformer- or RNN-based architectures when evaluated at higher inference frequencies (Zubić et al., 23 Feb 2024, Zhang et al., 17 Nov 2025).
  • Image, video, and multivariate sequence modeling: Graph-generating SSMs dynamically construct sparse propagation graphs reflecting latent feature relationships, achieving state-of-the-art results on ImageNet, optical flow, and multivariate time-series datasets, outperforming both previous fixed-scan SSMs and transformers (Zubić et al., 17 Dec 2024).

7. Limitations, Open Issues, and Future Directions

Known Limitations:

  • Estimation and identifiability: Even the simplest linear-Gaussian SSMs can be non-identifiable or ill-posed in high-measurement-noise regimes; diagnostic tools, informative priors, and model constraints are essential (Auger-Méthé et al., 2015).
  • Expressivity boundaries: Standard SSMs cannot, without depth-scaling or input-dependent recurrences, solve general-purpose state-tracking tasks outside $\mathsf{TC}^0$ (Merrill et al., 12 Apr 2024).

Extensions and Active Areas:

  • Robust, stability-guaranteed training via parametrizations (L2RU, spectral constraints) (Massai et al., 31 Mar 2025).
  • Advanced regularization and initialization based on data-dependent generalization bounds, yielding improved performance and stable optimization (Liu et al., 4 May 2024).
  • Modular, composable SSM software layers enable the rapid evaluation and deployment of exact and approximate inference across heterogeneous dynamical systems (Hargreaves et al., 29 May 2025).
  • Expansion to graph-based propagation (GG-SSMs), higher-dimensional scanning, and multi-modal or irregularly-sampled data (Zubić et al., 17 Dec 2024).

Hardware and Neuromorphic Directions:

Continued co-design of SSM algorithmic simplification with device physics, targeting end-to-end, fully analog realization for energy-efficient, ultra-low-latency computation, remains an active topic (Zhang et al., 17 Nov 2025).

These developments position SSMs as a versatile, theoretically grounded, and practically efficient alternative foundation for emerging long-context, adaptive, and energy-sensitive sequence processing tasks across disciplines.
