Gaussian Process State-Space Models
- GPSSMs are a flexible nonparametric framework that models nonlinear dynamical systems by combining Gaussian processes with state-space representations.
- They leverage diverse inference methods—including variational, sampling, and EM approaches—to efficiently manage uncertainty and extract system dynamics.
- Recent extensions, such as normalizing flows and multi-resolution techniques, enhance expressivity and scalability in high-dimensional and nonstationary settings.
Gaussian Process State-Space Models (GPSSMs) are a flexible, nonparametric framework for learning, inference, and prediction in nonlinear dynamical systems. They combine the expressive power of Gaussian processes (GPs) for capturing unknown system dynamics with the probabilistic state-space modeling paradigm, enabling both uncertainty quantification and principled Bayesian reasoning across a range of real-world and synthetic application domains.
1. Model Structure and Foundational Concepts
A GPSSM models the evolution of a latent state $\mathbf{x}_t \in \mathbb{R}^{d_x}$ via an unknown, possibly nonlinear, Markov transition:

$$\mathbf{x}_{t+1} = f(\mathbf{x}_t) + \mathbf{w}_t, \qquad \mathbf{y}_t = g(\mathbf{x}_t) + \mathbf{v}_t,$$

where $f$ is modeled as a draw from a GP prior, $\mathbf{w}_t$ and $\mathbf{v}_t$ denote process and measurement noise, and $g$ is a measurement function, which may also be modeled as a GP. The GP prior,

$$f \sim \mathcal{GP}\big(m(\mathbf{x}),\, k(\mathbf{x}, \mathbf{x}')\big),$$

imposes smoothness (or other structural) assumptions via the choice of kernel $k$. This nonparametric prior enables GPSSMs to represent highly flexible, data-adaptive dynamics, as opposed to fixed parametric models.
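As a concrete illustration, the following minimal sketch (hypothetical settings: a one-dimensional state, identity emission $g(x)=x$, and a unit-variance SE kernel) simulates a GPSSM trajectory by sampling $f$ sequentially, conditioning each new function value on those already drawn:

```python
import numpy as np

def rbf(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix between 1-D input arrays."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def simulate_gpssm(T=100, q=0.01, r=0.1, seed=0):
    """Simulate x_{t+1} = f(x_t) + w_t, y_t = x_t + v_t with f ~ GP(0, k),
    drawing f sequentially by conditioning on previously sampled values."""
    rng = np.random.default_rng(seed)
    xs, fs, ys = [0.0], [], []
    for t in range(T):
        X = np.array(xs)                  # states visited so far
        x_cur = np.array([xs[-1]])
        if fs:                            # condition on earlier f-draws
            F = np.array(fs)
            K = rbf(X[:-1], X[:-1]) + 1e-8 * np.eye(len(fs))
            ks = rbf(x_cur, X[:-1])
            mu = (ks @ np.linalg.solve(K, F)).item()
            var = (rbf(x_cur, x_cur) - ks @ np.linalg.solve(K, ks.T)).item()
            f_t = rng.normal(mu, np.sqrt(max(var, 0.0)))
        else:                             # unconditional prior draw at x_0
            f_t = rng.normal(0.0, 1.0)
        fs.append(f_t)
        ys.append(xs[-1] + rng.normal(0.0, np.sqrt(r)))  # y_t = x_t + v_t
        xs.append(f_t + rng.normal(0.0, np.sqrt(q)))     # x_{t+1}
    return np.array(xs), np.array(ys)

states, observations = simulate_gpssm()
```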
Most classical and modern GPSSM formulations assume conditional independence between transition outputs. More recent work extends this to multi-output, output-dependent GPs using constructions such as the Linear Model of Coregionalization (LMC), enabling the model to capture dependencies between the channels of the transition vector (Lin et al., 2022).
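For concreteness, one common LMC construction (generic notation, not taken verbatim from the cited paper) mixes $Q$ shared latent GPs into correlated transition outputs:

$$f_i(\mathbf{x}) = \sum_{q=1}^{Q} a_{iq}\, g_q(\mathbf{x}), \qquad g_q \sim \mathcal{GP}\big(0,\, k_q(\mathbf{x}, \mathbf{x}')\big),$$

which induces the cross-covariance $\operatorname{Cov}\big(f_i(\mathbf{x}), f_j(\mathbf{x}')\big) = \sum_{q} a_{iq} a_{jq}\, k_q(\mathbf{x}, \mathbf{x}')$ between output channels.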
2. Inference Methods: Marginalization, Variational, and Sampling Approaches
Inference in GPSSMs is challenging due to the strong temporal dependencies between the latent states and the infinite-dimensional GP function space. The following taxonomy captures the main approaches:
- Marginalization over the GP function: For some formulations, the transition function $f$ can be marginalized analytically, yielding effective non-Markovian priors over state trajectories and reducing function inference to state trajectory sampling (Frigola et al., 2013, Särkkä, 2019).
- Stochastic Approximation EM: The particle stochastic approximation EM (PSAEM) framework combines a stochastic (noisy) EM auxiliary function with particle MCMC, specifically the particle Gibbs with ancestor sampling (PGAS), for efficient smoothing and M-step parameter updates (Frigola et al., 2013).
- Variational Inference: The dominant approach in recent work, using either mean-field or more complex structured posteriors over the latent states, GP function values, and (optionally) variational distributions over inducing points to manage computational cost (Frigola et al., 2014, Eleftheriadis et al., 2017, Ialongo et al., 2018, Fan et al., 2023).
- Inducing Points and Sparse GP Approximations: Inducing variables $\mathbf{u} = f(\mathbf{Z})$ at locations $\mathbf{Z}$ are introduced so that the GP posterior can be factored through finite-dimensional Gaussian distributions over $\mathbf{u}$; this makes inference and optimization scalable to long time series (Frigola et al., 2014, Ialongo et al., 2018, Lin et al., 2023). A minimal sketch of the resulting predictive computation appears after the table below.
- Particle MCMC and Sequential Monte Carlo: When variational inference is infeasible or inaccurate (due to multimodal or highly non-Gaussian posteriors), PMCMC approaches such as particle Gibbs are used to target the joint smoothing distribution (Frigola et al., 2013).
Tabular summary of prominent approaches:

| Inference Approach | Key Elements | References |
|---|---|---|
| Particle EM (PSAEM + PGAS) | stochastic EM, PMCMC smoothing | (Frigola et al., 2013) |
| Mean-field VI | factorized $q(\mathbf{x})$, stochastic gradient ascent | (Frigola et al., 2014, Ialongo et al., 2018) |
| Free-form VI (SGHMC) | non-factorized $q(\mathbf{x}, \mathbf{u})$, SGHMC sampling | (Fan et al., 2023) |
| EnKF-aided VI | non-mean-field, recursive, EnKF | (Lin et al., 2023, Zheng et al., 22 Nov 2024) |
| Recursive/moment-matching | EKF/UKF/ADF, inducing-point management | (Zheng et al., 22 Nov 2024, Zheng et al., 17 Oct 2025) |
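The inducing-point mechanics common to the variational rows above fit in a few lines. The following numpy sketch (generic notation; `kernel` is any callable returning a Gram matrix, and the names are hypothetical) computes the standard sparse-GP predictive moments given a variational Gaussian $q(\mathbf{u}) = \mathcal{N}(\mathbf{m}, \mathbf{S})$ over inducing outputs:

```python
import numpy as np

def sparse_gp_predict(Xs, Z, m_u, S_u, kernel, jitter=1e-6):
    """Sparse-GP predictive moments at test inputs Xs given q(u) = N(m_u, S_u)
    at inducing locations Z:
      mean = K_sz Kzz^{-1} m_u
      cov  = K_ss - K_sz Kzz^{-1} (K_zz - S_u) Kzz^{-1} K_zs
    """
    Kzz = kernel(Z, Z) + jitter * np.eye(len(Z))
    Ksz = kernel(Xs, Z)
    Kss = kernel(Xs, Xs)
    A = np.linalg.solve(Kzz, Ksz.T)        # Kzz^{-1} K_zs
    mean = A.T @ m_u
    cov = Kss - A.T @ (Kzz - S_u) @ A
    return mean, cov
```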
3. Model Extensions: Expressivity, Multi-output, and Nonstationarity
Standard GPSSMs, while highly flexible in the 1D case, suffer from limited expressivity in high-dimensional or nonstationary settings. Recent research has introduced several enhancements:
- Transformed GPSSMs (TGPSSM) use parametric, invertible normalizing flows to “push forward” the GP prior, creating a more expressive stochastic process:

$$\tilde{f}(\mathbf{x}) = \mathbb{G}_{\theta}\big(f(\mathbf{x})\big), \qquad f \sim \mathcal{GP}\big(m(\mathbf{x}),\, k(\mathbf{x}, \mathbf{x}')\big),$$

where $\mathbb{G}_{\theta}$ is an invertible transformation. The transformed process can represent non-Gaussian marginals and handle sharp transitions or heavy-tailed dynamics (Lin et al., 2023, Lin et al., 24 Mar 2025); a minimal sampling sketch appears after this list.
- Efficient Transformed GPSSMs (ETGPSSM) further reduce complexity by coupling a single GP with dimension-specific or input-dependent flows, rather than independent GPs for each output. The resulting architecture both captures output correlation and significantly reduces parameter count and computational cost (Lin et al., 2023, Lin et al., 24 Mar 2025).
- Heterogeneous Multi-output Kernels: To model multi-channel dynamics where each channel may exhibit different lengthscales, structure, or inputs, a block-diagonal kernel matrix is constructed so that each output uses its own kernel, input mapping, and hyperparameters:

$$\mathbf{K}(\mathbf{x}, \mathbf{x}') = \operatorname{diag}\big(k_1(\phi_1(\mathbf{x}), \phi_1(\mathbf{x}')),\, \ldots,\, k_{d_x}(\phi_{d_x}(\mathbf{x}), \phi_{d_x}(\mathbf{x}'))\big),$$

where $\phi_i$ denotes the input mapping of output $i$, and the inducing-point management is carried out independently per output (Zheng et al., 17 Oct 2025).
- Multi-resolution and Multi-timescale GPSSMs: By decomposing the latent state into components modeled at different time resolutions/scales (e.g., fast vs. slow dynamics), more realistic long-sequence inference becomes feasible (Longi et al., 2021).
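As promised above, here is a minimal sketch of the flow idea, using a hypothetical elementwise flow $G(f) = \alpha f + \beta \tanh f$ (strictly increasing, hence invertible, for $\alpha, \beta > 0$) rather than any specific flow from the cited papers; pushing a GP draw through $G$ yields a sample from a non-Gaussian transformed process:

```python
import numpy as np

def flow(f, alpha=1.0, beta=0.5):
    """Elementwise invertible flow G(f) = alpha*f + beta*tanh(f);
    strictly increasing (hence invertible) for alpha, beta > 0."""
    return alpha * f + beta * np.tanh(f)

# Draw f ~ GP(0, k) on a grid with an SE kernel, then push it through G.
rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 200)
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2) + 1e-6 * np.eye(200)
f = np.linalg.cholesky(K) @ rng.standard_normal(200)  # GP prior sample
f_tilde = flow(f)  # sample from the transformed (non-Gaussian) process
```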
4. Online and Recursive Learning, Inducing Point Management
GPSSMs have been extended to handle streaming, online data via recursive and online inference:
- Recursive Updates with Inducing Point Selection: Recursive GPSSMs maintain a joint distribution over the state and a dynamically managed inducing set $\mathbf{u}$, performing a prediction-correction cycle via first-order linearization or higher-order moment matching (EKF, UKF, ADF) (Zheng et al., 22 Nov 2024, Zheng et al., 17 Oct 2025). Inducing points are added based on novelty (conditional variance) and pruned by minimizing KL divergence, executed independently per output dimension in heterogeneous kernel settings; a simplified novelty-test sketch follows this list.
- Online Hyperparameter Adaptation: Hyperparameters are updated online by recovering historical measurement information from the current filtering distribution, ensuring fast adaptation to nonstationarity without storing the full data history (Zheng et al., 22 Nov 2024).
- Ensemble Kalman Filtering (EnKF) Integration: Non-mean-field variational inference can be realized by using the EnKF to propagate state uncertainty: particles are propagated and updated using GP predictions and Kalman gain from the emission model, resulting in efficient, closed-form approximations to the variational posterior and ELBO. This is well-suited for online scenarios (Lin et al., 2023, Lin et al., 24 Mar 2025).
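A simplified sketch of the novelty test for adding inducing points (a generic form; the thresholds and bookkeeping in the cited papers are more elaborate): a candidate state is admitted when its GP conditional variance given the current inducing set exceeds a tolerance.

```python
import numpy as np

def maybe_add_inducing(Z, x_new, kernel, tau=1e-2, jitter=1e-6):
    """Add x_new (shape (d,)) to the inducing set Z (shape (M, d)) if the
    GP conditional variance at x_new given Z exceeds the threshold tau."""
    if len(Z) == 0:
        return np.atleast_2d(x_new), True
    Kzz = kernel(Z, Z) + jitter * np.eye(len(Z))
    kxz = kernel(np.atleast_2d(x_new), Z)                    # (1, M)
    kxx = kernel(np.atleast_2d(x_new), np.atleast_2d(x_new))  # (1, 1)
    cond_var = (kxx - kxz @ np.linalg.solve(Kzz, kxz.T)).item()
    if cond_var > tau:
        return np.vstack([Z, np.atleast_2d(x_new)]), True
    return Z, False
```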
5. Expressivity, Limitations, and Stability
While GPSSMs provide uncertainty quantification and flexible function learning, their representational capacity and stability depend on choices in kernel functions and modeling:
- Expressivity Limitations: Standard GPSSMs with stationary kernels (SE, Matérn) cannot capture nonstationary or non-Gaussian state transitions; transformed GPSSMs with normalizing flows or LMC-based output dependency overcome this constraint (Lin et al., 2023, Lin et al., 2022).
- Stability and Boundedness: For SE kernels, deterministic GPSSMs are globally uniformly ultimately bounded and do not admit unbounded trajectories; they always possess at least one equilibrium and have stability properties that can be analyzed analytically (Beckers et al., 2018). For stochastic GPSSMs, mean square boundedness and positive recurrence hold for SE kernels, but expressivity is correspondingly restricted to bounded dynamics, a fundamental modeling assumption.
- Active Learning and Out-of-Distribution Detection: Mutual information–based active learning (AL) strategies for input selection significantly enhance data efficiency in dynamical system learning (Yu et al., 2021). Embedding domain knowledge via informed kernels improves prediction under scarce data and supports online out-of-distribution detection, critical for safety in robotics (Marco et al., 2023).
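A greedy variant of the mutual-information criterion can be sketched as follows (a simplification of the cited AL strategies, with hypothetical names): score each candidate input by the information gain $\tfrac{1}{2}\log(1 + \sigma_f^2/\sigma_n^2)$ of observing it, using the GP predictive variance given the inputs already visited.

```python
import numpy as np

def information_gain(candidates, X_seen, kernel, noise_var=0.1, jitter=1e-6):
    """Greedy information-gain scores 0.5*log(1 + var_f/var_n) at each
    candidate, where var_f is the GP predictive variance given X_seen."""
    K = kernel(X_seen, X_seen) + jitter * np.eye(len(X_seen))
    Kcx = kernel(candidates, X_seen)
    prior_var = np.diag(kernel(candidates, candidates))
    reduction = np.sum(Kcx * np.linalg.solve(K, Kcx.T).T, axis=1)
    post_var = np.maximum(prior_var - reduction, 0.0)
    return 0.5 * np.log1p(post_var / noise_var)

# next_input = candidates[np.argmax(information_gain(candidates, X_seen, k))]
```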
6. Application Domains and Empirical Performance
GPSSMs are widely applied for:
- System identification in robotics, control, and aerospace (e.g., quadrotors, UAVs, hypersonic vehicles)
- Forecasting, time-series modeling, and simulation in finance, neuroscience, and engineering
- Adaptive and robust model predictive control with uncertainty quantification
- Online and real-time learning for changing environments
Empirical evaluations across synthetic kink functions, chaotic Lorenz systems, real-world actuator dynamics, and atmospheric testbeds consistently show that recent recursive, output-dependent, and normalizing-flow-based GPSSMs achieve superior learning accuracy and runtime efficiency. They often match state-of-the-art offline methods with orders-of-magnitude lower runtime, and they scale robustly to high-dimensional, noisy, and partially observed systems (Zheng et al., 17 Oct 2025, Lin et al., 24 Mar 2025, Zheng et al., 22 Nov 2024).
7. Mathematical and Algorithmic Summary
Central equations and concepts:
- Marginal likelihood:

$$p(\mathbf{y}_{1:T} \mid \boldsymbol{\theta}) = \int p(\mathbf{y}_{1:T} \mid \mathbf{x}_{0:T})\, p(\mathbf{x}_{0:T} \mid \boldsymbol{\theta})\, d\mathbf{x}_{0:T},$$

where $\mathbf{x}_{0:T}$ are latent states and the transition function $f$ has been marginalized, yielding a non-Markovian trajectory prior $p(\mathbf{x}_{0:T} \mid \boldsymbol{\theta})$.
- SAEM surrogate update (Frigola et al., 2013):

$$Q_k(\boldsymbol{\theta}) = (1 - \gamma_k)\, Q_{k-1}(\boldsymbol{\theta}) + \gamma_k \log p_{\boldsymbol{\theta}}\big(\mathbf{x}_{0:T}[k],\, \mathbf{y}_{1:T}\big),$$

where $\mathbf{x}_{0:T}[k]$ is a trajectory drawn by PGAS at iteration $k$ and $\gamma_k$ is a decreasing step size.
- Variational lower bound (ELBO) (Frigola et al., 2014, Fan et al., 2023):

$$\mathcal{L}(q) = \mathbb{E}_{q(\mathbf{x}_{0:T},\, f)}\!\left[\log \frac{p(\mathbf{y}_{1:T}, \mathbf{x}_{0:T}, f)}{q(\mathbf{x}_{0:T}, f)}\right] \le \log p(\mathbf{y}_{1:T}),$$

with structured mean-field or free-form $q(\mathbf{x}_{0:T}, f)$.
- Normalizing flow–transformed GP prior (Lin et al., 2023): $\tilde{f}(\mathbf{x}) = \mathbb{G}_{\theta}(f(\mathbf{x}))$ with $f \sim \mathcal{GP}(m, k)$ and $\mathbb{G}_{\theta}$ invertible.
- EnKF state update (Lin et al., 2023), for ensemble member $i$ and emission matrix $\mathbf{H}$:

$$\mathbf{x}_t^{(i)} \leftarrow \mathbf{x}_t^{(i)} + \mathbf{K}_t\big(\mathbf{y}_t - \mathbf{H}\mathbf{x}_t^{(i)}\big), \qquad \mathbf{K}_t = \mathbf{P}_t \mathbf{H}^{\top}\big(\mathbf{H}\mathbf{P}_t\mathbf{H}^{\top} + \mathbf{R}\big)^{-1},$$

where $\mathbf{P}_t$ is the ensemble sample covariance; a numpy sketch follows this list.
- Moment matching for heterogeneous outputs (Zheng et al., 17 Oct 2025): the Gaussian filtering distribution is propagated through the GP transition by matching first and second moments per output dimension,

$$\boldsymbol{\mu}_{t+1} = \mathbb{E}\big[f(\mathbf{x}_t)\big], \qquad \boldsymbol{\Sigma}_{t+1} = \operatorname{Cov}\big[f(\mathbf{x}_t)\big] + \mathbf{Q},$$

with the expectations computed via EKF-style linearization, UKF sigma points, or ADF rules.
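To make the EnKF update concrete, here is a textbook stochastic-EnKF measurement step (a generic sketch with a linear emission matrix `H`, not the exact algorithm of the cited papers, which embed this step inside variational recursions):

```python
import numpy as np

def enkf_update(particles, y, H, R, rng):
    """Stochastic EnKF measurement update.
    particles: (N, d) forecast ensemble; y: (m,) observation;
    H: (m, d) linear emission matrix; R: (m, m) noise covariance."""
    N = particles.shape[0]
    Xc = particles - particles.mean(axis=0)        # centered ensemble
    P = Xc.T @ Xc / (N - 1)                        # sample covariance
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)   # Kalman gain
    y_pert = y + rng.multivariate_normal(np.zeros(len(y)), R, size=N)
    return particles + (y_pert - particles @ H.T) @ K.T
```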
References
- (Frigola et al., 2013) Frigola et al., "Identification of Gaussian Process State-Space Models with Particle Stochastic Approximation EM"
- (Frigola et al., 2014) Frigola, Chen, & Rasmussen, "Variational Gaussian Process State-Space Models"
- (Eleftheriadis et al., 2017) Eleftheriadis et al., "Identification of Gaussian Process State Space Models"
- (Beckers et al., 2018) Beckers & Hirche, "Stability of Gaussian Process State Space Models"; "Equilibrium Distributions and Stability Analysis of Gaussian Process State Space Models"
- (Ialongo et al., 2018) Ialongo et al., "Closed-form Inference and Prediction in GPSSMs"
- (Yu et al., 2021) Yu et al., "Active Learning in Gaussian Process State Space Model"
- (Longi et al., 2021) Longi et al., "Traversing Time with Multi-Resolution Gaussian Process State-Space Models"
- (Lin et al., 2022) Lin et al., "Output-Dependent Gaussian Process State-Space Model"
- (Lin et al., 2023) Lin et al., "Towards Flexibility and Interpretability of Gaussian Process State-Space Model"
- (Fan et al., 2023) Fan et al., "Free-Form Variational Inference for GPSSMs"
- (Lin et al., 2023) Lin et al., "Towards Efficient Modeling and Inference in Multi-Dimensional GPSSMs"
- (Lin et al., 24 Mar 2025) Lin et al., "Efficient Transformed GPSSMs for Non-Stationary High-Dimensional Systems"
- (Lin et al., 2023) Lin et al., "Ensemble Kalman Filtering Meets GPSSM for Non-Mean-Field and Online Inference"
- (Zheng et al., 22 Nov 2024) Zheng et al., "Recursive Gaussian Process State Space Model"
- (Zheng et al., 17 Oct 2025) Zheng et al., "Recursive Inference for Heterogeneous Multi-Output GPSSMs with Arbitrary Moment Matching"
- (Marco et al., 2023) Marco et al., "Out of Distribution Detection via Domain-Informed GPSSMs"
- (Särkkä, 2019) Särkkä, "The Use of Gaussian Processes in System Identification"
These models form the methodological backbone for modern nonlinear system identification where robustness, flexibility, and uncertainty estimates are essential.