Deep Spectral Learning of Embedded Latent Transfer Operators for Stochastic Dynamical Systems

Published 12 Jun 2026 in cs.LG | (2606.14079v1)

Abstract: We propose a spectral learning method for stochastic nonlinear dynamical systems represented with embedded latent transfer operators in deep feature spaces. We instantiate the method as Deep Spectral Encoder (DSE), an operator-based latent state-space model in which a time-invariant neural encoder implements learnable nonlinear feature maps from observations, and these features define Markovian latent states whose temporal evolution and observation mapping are described by the transfer and observation operators, respectively. Functional canonical correlation analysis in a learnable Galerkin-projected feature space provides state coordinates from past and future observations, and the two linear operators are estimated on the state coordinates as ridge-regularized closed-form solutions that coincide with Galerkin projections of the associated covariance operators. On this representation, we generalize sequential Bayesian filtering and Koopman spectral mode decomposition in feature space. Experiments on several scenarios show stable performance superior to sequential Bayesian filtering and dynamic mode decomposition baselines, even under noise and partial observability.


Authors (2)

Summary

The paper presents a novel deep spectral learning framework that unifies operator theory with deep neural feature extraction for efficient latent state-space modeling.
It employs closed-form regression, canonical correlation analysis, and a staged training schedule to ensure numerical stability and robust sequential Bayesian filtering.
It extends Koopman mode decomposition and classical spectral methods to capture nonlinear dynamics and achieve superior eigenvalue estimation in noisy settings.


Introduction and Theoretical Motivation

This work introduces a novel spectral learning methodology for stochastic nonlinear dynamical systems based on operator-theoretic formulations in deep, finite-dimensional feature spaces. The framework leverages Hilbert space embeddings of marginal and conditional distributions, extending population-level operator formulations to data-driven, learnable finite-dimensional spaces using neural architectures. The methodology generalizes and unifies sequential Bayesian filtering (notably the Kalman filter and its nonlinear analogues) with spectral, operator-theoretic analysis, specifically targeting the estimation of embedded transfer and observation operators governing the evolution of latent Markov processes.

Classical state-space models depend critically on the choice of latent representation and require either parametric assumptions or complex inference over latent variables. In contrast, previous spectral learning methods (e.g., subspace identification, kernel-based stochastic realization) provide statistically robust, closed-form estimators but lack adaptive feature learning, limiting expressivity for high-dimensional, nonlinear, and partially observed settings. The proposed approach, termed Deep Spectral Learning (DSL), couples the representational flexibility of deep architectures with operator-based spectral learning, yielding operator-based latent state-space models with closed-form regression of dynamics and observation maps.

Operator Formulation of Latent Dynamical Systems

The foundation of the approach is an operator-level state-space model for discrete-time stochastic systems. The transition and observation mechanisms are modeled as conditional distributions, which, under the assumption of square-integrability, induce conditional covariance operators in Hilbert spaces. These operators, denoted $\mathcal{T}_e$ and $\mathcal{O}_e$, yield population-level transfer and observation operators governing the evolution of mean embeddings of the latent and observed variables, respectively.

Figure 1: Overview of the operator-based state-space model, highlighting the flow of embedded distributions and corresponding latent transfer and observation operators.

The state mean embeddings evolve linearly: $\mu_{x_{t+1}} = \mathcal{T}_e \mu_{x_t}$ and $\mu_{y_t} = \mathcal{O}_e \mu_{x_t}$. This operator framework supports non-parametric, closed-form regression of the dynamics and connects naturally to functional CCA for extracting predictive coordinates from observed data.
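As a rough illustration of what closed-form regression of the two operators can look like in a finite-dimensional feature space, the NumPy sketch below fits matrix stand-ins for the transfer and observation operators by ridge-regularized least squares. The function name `fit_operators`, the arrays `X` (state coordinates) and `Y` (observation features), and the `ridge` parameter are assumptions made for this example, not names taken from the paper.

```python
import numpy as np

def fit_operators(X, Y, ridge=1e-6):
    """Ridge-regularized closed-form estimates of finite-dimensional
    transfer and observation matrices (illustrative sketch only).

    X: (T, d) latent state coordinates, one row per time step.
    Y: (T, m) observation features aligned with X.
    Returns A (d, d) and C (m, d) with x_{t+1} ~ A x_t and y_t ~ C x_t.
    """
    X0, X1 = X[:-1], X[1:]
    d = X.shape[1]
    # A = C_{x+ x} (C_{xx} + ridge I)^{-1}, using empirical covariances.
    G_dyn = X0.T @ X0 / len(X0) + ridge * np.eye(d)
    A = np.linalg.solve(G_dyn, X0.T @ X1 / len(X0)).T
    # C = C_{y x} (C_{xx} + ridge I)^{-1}.
    G_obs = X.T @ X / len(X) + ridge * np.eye(d)
    C = np.linalg.solve(G_obs, X.T @ Y / len(X)).T
    return A, C
```

On noise-free data generated by a linear map, the estimates should approach the true matrices as `ridge` goes to zero; with noisy features the ridge term trades a small bias for a better-conditioned solve.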

Stochastic Realization and Deep Feature Integration

The critical technical innovation is replacing classical (kernel-based or linear) feature spaces with explicit deep neural feature spaces. A time-invariant encoder constructs blockwise features from observations, forming delay-embedded inputs for functional CCA between past and future features, which yields canonical coordinates representing the minimal, maximally predictive subspace for latent Markov modeling.

This process defines Markovian latent coordinates that are statistically optimal in the feature space, generalizing classical stochastic realization (Akaike’s balanced stochastic realization) and extending functional CCA to learned Galerkin subspaces. The operator regression stage, building on these coordinates, employs closed-form ridge regression solutions akin to Galerkin projections, ensuring numerical stability and theoretical consistency.
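The CCA step between past and future features can be sketched with plain linear algebra: whiten both blocks, take the SVD of the whitened cross-covariance, and read state coordinates off the leading canonical directions of the past. This is a generic regularized linear CCA under assumed names (`cca_states`, `inv_sqrt`, `P`, `F`, `dim`, `ridge`); it is not the paper's exact functional CCA in a learned Galerkin subspace.

```python
import numpy as np

def inv_sqrt(S):
    """Inverse square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V / np.sqrt(w)) @ V.T  # V diag(w^{-1/2}) V^T

def cca_states(P, F, dim, ridge=1e-6):
    """Regularized CCA between past features P (n, p) and future features F (n, f).

    Returns state coordinates (n, dim) built from the past block, plus the
    leading `dim` canonical correlations in descending order.
    """
    P = P - P.mean(axis=0)
    F = F - F.mean(axis=0)
    n = len(P)
    Cpp = P.T @ P / n + ridge * np.eye(P.shape[1])
    Cff = F.T @ F / n + ridge * np.eye(F.shape[1])
    Cfp = F.T @ P / n
    Wp, Wf = inv_sqrt(Cpp), inv_sqrt(Cff)
    # Singular values of the whitened cross-covariance are the canonical correlations.
    _, s, Vt = np.linalg.svd(Wf @ Cfp @ Wp)
    # Rows of Vt are canonical directions for the (whitened) past block.
    states = P @ Wp @ Vt[:dim].T
    return states, s[:dim]
```

A balanced realization in the classical sense would additionally scale each coordinate by the square root of its canonical correlation; that scaling is left out here to keep the sketch minimal.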

Deep Spectral Encoder: Architectural Realization

The approach is instantiated in the Deep Spectral Encoder (DSE) architecture. The pipeline consists of: (1) feature extraction from the observation sequence via a deep encoder, (2) block-feature construction for CCA-based latent state estimation, (3) closed-form operator regression in the CCA-reduced latent subspace using learnable state and observation dictionaries, and (4) a modular training paradigm alternating between representation and operator parameter updates to prevent degenerate solutions.
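Step (2) of the pipeline, block-feature construction, amounts to stacking consecutive encoded features into "past" and "future" windows. A minimal sketch follows, assuming the encoder output is already available as an array `Z` and that both windows share a hypothetical block length `k`; the paper's actual window lengths and layout are not given in this summary.

```python
import numpy as np

def past_future_blocks(Z, k):
    """Stack delay-embedded past and future blocks from encoded features.

    Z: (T, d) array of per-step encoder features.
    k: number of steps in each block (assumed equal for past and future).
    Returns past and future arrays of shape (T - 2k + 1, k * d). Row i pairs
    the k steps before time t = k + i with the k steps starting at time t.
    """
    T, d = Z.shape
    N = T - 2 * k + 1
    if N <= 0:
        raise ValueError("sequence too short for the requested block length")
    past = np.stack([Z[t - k:t].ravel() for t in range(k, k + N)])
    future = np.stack([Z[t:t + k].ravel() for t in range(k, k + N)])
    return past, future
```

The two returned arrays are row-aligned, so they can be passed directly to a CCA routine as the past and future views.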

Figure 2: The DSE architecture, showing modular separation between feature encoding, stochastic realization, operator regression, and decoding modules.

Key to stability is the staged training schedule: the encoder and decoder are first frozen while operator and readout map parameters are optimized; after stabilization, end-to-end fine-tuning is performed along the test-time prediction path. Cross-fitting over time-segmented blocks mitigates information leakage, further enforcing time-wise causal constraints.
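The summary does not spell out the cross-fitting procedure, so the following is only a generic sketch of the idea: split the time axis into contiguous blocks, fit a ridge regression on all blocks except one, and predict the held-out block, so no prediction is made by a model that saw its own target. The names `time_block_folds` and `cross_fit_predict` are hypothetical. Note that this generic version trains on blocks that come after the held-out one as well; a strictly causal variant would restrict training to earlier blocks.

```python
import numpy as np

def time_block_folds(n, n_folds):
    """Split indices 0..n-1 into contiguous, time-ordered blocks."""
    return np.array_split(np.arange(n), n_folds)

def cross_fit_predict(X, Y, n_folds=4, ridge=1e-6):
    """Out-of-fold ridge predictions over contiguous time blocks.

    X: (n, d) regressors, Y: (n, m) targets. Each block of Y is predicted
    by a ridge fit on the remaining blocks only.
    """
    n, d = X.shape
    out = np.empty_like(Y, dtype=float)
    for held in time_block_folds(n, n_folds):
        train = np.setdiff1d(np.arange(n), held)
        G = X[train].T @ X[train] + ridge * np.eye(d)
        W = np.linalg.solve(G, X[train].T @ Y[train])
        out[held] = X[held] @ W
    return out
```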

Sequential State Estimation: Kalman Filtering in Learned Feature Space

DSE supports Kalman-style sequential Bayesian state estimation entirely within the learned feature space. Using the Galerkin-projected operator matrices for dynamics and observation models, the recursive filter propagates first and second moments of the latent state embeddings by closed-form operator updates. Empirical estimates of process and observation noise covariances are computed from residuals, capturing approximation or modeling errors inherent to the finite-dimensional learned operators.
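The recursion described above has the same algebraic form as a standard linear Kalman filter applied to feature-space coordinates. The sketch below shows one predict/update step and a residual-based estimate of the noise covariances; `A`, `C`, `Q`, `R` and both function names are illustrative assumptions standing in for the Galerkin-projected operator matrices and empirical covariances, not the paper's implementation.

```python
import numpy as np

def residual_covariances(X, Y, A, C):
    """Empirical process and observation noise covariances from residuals.

    X: (T, d) state coordinates, Y: (T, m) observation features.
    """
    W = X[1:] - X[:-1] @ A.T   # one-step dynamics residuals
    V = Y - X @ C.T            # observation residuals
    return W.T @ W / len(W), V.T @ V / len(V)

def kalman_step(m, P, y, A, C, Q, R):
    """One predict/update step on the mean m (d,) and covariance P (d, d)."""
    # Predict: push the first and second moments through the dynamics.
    m_pred = A @ m
    P_pred = A @ P @ A.T + Q
    # Update: correct with the new observation feature y (m,).
    S = C @ P_pred @ C.T + R
    K = np.linalg.solve(S, C @ P_pred).T   # gain P_pred C^T S^{-1}
    m_new = m_pred + K @ (y - C @ m_pred)
    P_new = (np.eye(len(m)) - K @ C) @ P_pred
    return m_new, P_new
```

Because `Q` and `R` are computed from residuals of the fitted operators, they absorb finite-dimensional approximation error as well as genuine noise, which matches the role the text assigns to them.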

Numerical experiments on the quad-link pendulum with raw image observations (both clean and heavily corrupted regimes) demonstrate robust filtering and prediction under both limited and extended training horizons.

Mode Decomposition and Koopman Spectral Recovery

In addition to sequential state estimation, the framework implements Koopman mode decomposition by eigendecomposition of the learned transfer operator. This provides estimates of the spectral modes and frequencies governing nonlinear system evolution, even under strong noise and partial observability. The methodology thus generalizes classical Dynamic Mode Decomposition (DMD) and kernel-based estimators to deep-learned feature spaces, and experiments report accurate spectral recovery on both the Van der Pol (VDP) and Stuart-Landau (SL) oscillator benchmarks.
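In matrix form, the eigendecomposition step can be sketched as follows: eigenvalues of the finite-dimensional transfer matrix give discrete-time spectral estimates, their logarithms divided by the sampling interval give continuous-time growth rates and angular frequencies, and mapping the eigenvectors through the observation matrix gives mode shapes. The function `koopman_spectrum`, the matrices `A` and `C`, and the sampling interval `dt` are assumptions made for this illustration.

```python
import numpy as np

def koopman_spectrum(A, C, dt):
    """Spectral estimates from a learned finite-dimensional transfer matrix.

    A: (d, d) transfer matrix, C: (m, d) observation matrix, dt: sampling interval.
    Returns discrete-time eigenvalues, continuous-time exponents
    (real part = growth rate, imaginary part = angular frequency),
    and modes expressed in observation-feature space (one column per eigenvalue).
    """
    lam, V = np.linalg.eig(A)
    # Cast to complex so the logarithm is defined for negative real eigenvalues.
    omega = np.log(lam.astype(complex)) / dt
    modes = C @ V
    return lam, omega, modes
```

For a damped rotation with radius `r` and angle `theta` per step, this yields a growth rate of `log(r) / dt` and frequencies of `±theta / dt`, which is the kind of quantity compared against ground truth in the oscillator benchmarks.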

Figure 3: Koopman spectral recovery results for the Van der Pol oscillator, showing improved spectrum estimation under noise across methods.

Figure 4: Koopman spectral recovery results for the Stuart-Landau oscillator, highlighting robustness to process noise and stability of DSE compared to DMD baselines.

Quantitative results indicate that DSE produces the lowest mean error in eigenvalue estimation across a wide noise range, consistently outperforming established baselines including sDMD, Hankel-DMD, and ELTO on both oscillator problems.

Empirical Results and Ablation

Comprehensive ablation studies confirm the necessity of the staged training schedule, CCA-based latent coordinate construction, and closed-form operator estimation. The modular approach avoids degenerate solutions observed under end-to-end joint training. Numerical stability and sample efficiency are achieved through Galerkin projection, cross-fitted regression, and systematic regularization, supporting the reliability and reproducibility of both prediction and spectral recovery tasks.

Implications and Future Directions

The DSL and DSE framework unifies operator-theoretic modeling with deep, flexibly learned representations in time-series analysis. The approach retains the statistical properties associated with spectral state-space methods (closed-form estimation, interpretable latent dynamics, and a connection to canonical coordinates) while improving practical expressivity and robustness through explicit nonlinear neural features. This combination opens directions for nonlinear forecasting, system identification, and spectrum analysis, particularly in the high-dimensional, noisy, or partially observed regimes typical of modern applications.

Potential avenues for future work include scaling the approach to very high-dimensional observation spaces, integration with control-theoretic policies, data-efficient reinforcement learning, explicit uncertainty quantification for Bayesian inference, and extension to continuous-time or spatiotemporal systems. The theoretical framework also motivates further analysis of representation identifiability, sample complexity, and regularization strategies in operator-based deep latent models.

Conclusion

This work presents a principled, operator-based framework for latent state-space modeling and spectral learning using deep feature spaces. The modular Deep Spectral Encoder architecture integrates CCA-based stochastic realization, closed-form operator regression, and feature-space Bayesian filtering, demonstrating robust performance in settings with nonlinear dynamics, noise, and partial observability. Both theoretical and empirical analyses support the efficacy and stability of the approach, positioning DSE as a foundation for further work on operator-theoretic deep learning for stochastic dynamical systems.
