LSTM-Enhanced Deep Koopman Model
- The paper introduces a dictionary-free LSTM-enhanced deep Koopman model that learns linear latent representations to capture nonlinear dynamics with input delays.
- The model integrates LSTM-based encoding and meta-learning to robustly handle history-dependent delays and short time-series data.
- Empirical benchmarks show reduced prediction errors compared to classical eDMD, achieving performance close to fully-informed models.
The LSTM-Enhanced Deep Koopman Model is a neural architecture for learning linear representations of nonlinear dynamical systems that exhibit input delays or are observed only through short time series. Rather than relying on manually engineered dictionaries as in classical extended Dynamic Mode Decomposition (eDMD), these models leverage deep learning components, most notably Long Short-Term Memory (LSTM) layers, to encode delayed inputs or time-series properties directly into a compact latent space in which the system evolves under linear dynamics governed by an approximate Koopman operator. This facilitates prediction, control, and spectral analysis of systems with unknown or complex nonlinearities.
1. Mathematical Foundations of Koopman Operator Approximation
For a general discrete-time nonlinear control system $x_{k+1} = f(x_k, u_k)$ with state $x_k \in \mathbb{R}^n$ and input $u_k \in \mathbb{R}^m$, the Koopman operator $\mathcal{K}$ acts linearly on observables $g$ by composition with the dynamics: $\mathcal{K}g = g \circ f$. Since $\mathcal{K}$ is infinite-dimensional, practical applications seek a finite-dimensional approximation by learning an encoder $\phi$ and decoder $\psi$, as well as matrices $A$ and $B$, so that
$$z_k = \phi(x_k), \qquad z_{k+1} \approx A z_k + B u_k, \qquad x_k \approx \psi(z_k).$$
The identification task thus reduces to minimizing reconstruction and prediction errors in this lifted space with respect to the Frobenius norm over dataset snapshots:
$$\min_{A,\,B,\,\phi,\,\psi} \; \big\lVert \Phi(X^{+}) - A\,\Phi(X) - B\,U \big\rVert_F^2 + \big\lVert X - \Psi(\Phi(X)) \big\rVert_F^2,$$
where $X$, $X^{+}$, and $U$ collect the state, successor-state, and input snapshots, and $\Phi$, $\Psi$ denote the snapshot-wise application of $\phi$ and $\psi$.
This approach accommodates input delays by defining the lifted state as $z_k = \phi(x_k, h_T)$, a function of both the instantaneous state $x_k$ and an encoded history vector $h_T$ produced by the LSTM described in the next section.
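The lifted least-squares identification can be illustrated with a short routine. The following NumPy sketch assumes snapshot matrices `X`, `X_next`, `U` and a user-supplied lifting function `lift`; the hand-crafted dictionary at the end is purely illustrative of what eDMD would require, and is exactly what the deep models below replace with a learned encoder:

```python
import numpy as np

def fit_lifted_dynamics(X, X_next, U, lift):
    """Least-squares fit of A, B so that lift(x_{k+1}) ~= A lift(x_k) + B u_k.

    X, X_next : (N, n) state snapshots and their one-step successors.
    U         : (N, m) inputs applied at each snapshot.
    lift      : callable mapping an (N, n) array to an (N, p) lifted array.
    """
    Z, Z_next = lift(X), lift(X_next)             # lifted snapshots, shape (N, p)
    G = np.hstack([Z, U])                         # regressors [z_k, u_k], shape (N, p+m)
    AB = np.linalg.pinv(G) @ Z_next               # Frobenius-norm least squares, (p+m, p)
    A, B = AB[:Z.shape[1]].T, AB[Z.shape[1]:].T   # A: (p, p), B: (p, m)
    return A, B

# Hypothetical hand-crafted dictionary of observables (the eDMD-style choice):
sqrt_lift = lambda X: np.hstack([X, np.sqrt(np.abs(X))])
```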
2. LSTM-Based Encoding for Input Delay and Short Series Dynamics
Input delay and the effect of incomplete state observations are addressed by embedding history windows as fixed-length vectors using an LSTM architecture. For a length-$T$ history window $H_{k,1}, \dots, H_{k,T}$ comprising prior states and inputs, the LSTM recursively updates its hidden and cell states:
\begin{align*}
f_t &= \sigma(W_f H_{k,t} + U_f h_{t-1} + b_f) \\
i_t &= \sigma(W_i H_{k,t} + U_i h_{t-1} + b_i) \\
o_t &= \sigma(W_o H_{k,t} + U_o h_{t-1} + b_o) \\
\tilde c_t &= \tanh(W_c H_{k,t} + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde c_t \\
h_t &= o_t \odot \tanh(c_t)
\end{align*}
After the history window has been processed, the final hidden state $h_T$ encodes all delayed and historical effects. In dictionary-free Koopman models, this encoding is concatenated with the current state $x_k$ before being passed to the encoder network that yields the lifted representation $z_k$.
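A minimal PyTorch sketch of this history encoding is given below. The encoder widths follow the 10 → 60 → 40 example from Section 3; the LSTM hidden size of 8 and the ReLU activation are illustrative assumptions, not the paper's exact configuration:

```python
import torch
import torch.nn as nn

class HistoryLiftingEncoder(nn.Module):
    """Encodes a window of past (state, input) pairs with an LSTM and lifts the
    concatenation [x_k, h_T] to the latent Koopman state z_k. Sizes are illustrative."""

    def __init__(self, state_dim=2, input_dim=1, lstm_hidden=8, lifted_dim=40):
        super().__init__()
        self.lstm = nn.LSTM(state_dim + input_dim, lstm_hidden, batch_first=True)
        self.encoder = nn.Sequential(
            nn.Linear(state_dim + lstm_hidden, 60), nn.ReLU(),
            nn.Linear(60, lifted_dim),
        )

    def forward(self, history, x_k):
        # history: (batch, T, state_dim + input_dim), x_k: (batch, state_dim)
        _, (h_T, _) = self.lstm(history)      # final hidden state, (1, batch, lstm_hidden)
        h_T = h_T.squeeze(0)
        return self.encoder(torch.cat([x_k, h_T], dim=-1))   # z_k: (batch, lifted_dim)
```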
In meta-learning extensions for short time series (Iwata et al., 2021), a bidirectional LSTM produces a task-dependent representation from the support sequence, and all embedding and decoding networks are conditioned on this representation so as to adapt the Koopman space to each individual series.
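A corresponding sketch of this task conditioning is shown below: a bidirectional LSTM summarises the support sequence and the embedding network receives that summary as an extra input. The mean-pooling step and all dimensions are assumptions rather than the published configuration:

```python
import torch
import torch.nn as nn

class TaskConditionedEmbedding(nn.Module):
    """Meta-learning sketch: a bidirectional LSTM summarises a support sequence into a
    task representation, and the embedding MLP is conditioned on it by concatenation."""

    def __init__(self, obs_dim=3, repr_dim=32, lifted_dim=16):
        super().__init__()
        self.bilstm = nn.LSTM(obs_dim, repr_dim // 2, batch_first=True, bidirectional=True)
        self.embed = nn.Sequential(
            nn.Linear(obs_dim + repr_dim, 128), nn.ReLU(),
            nn.Linear(128, lifted_dim),
        )

    def forward(self, support_seq, x):
        # support_seq: (batch, T, obs_dim); x: (batch, obs_dim)
        out, _ = self.bilstm(support_seq)          # (batch, T, repr_dim)
        r = out.mean(dim=1)                        # pool over time -> task representation
        return self.embed(torch.cat([x, r], dim=-1))
```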
3. Dictionary-Free Network Architectures
The architecture centers on three principal blocks:
- LSTM block: Encodes historical state and input information into the fixed-length vector $h_T$.
- Encoder block: Processes the concatenation of the current state $x_k$ and $h_T$ through two fully connected layers (e.g. dimensions 10 → 60 → 40) to produce the lifted state $z_k$.
- Linear Koopman dynamics: Evolves $z_k$ via learned matrices, $z_{k+1} = A z_k + B u_k$, with $A \in \mathbb{R}^{40 \times 40}$ and $B \in \mathbb{R}^{40 \times 1}$ for the benchmark problem.
- Decoder block: Recovers the predicted measurement $\hat{x}_k$ from the lifted state $z_k$ via two mirrored fully connected layers.
No external dictionary of nonlinear observables is required; the lifting and decoding networks learn basis functions end-to-end, resulting in a dictionary-free realization. In meta-learning models for short time series, the encoder and decoder are further conditioned on the bidirectional LSTM-derived task representation.
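Putting the blocks together, a dictionary-free model can be sketched as follows, reusing the `HistoryLiftingEncoder` from Section 2. The layer widths and the choice of bias-free linear layers for $A$ and $B$ follow the description above but remain illustrative:

```python
import torch
import torch.nn as nn

class DeepKoopmanModel(nn.Module):
    """Dictionary-free deep Koopman sketch: lifting encoder, linear latent dynamics
    z_{k+1} = A z_k + B u_k, and a mirrored decoder. Widths are illustrative."""

    def __init__(self, encoder, state_dim=2, input_dim=1, lifted_dim=40):
        super().__init__()
        self.encoder = encoder                                   # e.g. HistoryLiftingEncoder
        self.A = nn.Linear(lifted_dim, lifted_dim, bias=False)   # learned Koopman matrix
        self.B = nn.Linear(input_dim, lifted_dim, bias=False)    # learned input matrix
        self.decoder = nn.Sequential(
            nn.Linear(lifted_dim, 60), nn.ReLU(),
            nn.Linear(60, state_dim),
        )

    def step(self, z, u):
        return self.A(z) + self.B(u)              # one linear step in latent space

    def forward(self, history, x_k, u_k):
        z_k = self.encoder(history, x_k)
        z_next = self.step(z_k, u_k)
        return self.decoder(z_k), self.decoder(z_next)   # reconstruction, one-step prediction
```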
4. Training Objectives, Losses, and Meta-Learning
The end-to-end loss for the dictionary-free deep Koopman model combines the following components:
- Reconstruction loss: $\mathcal{L}_{\text{rec}} = \lVert x_k - \psi(z_k) \rVert^2$, penalizing the encoder–decoder round trip.
- One-step prediction loss: $\mathcal{L}_{\text{pred}} = \lVert x_{k+1} - \psi(A z_k + B u_k) \rVert^2$.
- Multi-step output prediction loss over horizon $N$: $\mathcal{L}_{\text{multi}} = \frac{1}{N}\sum_{j=1}^{N} \lVert x_{k+j} - \psi(\hat z_{k+j}) \rVert^2$, where $\hat z_{k+j}$ is obtained by iterating the linear latent dynamics from $z_k$.
- Latent trajectory loss: $\mathcal{L}_{\text{lin}} = \frac{1}{N}\sum_{j=1}^{N} \lVert z_{k+j} - \hat z_{k+j} \rVert^2$, comparing encoded true states with the linearly propagated latent trajectory.
These are combined with task-relevant weights $\alpha_1, \dots, \alpha_4$ to yield the overall objective:
$$\mathcal{L} = \alpha_1 \mathcal{L}_{\text{rec}} + \alpha_2 \mathcal{L}_{\text{pred}} + \alpha_3 \mathcal{L}_{\text{multi}} + \alpha_4 \mathcal{L}_{\text{lin}}.$$
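A compact sketch of this four-part objective, written against the `DeepKoopmanModel` interface above, is given below. For simplicity the same history window is reused when encoding later states, whereas in practice the window would slide forward with the trajectory; the weights are placeholders:

```python
import torch
import torch.nn.functional as F

def koopman_loss(model, history, x_seq, u_seq, weights=(1.0, 1.0, 1.0, 1.0)):
    """Reconstruction, one-step prediction, multi-step output prediction, and
    latent-trajectory consistency, summed with placeholder weights."""
    a1, a2, a3, a4 = weights
    z = model.encoder(history, x_seq[:, 0])               # lift the initial state
    rec = F.mse_loss(model.decoder(z), x_seq[:, 0])       # reconstruction loss

    pred_1, multi, latent = 0.0, 0.0, 0.0
    N = x_seq.shape[1] - 1
    for j in range(N):
        z = model.step(z, u_seq[:, j])                    # roll forward linearly in latent space
        x_hat = model.decoder(z)
        if j == 0:
            pred_1 = F.mse_loss(x_hat, x_seq[:, 1])       # one-step prediction loss
        multi += F.mse_loss(x_hat, x_seq[:, j + 1]) / N   # multi-step output loss
        # latent trajectory loss: encoded truth vs. linearly propagated latent state
        # (same history window reused here purely for brevity)
        latent += F.mse_loss(model.encoder(history, x_seq[:, j + 1]), z) / N

    return a1 * rec + a2 * pred_1 + a3 * multi + a4 * latent
```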
In meta-learning models for short series, the objective is the average prediction loss over episodically sampled tasks. Hyperparameters (e.g., hidden and output dimensions, LSTM size, learning rate, batch size) are tuned for convergence and generalization.
5. Algorithmic Implementation and Hyperparameter Choices
For systems with input delay, such as the two-tank water system
\begin{align*}
\frac{dh_1}{dt} &= q(t-\tau) - \frac{k_1}{F_1}\sqrt{h_1}, \\
\frac{dh_2}{dt} &= \frac{k_1}{F_2}\sqrt{h_1} - \frac{k_2}{F_2}\sqrt{h_2},
\end{align*}
data are simulated with a fixed sampling period, input delay $\tau$, and additive measurement noise; the resulting samples are split evenly between training and test sets.
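A simple forward-Euler simulation of this delayed system can generate comparable data. The parameter values below are illustrative placeholders, not the settings used in the paper:

```python
import numpy as np

def simulate_two_tank(q_func, tau=5.0, dt=0.1, t_end=200.0,
                      k1=1.0, k2=1.0, F1=1.0, F2=1.0, h0=(1.0, 1.0)):
    """Forward-Euler simulation of the delayed two-tank system (illustrative parameters)."""
    n = int(t_end / dt)
    h = np.zeros((n + 1, 2))
    h[0] = h0
    for i in range(n):
        t = i * dt
        q_delayed = q_func(t - tau) if t - tau >= 0.0 else 0.0   # delayed inflow q(t - tau)
        dh1 = q_delayed - (k1 / F1) * np.sqrt(max(h[i, 0], 0.0))
        dh2 = (k1 / F2) * np.sqrt(max(h[i, 0], 0.0)) - (k2 / F2) * np.sqrt(max(h[i, 1], 0.0))
        h[i + 1] = h[i] + dt * np.array([dh1, dh2])
    return h

# Example: a constant step inflow applied from t = 0
levels = simulate_two_tank(lambda t: 0.5)
```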
During training, each input window is processed by the LSTM to produce $h_T$, concatenated with the current state $x_k$ for encoding, and used to optimize the four-component loss with the Adam optimizer.
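The corresponding training loop is standard; the sketch below reuses the `koopman_loss` defined earlier, with the learning rate, batch construction, and epoch count left as placeholders:

```python
import torch

def train(model, loader, epochs=500, lr=1e-3):
    """Minimal training-loop sketch over mini-batches of (history, state, input) windows."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for history, x_seq, u_seq in loader:     # mini-batches of history windows
            loss = koopman_loss(model, history, x_seq, u_seq)
            opt.zero_grad()
            loss.backward()
            opt.step()
```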
For meta-learning applications to short time series, datasets include synthetic, Van der Pol, Lorenz, and cylinder-wake systems. A bidirectional LSTM encoder is paired with MLPs (four layers of 128 units each) for both the embedding and decoding networks. Training proceeds for up to 10,000 epochs.
6. Enforcement of Linear Latent Evolution and Koopman Spectral Analysis
Linear evolution in latent space is explicitly enforced via the parameterized matrices $A$ and $B$:
$$z_{k+1} = A z_k + B u_k.$$
Minimizing the multi-step and latent-trajectory losses $\mathcal{L}_{\text{multi}}$ and $\mathcal{L}_{\text{lin}}$ constrains the learned embedding to lie near an invariant subspace of the Koopman operator. The eigenvalues obtained from the spectral decomposition of $A$ approximate the modes of growth, decay, and oscillation in the original system.
In meta-learning models, the Koopman matrix is estimated by least squares from the embedded time series via the Moore–Penrose pseudoinverse. Conditioning on the LSTM-derived representation enables the model to adapt its encoding and decoding functions per time series, supporting robust spectral estimation and future-state prediction even from short data.
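The least-squares spectral estimate can be sketched in a few lines of NumPy, assuming an embedded trajectory `Z` of shape (T, p) produced by the conditioned encoder:

```python
import numpy as np

def koopman_spectrum(Z):
    """Fit z_{k+1} ~= K z_k by least squares via the pseudoinverse and return K
    and its eigenvalues, which indicate growth, decay, and oscillation modes."""
    K = (np.linalg.pinv(Z[:-1]) @ Z[1:]).T    # (p, p) estimated Koopman matrix
    return K, np.linalg.eigvals(K)
```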
7. Empirical Benchmarks and Performance Assessment
Empirical evaluation uses mean absolute error (MAE) on prediction of system states as the main metric. For the two-tank water system, results are:
| Model | MAE [m] | Relative MAE [%] |
|---|---|---|
| eDMD (true dynamics in dictionary) | 0.175 | 100 |
| LSTM-enhanced Deep Koopman | 0.185 | 106 |
| eDMD (unknown true nonlinearity) | 0.606 | 346 |
When true nonlinearities are unknown (the typical real-world scenario), the LSTM-enhanced Deep Koopman model reduces MAE by more than a factor of three compared to eDMD without the correct dictionary. Compared to eDMD with full prior knowledge, it incurs only a 6% increase in MAE while remaining completely dictionary-free and using a more compact 40-dimensional embedding.
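The evaluation protocol amounts to rolling the learned linear latent dynamics forward and comparing decoded predictions with the measured states; a sketch against the model interface used above:

```python
import torch

def rollout_mae(model, history, x_seq, u_seq):
    """Multi-step rollout through the linear latent dynamics and MAE against the
    true trajectory; x_seq has one more timestep than u_seq."""
    with torch.no_grad():
        z = model.encoder(history, x_seq[:, 0])
        preds = []
        for j in range(u_seq.shape[1]):
            z = model.step(z, u_seq[:, j])
            preds.append(model.decoder(z))
        preds = torch.stack(preds, dim=1)                   # (batch, N, state_dim)
        return (preds - x_seq[:, 1:]).abs().mean().item()   # mean absolute error
```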
Meta-learning LSTM-enhanced models for short time-series (Iwata et al., 2021) demonstrate superior eigenvalue estimation and future prediction RMSE across synthetic, chaotic, and fluid dynamics datasets, with ablation studies showing significant performance declines when bidirectional LSTM representations or meta-episodic training are omitted.
Eigenvalue spectra from the learned matrices closely match those of the true system, validating the efficacy of the dictionary-free lifting. In practice, models require on the order of 10–50 timesteps for accurate adaptation to each new series and enable immediate spectral analysis and prediction without further fine-tuning.
8. Context and Significance
The LSTM-Enhanced Deep Koopman Model addresses two major limitations in Koopman-based modeling:
- The need for expert-designed, system-specific dictionaries in eDMD, which are often infeasible when system nonlinearities are unknown.
- The fragility of neural Koopman methods in applications with input delays or limited data, where history-dependent representations and meta-learned conditioning are essential.
A plausible implication is that the theory can be extended to noisy real-world systems or online learning by adjusting the LSTM history or meta-learning structure. The architecture’s capacity for compact, accurate, and adaptive linearization of nonlinear, delayed, or short time-series systems has implications for control, prediction, and spectral analysis across dynamical systems, robotics, and scientific computing.