Conformal Online Koopman Learning (COLoKe)

Updated 23 November 2025
  • The paper introduces a conformal update mechanism that selectively refines model parameters based on deviations in multi-step prediction accuracy.
  • It integrates deep neural network–based embeddings with a temporal consistency loss, lifting nonlinear dynamics into a higher-dimensional space where they evolve approximately linearly.
  • Empirical results on synthetic and real-world benchmarks demonstrate substantially lower prediction error (orders of magnitude on the Lorenz system) at roughly half the update frequency of fixed-step baselines.

Conformal Online Learning of Koopman Embeddings (COLoKe) is a framework for adaptively learning Koopman-invariant representations of nonlinear dynamical systems from streaming data. By integrating deep neural network–based embeddings with a conformal-style, event-triggered control mechanism, COLoKe learns both the lifting map and the linear Koopman operator efficiently and with explicit regularization against overfitting. The critical innovation is a selective update scheme: model parameters are refined only when the nonconformity of the current Koopman model, measured over a rolling temporal window, exceeds a dynamically calibrated threshold. This sustains predictive accuracy at a reduced (and adaptive) update frequency while maintaining provable dynamic-regret guarantees and avoiding unnecessary computation (Gao et al., 16 Nov 2025).

1. Model Architecture and Koopman Embedding

COLoKe lifts each raw system state $x \in \mathbb{R}^d$ into a higher-dimensional complex vector via

$$\Phi_\theta(x) = \begin{bmatrix} x \\ \tilde\Phi_\theta(x) \end{bmatrix} \in \mathbb{C}^m,$$

where $\tilde\Phi_\theta : \mathbb{R}^d \to \mathbb{C}^{m-d}$ is parameterized by a neural network with trainable weights $\theta$. The explicit inclusion of the identity component $x$ in the output enforces a built-in reconstruction constraint, ensuring that original system states are preserved in the embedding without a separate decoder.

In this lifted space, COLoKe seeks a Koopman operator $K \in \mathbb{C}^{m \times m}$ such that the dynamics propagate approximately linearly,

$$\Phi_\theta(x_{t+1}) \approx K \Phi_\theta(x_t) \qquad \text{(one step)},$$

and, for multistep prediction,

$$\Phi_\theta(x_{t+h}) \approx K^h \Phi_\theta(x_t), \qquad h \ge 1.$$

This approach generalizes standard deep Koopman operator learning to the online setting, where both $\theta$ and $K$ are updated as new data streams in.
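As a concrete illustration, here is a minimal PyTorch sketch of the lifting map with its built-in identity block. The MLP shape, hidden width, and the use of a real-valued embedding (in place of the paper's complex-valued one) are simplifying assumptions:

```python
import torch
import torch.nn as nn

class KoopmanEmbedding(nn.Module):
    """Lifts x in R^d to [x; phi(x)] in R^m. Keeping the raw state in the
    first d coordinates builds in the reconstruction constraint, so no
    separate decoder is required. (Real-valued stand-in for the paper's
    complex-valued embedding.)"""
    def __init__(self, d: int, m: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(          # learnable component ~Phi_theta
            nn.Linear(d, hidden), nn.Tanh(),
            nn.Linear(hidden, m - d),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.cat([x, self.net(x)], dim=-1)

# The Koopman operator is simply a trainable m x m matrix on the lifted space.
d = 2
m = d + (d + 1) // 2                       # e.g. m = d + ceil(d/2), as in Section 6
phi = KoopmanEmbedding(d, m)
K = nn.Parameter(torch.eye(m) + 0.01 * torch.randn(m, m))
```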

2. Multistep Temporal Consistency and Loss Formulation

To enforce temporal consistency, COLoKe maintains a buffer $\mathcal{D}_t = \{x_{t-w}, \dots, x_t\}$ of the latest $w+1$ states. The multi-step consistency loss at time $t$ is defined by

$$\mathcal{L}_t(\theta, K) = \sum_{(s,\tau)\in\mathcal{I}_t} \sum_{j=1}^{\tau} \Big\| \Phi_\theta(x_{s+\tau}) - K^j \Phi_\theta(x_{s+\tau-j}) \Big\|^2,$$

where $\mathcal{I}_t = \{ (s, \tau) \mid t-w \leq s < s+\tau \leq t \}$. This loss enforces that the learned embedding and operator jointly produce consistent multi-step predictions across all temporally local subwindows.

At every time $t$, the conformity of the current model is assessed using the prediction error at the buffer's end:

$$s_t = \ell_{t-w,\,w}(\theta_t, K_t) = \sum_{\tau=1}^{w} \big\| \Phi_{\theta_t}(x_t) - K_t^{\tau}\, \Phi_{\theta_t}(x_{t-\tau}) \big\|^2.$$

This scalar score $s_t$ represents the discrepancy between multi-step forward predictions and actual states, serving as the central criterion for triggering model updates.
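A direct, unoptimized sketch of $\mathcal{L}_t$ and $s_t$, assuming the `phi` network and operator `K` from the previous sketch; the triple loop enumerates the index set $\mathcal{I}_t$ literally rather than efficiently:

```python
def consistency_loss(phi, K, buffer):
    """Multi-step consistency loss L_t over a (w+1, d) tensor holding
    x_{t-w}, ..., x_t: sums ||Phi(x_{s+tau}) - K^j Phi(x_{s+tau-j})||^2
    over all subwindows (s, tau) in I_t and steps j = 1..tau."""
    z = phi(buffer)                                    # lifted states, (w+1, m)
    n = z.shape[0]
    loss = torch.zeros(())
    for s in range(n - 1):
        for tau in range(1, n - s):
            for j in range(1, tau + 1):
                pred = torch.matrix_power(K, j) @ z[s + tau - j]
                loss = loss + torch.sum((z[s + tau] - pred) ** 2)
    return loss

def conformity_score(phi, K, buffer):
    """Score s_t: multi-step prediction error at the buffer's final state."""
    z = phi(buffer)
    w = z.shape[0] - 1
    return sum(
        torch.sum((z[-1] - torch.matrix_power(K, tau) @ z[-1 - tau]) ** 2)
        for tau in range(1, w + 1)
    )
```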

3. Conformal-Style Update Trigger and Threshold Control

Rather than updating model weights at every time step, COLoKe uses a conformal-style mechanism that adaptively controls when updates occur. A nonconformity threshold $q_t > 0$ is maintained and adjusted by a proportional–integral (PI) control law:

$$q_{t+1} = q_t + \gamma\,(e_t - \alpha) + r_t \sum_{i=1}^{t} (e_i - \alpha),$$

where $e_t = \mathbf{1}\{ s_t > q_t \}$ is an indicator for excess prediction error, $\alpha \in [0,1]$ is the target nonconformity rate, $\gamma > 0$ is a step-size parameter, and $r_t$ is a saturation constant that bounds the integral term to stabilize the control loop.
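In code, the threshold update is only a few lines. Since the text states only that the integral term is saturated (bounded by some constant $C_{\text{sat}}$), the clipping rule below is an assumption:

```python
def update_threshold(q, s_t, integral, alpha=0.1, gamma=0.1, r=0.01, c_sat=10.0):
    """One PI step for the nonconformity threshold q_t.
    Saturation rule assumed: the integral state is clipped to [-c_sat, c_sat]."""
    e = 1.0 if s_t > q else 0.0                      # e_t = 1{s_t > q_t}
    integral = max(-c_sat, min(c_sat, integral + (e - alpha)))
    return q + gamma * (e - alpha) + r * integral, integral
```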

Model updates are triggered only if $s_t > q_t$. When this occurs, gradient descent steps are performed on $\mathcal{L}_t$ (over both $\theta$ and $K$) until $s_t \leq q_t$. This selective adaptation provides strong regularization: updates are explicit responses to statistically significant model misfit, rather than being performed indiscriminately.

The update rule can be summarized as:

  • If $s_t \leq q_t$: set $(\theta_{t+1}, K_{t+1}) = (\theta_t, K_t)$;
  • Else: perform gradient steps on $\mathcal{L}_t$ until $s_t \leq q_t$.

A key property is that updating terminates as soon as the conformity test is satisfied, which acts as a natural safeguard against overfitting.

4. Online Algorithm Workflow

A concise description of the learning workflow is as follows:

  1. Initialize embedding parameters $\theta_{w-1}$ and operator $K_{w-1}$ by fitting on the first $w+1$ data points. Set the initial threshold $q_w$ to the $(1-\alpha)$-quantile of the initial conformity scores.
  2. For each new time step $t \geq w$:
    • Observe $x_t$ and update the buffer $\mathcal{D}_t$;
    • Copy $(\theta_t, K_t) \leftarrow (\theta_{t-1}, K_{t-1})$;
    • Compute the conformity score $s_t = \ell_{t-w,\,w}(\theta_t, K_t)$;
    • Update the threshold $q_{t+1}$ using the PI control law;
    • If $s_t > q_t$, perform gradient steps on $\mathcal{L}_t$ and recompute $s_t$ until $s_t \leq q_t$;
    • Advance to the next time step.

Stopping update iterations as soon as conformity is restored avoids over-minimization, further supporting generalization; a compact sketch of the full loop follows.
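The sketch below reuses the hypothetical helpers from the earlier sketches (`consistency_loss`, `conformity_score`, `update_threshold`). The warm-up step is simplified to a fixed initial threshold `q0` rather than the $(1-\alpha)$-quantile initialization, and the inner-iteration cap is an added safety device, not part of the paper's algorithm:

```python
import collections

def coloke_online(phi, K, stream, w=20, q0=1.0, lr=1e-3, max_inner=100):
    """Runs COLoKe-style selective updates over an iterable of state tensors."""
    opt = torch.optim.Adam(list(phi.parameters()) + [K], lr=lr)
    buffer = collections.deque(maxlen=w + 1)
    q, integral = q0, 0.0
    for x in stream:
        buffer.append(x)
        if len(buffer) < w + 1:
            continue                                    # warm-up: fill the buffer
        batch = torch.stack(list(buffer))               # (w+1, d)
        s = float(conformity_score(phi, K, batch))
        q_t = q                                         # trigger tests against q_t
        q, integral = update_threshold(q, s, integral)  # compute q_{t+1} via PI law
        inner = 0
        while s > q_t and inner < max_inner:            # triggered refinement
            opt.zero_grad()
            consistency_loss(phi, K, batch).backward()
            opt.step()
            s = float(conformity_score(phi, K, batch))
            inner += 1
```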

5. Empirical Evaluation and Performance

COLoKe was benchmarked on classical synthetic dynamical systems—single-attractor systems (with analytic spectra), the Duffing oscillator (mixed spirals and saddle points), the Van der Pol oscillator (limit cycle), and the Lorenz system (chaotic attractor)—as well as real-world time series (electricity transformer, EEG, atmospheric turbulence) (Gao et al., 16 Nov 2025).

The primary evaluation metrics used are:

  • Generalization error: one-step MSE on held-out trajectories;
  • Online prediction error: average MSE as streaming data is processed;
  • Update frequency: the fraction of steps where $s_t > q_t$.

Across these benchmarks, COLoKe consistently achieves:

  • The lowest prediction errors, with orders-of-magnitude improvement on the Lorenz system;
  • Approximately 50% update frequency, representing significantly fewer updates than fixed-step baselines (which update 100% of steps);
  • Accurate recovery of Koopman eigenvalues and eigenfunctions;
  • Reduced wall-clock training time compared to offline deep Koopman approaches.

A plausible implication is that conformal-style selective updating provides a favorable tradeoff between prediction accuracy and computational effort in online dynamical system identification.

6. Hyperparameter Choices and Sensitivity

The main hyperparameters and their effects are summarized below:

| Hyperparameter | Typical Range | Effect |
|---|---|---|
| Window size $w$ | 10–50 | Controls the horizon of the multistep loss; larger $w$ enforces greater temporal consistency but increases per-step computation |
| Target nonconformity rate $\alpha$ | $[0, 1]$ | High $\alpha$: higher accuracy, more updates; low $\alpha$: fewer updates, possible degradation in fit |
| PI step size $\gamma$ | $\sim 0.1$ | Adjusts the agility of threshold adaptation |
| Integral-term saturation $r_t$ | bounded by $C_{\text{sat}}$ | Ensures stability of the PI controller |
| Learning rate $\eta$ | e.g., $10^{-3}$ | Sets the step size in triggered gradient updates |
| Embedding dimension $m$ | $d + \lceil d/2 \rceil$ | Higher $m$ improves expressivity at the cost of parameter count |

These settings enable COLoKe to adapt to both fast-changing and stable regimes, maintain computational efficiency, and control overfitting via principled, data-dependent update rules.
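For convenience, these knobs can be gathered into a single configuration object. The defaults below are illustrative placeholders loosely following the table above, not tuned values from the paper:

```python
import math
from dataclasses import dataclass
from typing import Optional

@dataclass
class COLoKeConfig:
    d: int                        # state dimension
    w: int = 20                   # buffer window size
    alpha: float = 0.1            # target nonconformity rate
    gamma: float = 0.1            # PI proportional step size
    lr: float = 1e-3              # learning rate for triggered updates
    m: Optional[int] = None       # embedding dimension

    def __post_init__(self):
        if self.m is None:        # default m = d + ceil(d/2), per the table
            self.m = self.d + math.ceil(self.d / 2)
```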

7. Principles and Broader Relevance

COLoKe exemplifies an approach in which the learning of deep operator-theoretic models is embedded within a conformal-style supervisory control loop. This separates the question of how to learn (via deep neural parameterization and multistep temporal consistency) from when to update (decided by online conformity scores measured against a statistically motivated, dynamically calibrated threshold). The method yields provably efficient, online adaptive learning with dynamic-regret guarantees and avoids the overfitting pitfalls endemic to naive online retraining. Demonstrated empirical success on synthetic and real benchmarks highlights its practicality for streaming and nonstationary system identification (Gao et al., 16 Nov 2025).

References (1)

  1. Gao et al., "Conformal Online Learning of Koopman Embeddings," 16 Nov 2025.
