Weighted Extended DMD (wtEDMD) Overview
- wtEDMD is a data-driven method that replaces uniform ergodic averages with smooth weighted profiles, enhancing the accuracy and convergence of Koopman operator approximations.
- It integrates specialized clustering and local weighting techniques to accurately estimate drift and diffusion in nonlinear stochastic and deterministic systems.
- The algorithm exhibits faster convergence rates for periodic and quasiperiodic dynamics while maintaining robust performance in chaotic or noisy settings.
Weighted Extended Dynamic Mode Decomposition (wtEDMD) is a data-driven algorithm for operator approximation in dynamical systems that leverages weighted ergodic averages to suppress edge effects and accelerate convergence relative to standard Extended Dynamic Mode Decomposition (EDMD). In wtEDMD, smooth, vanishing-endpoint weight profiles replace uniform averaging, and specialized clustering and local weighting techniques enable robust performance even in nonlinear stochastic systems. The approach encompasses both finite-time Koopman operator approximations and Koopman generator estimation for stochastic differential equations, providing spectral analysis and system identification from limited or noisy data (Bou-Sakr-El-Tayar et al., 21 Nov 2025, Tahara et al., 26 Mar 2024).
1. Mathematical Foundations of Weighted Averages
wtEDMD modifies the classical ergodic Birkhoff average by introducing a smooth, non-uniform weight function. Given a measure-preserving ergodic system $(X, T, \mu)$ and observable $g$, the standard average is
$$B_N g(x) = \frac{1}{N} \sum_{n=0}^{N-1} g(T^n x).$$
wtEDMD replaces this with the weighted average
$$B_N^{w} g(x) = \frac{1}{S_N} \sum_{n=0}^{N-1} w\!\left(\frac{n}{N}\right) g(T^n x), \qquad S_N = \sum_{n=0}^{N-1} w\!\left(\frac{n}{N}\right),$$
where $w \in C^\infty([0,1])$ satisfies $w(0) = w(1) = 0$ and $w(t) > 0$ for all $t \in (0,1)$ (Bou-Sakr-El-Tayar et al., 21 Nov 2025). Common choices include bump functions such as $w(t) = \exp\!\big(-\tfrac{1}{t(1-t)}\big)$.
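As a minimal sketch of the weighted average above (assuming NumPy; the function names are illustrative, not from the paper):

```python
import numpy as np

def bump_weight(t):
    """Smooth bump w(t) = exp(-1/(t(1-t))) on (0,1), vanishing at the endpoints."""
    w = np.zeros_like(t, dtype=float)
    interior = (t > 0) & (t < 1)
    w[interior] = np.exp(-1.0 / (t[interior] * (1.0 - t[interior])))
    return w

def weighted_birkhoff_average(g_values):
    """Weighted average (1/S_N) * sum_n w(n/N) g(T^n x) over sampled observables."""
    N = len(g_values)
    w = bump_weight(np.arange(N) / N)
    return np.sum(w * g_values) / np.sum(w)
```

On a quasiperiodic orbit, e.g. the golden-mean circle rotation with observable $\cos(2\pi x)$, this weighted average converges to the true mean (zero) far faster than the uniform mean does.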
For stochastic systems, wtEDMD leverages locally weighted expectation operators. Conditional averages approximating the drift and diffusion, $b(x) \approx \tfrac{1}{h}\,\mathbb{E}[X_{t+h} - X_t \mid X_t = x]$ and $a(x) \approx \tfrac{1}{h}\,\mathbb{E}[(X_{t+h} - X_t)(X_{t+h} - X_t)^\top \mid X_t = x]$ respectively, are computed via kernels defined by the local sample covariance (Tahara et al., 26 Mar 2024).
2. Algorithmic Structures and Recipes
wtEDMD for Koopman Operator
The algorithm proceeds as follows (Bou-Sakr-El-Tayar et al., 21 Nov 2025):
- Snapshot Collection: Gather sequential snapshots $x_0, x_1, \dots, x_N$ with $x_{n+1} = T(x_n)$.
- Dictionary Construction: Define an observable dictionary $\{\psi_1, \dots, \psi_M\}$ and evaluate it along the trajectory to form $\Psi_X = [\psi_j(x_n)]$ and $\Psi_Y = [\psi_j(x_{n+1})]$, $n = 0, \dots, N-1$.
- Weight Matrix: Build $W = \operatorname{diag}\big(w(n/N)\big)_{n=0}^{N-1}$.
- Weighted Least Squares: Solve
$$\min_{K} \big\| W^{1/2} \left( \Psi_Y - \Psi_X K \right) \big\|_F^2,$$
yielding $K = (\Psi_X^* W \Psi_X)^\dagger \Psi_X^* W \Psi_Y$, where $\Psi_X$ and $\Psi_Y$ are the matrices of observables evaluated at current and successor snapshots.
- Spectral Analysis: $K$ serves as the projected Koopman operator; its eigenvalues and eigenvectors approximate Koopman spectral quantities.
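The steps above can be sketched as follows (a hedged illustration assuming NumPy; function and variable names are chosen for readability, not taken from the paper):

```python
import numpy as np

def wtedmd(snapshots, dictionary, weight=None):
    """Weighted EDMD: K = (Psi_X^* W Psi_X)^+ Psi_X^* W Psi_Y.

    snapshots  : array (N+1, d) of sequential states
    dictionary : callable mapping (n, d) states -> (n, M) observable matrix
    weight     : callable w(t) on [0, 1]; defaults to the exponential bump
    """
    if weight is None:
        def weight(t):
            w = np.zeros_like(t, dtype=float)
            inside = (t > 0) & (t < 1)
            w[inside] = np.exp(-1.0 / (t[inside] * (1.0 - t[inside])))
            return w
    X, Y = snapshots[:-1], snapshots[1:]
    Psi_X, Psi_Y = dictionary(X), dictionary(Y)
    N = Psi_X.shape[0]
    w = weight(np.arange(N) / N)
    W = np.diag(w / w.sum())                 # normalized weight matrix
    G = Psi_X.conj().T @ W @ Psi_X           # weighted Gram matrix
    A = Psi_X.conj().T @ W @ Psi_Y
    K = np.linalg.pinv(G) @ A                # projected Koopman operator
    eigvals, eigvecs = np.linalg.eig(K)
    return K, eigvals, eigvecs
```

For a linear system with a linear dictionary, `K` recovers the system matrix, which gives a quick sanity check on any implementation.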
wtEDMD for Koopman Generator (Stochastic SDEs)
For Itô SDEs, wtEDMD first computes locally weighted conditional moments (Tahara et al., 26 Mar 2024):
- Estimate drift and diffusion at representative points via kernels with cluster-specific bandwidths.
- Construct derivative-observable pairs $\big(\psi_k(c_j), (\mathcal{L}\psi_k)(c_j)\big)$ at centroids $c_j$, with $(\mathcal{L}\psi)(x) = b(x) \cdot \nabla \psi(x) + \tfrac{1}{2}\, a(x) : \nabla^2 \psi(x)$ assembled from the estimated drift and diffusion.
- Solve a weighted (optionally sparse) least-squares or lasso regression problem for the generator approximation matrix $L$.
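A simplified one-dimensional sketch of these two stages, using isotropic Gaussian kernels as a stand-in for the cluster-specific covariance kernels (NumPy assumed; all names are illustrative):

```python
import numpy as np

def local_moments(traj, dt, centers, bandwidth):
    """Kernel-weighted drift and diffusion estimates at representative points."""
    X, dX = traj[:-1], np.diff(traj)
    drift, diff = [], []
    for c in centers:
        k = np.exp(-0.5 * ((X - c) / bandwidth) ** 2)
        k /= k.sum()
        drift.append(np.sum(k * dX) / dt)          # ~ E[dX | X = c] / dt
        diff.append(np.sum(k * dX ** 2) / dt)      # ~ E[dX^2 | X = c] / dt
    return np.array(drift), np.array(diff)

def generator_matrix(centers, drift, diff, dictionary, d1, d2):
    """Least-squares fit of L with (L psi)(c) = b(c) psi'(c) + a(c)/2 psi''(c)."""
    Psi = dictionary(centers)                       # (J, M) observables at centroids
    LPsi = drift[:, None] * d1(centers) + 0.5 * diff[:, None] * d2(centers)
    return np.linalg.lstsq(Psi, LPsi, rcond=None)[0]
```

With exact drift $b(x) = -x$ and diffusion $a(x) = 2$ (an Ornstein-Uhlenbeck process) and a monomial dictionary $\{1, x, x^2\}$, the fitted matrix reproduces the known generator action $\mathcal{L}x = -x$, $\mathcal{L}x^2 = -2x^2 + 2$.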
Pipeline Summary for Clustered wtEDMD
| Step | Method | Purpose |
|---|---|---|
| Outlier removal | IsolationForest | Discard sparse/noisy points |
| Rep. selection | k-means clustering | Reduce to centroids |
| Local structure | Dirichlet Process Gaussian Mixture (DPMM) | Capture anisotropy for local weighting |
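A NumPy-only sketch of this pipeline with simple stand-ins (nearest-neighbor distance filtering in place of IsolationForest, plain k-means with deterministic farthest-point initialization in place of sklearn-style clustering, and per-cluster sample covariances playing the role of DPMM component covariances):

```python
import numpy as np

def remove_outliers(X, q=0.95):
    """Drop the sparsest points by nearest-neighbor distance (IsolationForest stand-in)."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    nn = np.sort(D, axis=1)[:, 1]              # distance to nearest other point
    return X[nn <= np.quantile(nn, q)]

def farthest_point_kmeans(X, k, iters=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min(((X[:, None, :] - np.array(centers)[None]) ** 2).sum(-1), axis=1)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

def local_covariances(X, labels, k, reg=1e-6):
    """Per-cluster sample covariances, usable as anisotropic kernel bandwidth matrices."""
    d = X.shape[1]
    covs = []
    for j in range(k):
        pts = X[labels == j]
        C = np.cov(pts.T) if len(pts) > d else np.zeros((d, d))
        covs.append(np.atleast_2d(C) + reg * np.eye(d))
    return covs
```

In practice the clustered wtEDMD pipeline would use the library implementations named in the table; the stand-ins here only illustrate the data flow from raw samples to centroids and local bandwidth matrices.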
3. Convergence Properties and Error Rates
Weighted Birkhoff averages enable faster convergence than uniform averaging, as edge effects from finite data are suppressed (Bou-Sakr-El-Tayar et al., 21 Nov 2025). Proven rates:
- Periodic orbits: exponential decay in $N$.
- Quasiperiodic orbits, smooth $g$: super-polynomial error, $O(N^{-p})$ for every $p > 0$.
- Analytic $g$, quasiperiodic orbits: exponential decay.
- Chaotic/stochastic: empirical decay matches the classical $O(N^{-1/2})$ rates; wtEDMD does not degrade convergence.
Replacing the averages in EDMD with weighted averages retains asymptotic operator limits but accelerates practical convergence, especially for regular (periodic, quasiperiodic) dynamics (Bou-Sakr-El-Tayar et al., 21 Nov 2025). In stochastic settings, localized expectation operators filter high-frequency noise, yielding generator matrices more accurate than unweighted or naive finite-difference approaches (Tahara et al., 26 Mar 2024).
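The rate gap can be checked empirically on an irrational circle rotation; the following sketch (NumPy assumed, names illustrative) compares the uniform and bump-weighted average errors for the observable $\cos(2\pi x)$, whose true average is zero:

```python
import numpy as np

def birkhoff_errors(alpha, Ns):
    """For each N, return (uniform error, weighted error) of the average of
    cos(2*pi*x) along the rotation x -> x + alpha mod 1."""
    errs = []
    for N in Ns:
        x = (alpha * np.arange(N)) % 1.0
        g = np.cos(2 * np.pi * x)
        t = np.arange(N) / N
        w = np.zeros(N)
        inside = (t > 0) & (t < 1)
        w[inside] = np.exp(-1.0 / (t[inside] * (1.0 - t[inside])))
        errs.append((abs(g.mean()), abs(np.sum(w * g) / np.sum(w))))
    return errs
```

For the golden-mean rotation the uniform error decays roughly like $1/N$, while the weighted error drops to near machine precision at moderate $N$, matching the super-polynomial rate quoted above.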
4. Practical Implementation and Numerical Examples
Key considerations for stable implementation (Bou-Sakr-El-Tayar et al., 21 Nov 2025, Tahara et al., 26 Mar 2024):
- Weight functions: Prefer bump functions or modified signal-processing windows with vanishing endpoint derivatives.
- Sample sizes: The number of snapshots $N$ should substantially exceed the dictionary size $M$ for stability.
- Conditioning: Monitor pseudocovariances; apply Tikhonov regularization or SVD-based pseudoinverse if ill-conditioned.
- Window tuning: Aggressive tapering suppresses edge artifacts but reduces effective sample size near boundaries.
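The conditioning advice above can be implemented as a drop-in solve step; this sketch (NumPy assumed, names illustrative) supports both Tikhonov regularization and a truncated-SVD pseudoinverse:

```python
import numpy as np

def weighted_solve(Psi_X, Psi_Y, w, tik=0.0, svd_tol=None):
    """Weighted least-squares step with optional Tikhonov regularization or
    SVD truncation for ill-conditioned Gram matrices."""
    W = np.diag(w / np.sum(w))
    G = Psi_X.conj().T @ W @ Psi_X           # weighted pseudocovariance (Gram)
    A = Psi_X.conj().T @ W @ Psi_Y
    if tik > 0:
        return np.linalg.solve(G + tik * np.eye(G.shape[0]), A)
    # truncated-SVD pseudoinverse of G
    U, s, Vh = np.linalg.svd(G)
    tol = svd_tol if svd_tol is not None else s.max() * max(G.shape) * np.finfo(s.dtype).eps
    s_inv = np.where(s > tol, 1.0 / s, 0.0)
    return (Vh.conj().T * s_inv) @ U.conj().T @ A
```

Monitoring the singular values of `G` before solving gives an early warning of ill-conditioning; raising `tik` or `svd_tol` trades a small bias for numerical stability.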
Representative Results
| Example | Standard EDMD Error | wtEDMD Error / Convergence | Dynamics |
|---|---|---|---|
| Standard map (Fourier dict.) | $O(N^{-1})$ | Super-polynomial | Quasiperiodic |
| Standard map (chaotic regime) | $O(N^{-1/2})$ | $O(N^{-1/2})$ | Chaotic |
| Lid-driven cavity autocomp. | Polynomial | Exponential | Periodic |
| El Niño forecast (RMSE) | Reference RMSE | Uniformly improved; moderate mode count optimal | Stochastic |
In the El Niño diffusion forecast, weighted forecasting improved RMSE and correlation for leads up to 16 months, but excessive mode counts with limited sample sizes can degrade performance (Bou-Sakr-El-Tayar et al., 21 Nov 2025).
5. Limitations and Best Practices
- Stochastic/chaotic systems: wtEDMD’s speedup is limited to the statistical $O(N^{-1/2})$ rate, since ergodic averages' convergence is fundamentally statistical (Bou-Sakr-El-Tayar et al., 21 Nov 2025).
- Noise/smoothness requirements: Strong tapers may amplify endpoint noise—apply milder windows or pre-filter data in such cases.
- Dictionary size: Avoid overcomplete bases unless the sample size substantially exceeds the dictionary size ($N \gg M$).
- Weight tuning: Balance endpoint suppression against effective sample size; compromise with partial Tukey windows and monitor the normalization constant $S_N = \sum_{n=0}^{N-1} w(n/N)$.
- Numerical stability: SVD-based or regularized pseudoinverses are essential.
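A sketch of the partial-Tukey compromise together with a simple effective-sample-size diagnostic (NumPy assumed; the Kish effective sample size used here is one common proxy, not a quantity from the papers):

```python
import numpy as np

def tukey_weights(N, alpha=0.5):
    """Tukey (tapered-cosine) window: flat middle, cosine tapers at both ends.
    alpha is the total fraction of the window inside the tapers."""
    t = np.arange(N) / (N - 1)
    w = np.ones(N)
    edge = alpha / 2
    left, right = t < edge, t > 1 - edge
    w[left] = 0.5 * (1 + np.cos(np.pi * (2 * t[left] / alpha - 1)))
    w[right] = 0.5 * (1 + np.cos(np.pi * (2 * t[right] / alpha - 2 / alpha + 1)))
    return w

def effective_samples(w):
    """Kish effective sample size, (sum w)^2 / sum w^2: how much data the taper retains."""
    return np.sum(w) ** 2 / np.sum(w ** 2)
```

Increasing `alpha` suppresses endpoint artifacts more aggressively but shrinks the effective sample size, so this diagnostic makes the trade-off from the list above explicit.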
For stochastic systems, cluster-specific kernels (DPMM bandwidth matrices) outperform isotropic weighting, and two-stage clustering ensures representative diversity and local statistical consistency (Tahara et al., 26 Mar 2024).
6. Connections and Extensions
wtEDMD generalizes the weighted ergodic averaging principle to a range of operator-theoretic and identification algorithms, including wtDMD, wtSINDy, weighted spectral measure estimation, and diffusion forecasting (Bou-Sakr-El-Tayar et al., 21 Nov 2025). For generator estimation in SDEs, locally weighted expectations with clustering improve drift/diffusion reconstruction, outperforming classical regression and naive kernel methods (Tahara et al., 26 Mar 2024). Future work may address rigorously quantifying error propagation for chaotic systems and exploring adaptive weight profiles for systems with highly nonuniform sampling or strong state dependence.
The methodology is extensible to other data-driven frameworks relying on time-averaged statistics and is compatible with regularization and dimensionality reduction techniques. A plausible implication is that weighting and clustering can be systematically combined for improved system identification in the presence of non-uniform noise, anisotropic uncertainty, or limited samples.