ERDMD: Entropic Regression for Delay Selection

Updated 15 February 2026

The paper’s main contribution is integrating entropic regression with higher-order DMD to autonomously select sparse delays, reducing overfitting.
It employs conditional mutual information in iterative build and prune steps to identify the most informative lag structures for dynamic reconstruction.
Empirical results on systems like Lorenz-63 and Rössler demonstrate enhanced multiscale feature extraction and high-fidelity model reconstruction.

Entropic Regression for Delay Selection (ERDMD) is a methodology for constructing dynamic mode decomposition (DMD) models that leverages conditional mutual information to select a sparse and nonuniform set of time delays. By integrating entropic regression into the higher-order DMD (HODMD) paradigm, ERDMD discovers highly informative lag structures, facilitating both model interpretability and efficient multiscale dynamical reconstruction. The technique is characterized by iterative greedy and pruning steps based on information flow, resulting in robust, high-fidelity models with minimal overfitting, especially for complex systems exhibiting multiscale behaviors (Curtis et al., 2024).

1. Foundations: DMD and Higher-Order DMD

Standard DMD seeks the best-fit linear operator $A \in \mathbb{R}^{m \times m}$ that advances a time-ordered sequence of snapshots $\{x(t_1), \ldots, x(t_N)\} \subset \mathbb{R}^m$, sampled uniformly in time, by one step. With $X = [x(t_1), \ldots, x(t_{N-1})]$ and $X' = [x(t_2), \ldots, x(t_N)]$, the optimal $A$ minimizes the Frobenius norm $\|X' - A X\|_F$:

$X' \approx A X, \quad A = X' X^\dagger,$

where $X^\dagger$ is the Moore–Penrose pseudoinverse. DMD modes and spectral information are given by the eigen-decomposition $A\phi = \lambda\phi$ .
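As a minimal sketch of the formula above, the following NumPy snippet builds $X$ and $X'$ from a snapshot array, forms $A = X' X^\dagger$, and extracts the eigenpairs. The function name `dmd` and the toy damped-rotation data are illustrative assumptions, not part of the source.

```python
import numpy as np

def dmd(snapshots):
    """Fit x(t_{j+1}) ~ A x(t_j) from an (m, N) array of snapshots."""
    X, Xp = snapshots[:, :-1], snapshots[:, 1:]   # X and X' (shifted by one step)
    A = Xp @ np.linalg.pinv(X)                    # A = X' X^+ (Moore-Penrose)
    eigvals, modes = np.linalg.eig(A)             # A phi = lambda phi
    return A, eigvals, modes

# Toy data generated by a known linear map (a damped rotation).
theta = 0.1
A_true = 0.99 * np.array([[np.cos(theta), -np.sin(theta)],
                          [np.sin(theta),  np.cos(theta)]])
x = np.zeros((2, 50))
x[:, 0] = [1.0, 0.0]
for j in range(49):
    x[:, j + 1] = A_true @ x[:, j]

A_fit, eigvals, modes = dmd(x)
```

For noise-free data produced by a linear map, the fitted operator should coincide with the generating one, and the eigenvalue moduli give the per-step decay rate.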

HODMD generalizes this framework by permitting prediction from a linear combination of multiple previous states (time-lags $\ell_k$ ):

$x(t_{j+1}) \approx \sum_{k=1}^r K_{\ell_k} x(t_{j+1-\ell_k}),$

where each $K_{\ell_k}$ is found by solving a block least-squares problem over lag-selected matrices. This allows for richer, lagged models but typically requires selection among many possible lags up to some maximum $d$ .
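The block least-squares problem can be sketched as follows, assuming snapshots are stored column-wise in an `(m, N)` array. The helper names `lagged_matrices` and `fit_lagged_model` are hypothetical; the stacking convention (one block row per lag, in increasing lag order) is one reasonable choice rather than the source's stated layout.

```python
import numpy as np

def lagged_matrices(snapshots, lags):
    """Build targets Y_plus and stacked delayed snapshots Y_minus.

    snapshots: (m, N) array; lags: positive integers ell_k.
    Each column of Y_minus stacks x(t_{j+1-ell_k}) for every lag ell_k.
    """
    lags = sorted(lags)
    d = max(lags)
    N = snapshots.shape[1]
    Y_plus = snapshots[:, d:]                                   # x(t_{j+1})
    Y_minus = np.vstack([snapshots[:, d - l:N - l] for l in lags])
    return Y_plus, Y_minus

def fit_lagged_model(snapshots, lags):
    """Least-squares fit of x(t_{j+1}) ~ sum_k K_{ell_k} x(t_{j+1-ell_k})."""
    Y_plus, Y_minus = lagged_matrices(snapshots, lags)
    # Solve min_K ||Y_plus - K Y_minus||_F by least squares on the transpose.
    K = np.linalg.lstsq(Y_minus.T, Y_plus.T, rcond=None)[0].T
    m = snapshots.shape[0]
    blocks = {l: K[:, i * m:(i + 1) * m] for i, l in enumerate(sorted(lags))}
    return K, blocks

# A sampled cosine obeys x_{n+1} = 2 cos(omega) x_n - x_{n-1}, a two-lag model.
omega = 0.3
series = np.cos(omega * np.arange(100))[None, :]    # shape (1, 100)
K, blocks = fit_lagged_model(series, lags=[1, 2])
```

A single-lag model cannot reproduce this scalar oscillation exactly, whereas lags 1 and 2 together do, which is the kind of gain that lagged models are meant to capture.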

2. Information-Theoretic Variable Selection via Entropic Regression

Entropic regression (ER) introduces an information-theoretic approach for model variable selection. For discrete random variables $U,V$ with entropy $H(U)$ and joint entropy $H(U,V)$ , the mutual information $I(U;V)$ quantifies shared information, and the conditional mutual information $I(Y;X|Z)$ measures information flow from $X$ to $Y$ given knowledge of $Z$ .
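To make these quantities concrete, here is a small plug-in (frequency-count) estimator for discrete sequences, using the identities $I(U;V) = H(U) + H(V) - H(U,V)$ and $I(Y;X|Z) = H(Y,Z) + H(X,Z) - H(X,Y,Z) - H(Z)$. The function names are illustrative, and the source does not say which estimator the paper uses; practical ER implementations on continuous data typically need a different estimator.

```python
import numpy as np
from collections import Counter

def entropy(*columns):
    """Plug-in joint Shannon entropy (in bits) of discrete sequences."""
    joint = list(zip(*columns))
    counts = np.array(list(Counter(joint).values()), dtype=float)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def mutual_information(u, v):
    """I(U;V) = H(U) + H(V) - H(U,V)."""
    return entropy(u) + entropy(v) - entropy(u, v)

def conditional_mutual_information(y, x, z):
    """I(Y;X|Z) = H(Y,Z) + H(X,Z) - H(X,Y,Z) - H(Z)."""
    return entropy(y, z) + entropy(x, z) - entropy(x, y, z) - entropy(z)

# XOR example: X alone says nothing about Y, but X is decisive once Z is known.
x = [0, 0, 1, 1]
z = [0, 1, 0, 1]
y = [a ^ b for a, b in zip(x, z)]

mi = mutual_information(y, x)                     # 0 bits
cmi = conditional_mutual_information(y, x, z)     # 1 bit
```

The XOR case illustrates why conditioning matters for variable selection: a predictor can look uninformative in isolation yet carry a full bit of information given the variables already in the model.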

In ER, the BUILD step identifies candidate predictors that offer the largest information gain:

$\Delta I_j = I(Y_s; M_t^{(j)} | M_c) = H(Y_s|M_c) - H(Y_s|M_c, M_t^{(j)}),$

where $Y_s$ is the target (e.g., future snapshots $Y_+$ ), $M_c$ is the current model, and $M_t^{(j)}$ is a candidate addition. A predictor is retained only if $\Delta I_j$ is significantly positive, as validated via a shuffle test. The PRUNE step then iteratively removes any predictor whose removal costs only a negligible amount of information, measured by conditioning on the rest of the model:

$\delta I_j = I(Y_s; M_c^{(j)} \mid M_c \setminus \{j\}),$

where $M_c^{(j)}$ denotes the $j$-th predictor in the current model.
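A shuffle test of this kind can be sketched as below. Everything specific here is an assumption made for illustration: the conditional mutual information is estimated with a joint-Gaussian (partial correlation) formula, and the number of shuffles, the 95% quantile, and the names `gaussian_cmi` and `shuffle_test` are not taken from the source.

```python
import numpy as np

def gaussian_cmi(y, x, z):
    """I(Y;X|Z) in nats under a joint-Gaussian assumption (1-D y and x)."""
    Z = np.column_stack([np.ones(len(y)), z])
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]   # residual of y given z
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]   # residual of x given z
    rho = np.corrcoef(ry, rx)[0, 1]                     # partial correlation
    return -0.5 * np.log(1.0 - rho**2)

def shuffle_test(y, x, z, n_shuffles=200, quantile=0.95, seed=0):
    """Accept x only if its CMI exceeds the chosen quantile of shuffled-x CMIs."""
    rng = np.random.default_rng(seed)
    observed = gaussian_cmi(y, x, z)
    # Permuting x destroys its link to y while keeping its marginal distribution.
    null = [gaussian_cmi(y, rng.permutation(x), z) for _ in range(n_shuffles)]
    threshold = np.quantile(null, quantile)
    return observed > threshold, observed, threshold

rng = np.random.default_rng(1)
z = rng.normal(size=500)
x_useful = rng.normal(size=500)
x_noise = rng.normal(size=500)
y = z + 0.8 * x_useful + 0.1 * rng.normal(size=500)

keep_useful, gain_useful, _ = shuffle_test(y, x_useful, z)
keep_noise, gain_noise, _ = shuffle_test(y, x_noise, z)
```

The shuffled values estimate how large the information gain can look by chance, so the threshold adapts to sample size and estimator bias instead of being fixed by hand.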

3. ERDMD Algorithm: Delay Discovery and Model Construction

ERDMD operationalizes ER for optimal delay selection in lagged DMD models. The methodology proceeds as follows:

Initialization: Start with $\ell_c = \{1\}$ (the minimal, first-delay model) and candidate pool $\ell_r = \{2, 3, \ldots, d\}$ .
Build Step: For each candidate $\ell \in \ell_r$ :
1. Add $\ell$ to the existing delay set, forming $\ell_t = \ell_c \cup \{\ell\}$ .
2. Solve the least-squares problem for $K(\ell_t)$ :
$K(\ell_t) = \arg\min_K \| Y_+ - K Y_-(\ell_t) \|_F$
3. Compute the information gain $\Delta I(\ell)$ .

Here $Y_+$ holds the future snapshots and $Y_-(\ell_t)$ stacks the snapshots delayed by each lag in $\ell_t$ . The candidate with maximal $\Delta I$ is added to $\ell_c$ if its value passes the threshold $\gamma$ .

Prune Step: For each delay $\ell$ in the current set (except the initial delay), remove $\ell$ if the loss in information $\delta I(\ell)$ is less than $\gamma$ .

This iterative build/prune procedure yields a sparse, nonuniform set of lags $\ell_c = \{\ell_1, ..., \ell_r\}$ , resulting in compact, highly informative models.
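The build/prune loop can be sketched in a few dozen lines. This is an illustration under stated assumptions, not the paper's implementation: the information gain is replaced by a Gaussian proxy (half the drop in the log-determinant of the least-squares residual covariance), $\gamma$ is a fixed number rather than a shuffle-test threshold, and all function names are hypothetical.

```python
import numpy as np

def lagged(snapshots, lags, d):
    """Targets and stacked delayed snapshots on the common window j >= d."""
    N = snapshots.shape[1]
    Y_plus = snapshots[:, d:]
    Y_minus = np.vstack([snapshots[:, d - l:N - l] for l in sorted(lags)])
    return Y_plus, Y_minus

def residual_logdet(snapshots, lags, d):
    """log det of the residual covariance of the least-squares fit on `lags`."""
    Y_plus, Y_minus = lagged(snapshots, lags, d)
    K = np.linalg.lstsq(Y_minus.T, Y_plus.T, rcond=None)[0].T
    R = Y_plus - K @ Y_minus
    cov = np.atleast_2d(R @ R.T / R.shape[1])
    cov += 1e-12 * np.eye(cov.shape[0])        # guard against exact fits
    return np.linalg.slogdet(cov)[1]

def info_gain(snapshots, base, extra, d):
    """Gaussian proxy (assumption) for I(Y_+; extra lags | base lags), in nats."""
    return 0.5 * (residual_logdet(snapshots, base, d)
                  - residual_logdet(snapshots, base | extra, d))

def erdmd_select(snapshots, d, gamma):
    current, pool = {1}, set(range(2, d + 1))
    # BUILD: greedily add the most informative lag while the gain exceeds gamma.
    while pool:
        gains = {l: info_gain(snapshots, current, {l}, d) for l in pool}
        best = max(gains, key=gains.get)
        if gains[best] <= gamma:
            break
        current.add(best)
        pool.remove(best)
    # PRUNE: drop lags (never lag 1) whose removal loses less than gamma.
    for l in sorted(current - {1}):
        if info_gain(snapshots, current - {l}, {l}, d) < gamma:
            current.remove(l)
    return sorted(current)

# Toy scalar series driven by lags 1 and 5 plus noise.
rng = np.random.default_rng(0)
N = 3000
x = np.zeros(N)
for n in range(5, N):
    x[n] = 0.5 * x[n - 1] - 0.4 * x[n - 5] + rng.normal()

selected = erdmd_select(x[None, :], d=8, gamma=0.02)
```

All candidate models are fitted on the same window of targets (those with at least $d$ past snapshots available), so that gains for different lag sets are compared on identical data.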

4. Computational Aspects and Sparsity Enforcement

At each selection or pruning operation, the block matrix $K(\ell_t)$ is computed by standard least-squares regression:

$K(\ell_t) = Y_+ Y_-(\ell_t)^T [Y_-(\ell_t) Y_-(\ell_t)^T]^{-1}$

Sparsity is strictly maintained by restricting the model to the currently selected delays, and no additional $\ell_j$ are introduced outside this set. Regularization can be employed by adding an $\ell_2$ (ridge) penalty:

$\arg \min_K \|Y_+ - K Y_-\|_F^2 + \alpha \|K\|_F^2,$

or by discarding blocks $K_{\ell_k}$ whose Frobenius norm falls below a set threshold.
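Both regularization options admit short sketches. The ridge problem has the closed form $K = Y_+ Y_-^T [Y_- Y_-^T + \alpha I]^{-1}$, and block thresholding zeroes whole lag blocks with small Frobenius norm. The function names, the synthetic data, and the values of `alpha` and `tol` below are illustrative assumptions.

```python
import numpy as np

def ridge_fit(Y_plus, Y_minus, alpha):
    """Closed-form minimizer of ||Y_plus - K Y_minus||_F^2 + alpha ||K||_F^2."""
    G = Y_minus @ Y_minus.T + alpha * np.eye(Y_minus.shape[0])
    # K = Y_plus Y_minus^T G^{-1}; solve with the symmetric G instead of inverting.
    return np.linalg.solve(G, Y_minus @ Y_plus.T).T

def threshold_blocks(K, m, tol):
    """Zero every m-column block of K whose Frobenius norm is below tol."""
    K = K.copy()
    for start in range(0, K.shape[1], m):
        if np.linalg.norm(K[:, start:start + m]) < tol:
            K[:, start:start + m] = 0.0
    return K

# Synthetic two-lag problem (m = 2) in which the second lag block is truly zero.
rng = np.random.default_rng(0)
K_true = np.hstack([np.array([[0.9, -0.2], [0.1, 0.8]]), np.zeros((2, 2))])
Y_minus = rng.normal(size=(4, 200))
Y_plus = K_true @ Y_minus + 0.01 * rng.normal(size=(2, 200))

K_ols = ridge_fit(Y_plus, Y_minus, alpha=0.0)
K_ridge = ridge_fit(Y_plus, Y_minus, alpha=1.0)
K_sparse = threshold_blocks(K_ridge, m=2, tol=0.05)
```

With `alpha=0.0` the ridge formula reduces to the normal-equation solution given earlier, so the two can share one code path.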

5. Empirical Evaluation and Benchmark Scenarios

ERDMD has been benchmarked on a range of canonical nonlinear dynamical systems, highlighting its compactness and multiscale feature extraction:

System	ERDMD Selected Delays	Full HODMD Delays	Notes
Lorenz-63 (dt=0.01, d=150)	$\{1, 149\}$	$1 \leq \ell \leq 150$	ERDMD uses 2 delays, slight increase in error, less overfitting
Rössler (dt=0.01, d=1000)	$\{1, 3, 170, 436, 553, 665, 988\}$	$1 \leq \ell \leq 1000$	Captures fast/slow scales, spectral analysis reveals timescale separation
Kuramoto–Sivashinsky (d=200)	$\{1, 123, 141, 158\}$	$1 \leq \ell \leq 200$	4 delays suffice for high-fidelity reconstruction of the 12D POD trajectory

The reconstruction error metric is:

$E = \sqrt{ \frac{1}{N-d+1} \sum_{j=d}^{N} \| x(t_j) - \hat{x}(t_j) \|^2 }$

In all cases, error $E$ decays rapidly as $r$ (number of delays) increases, indicating that only a small subset of delays is necessary for high-accuracy reconstruction.
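The error metric $E$ defined above translates directly into code. This sketch assumes both trajectories are stored as `(m, N)` arrays whose columns correspond to $t_1, \ldots, t_N$; the function name is illustrative.

```python
import numpy as np

def reconstruction_error(x_true, x_hat, d):
    """RMS error over snapshots t_d, ..., t_N (1-based), arrays of shape (m, N)."""
    diff = x_true[:, d - 1:] - x_hat[:, d - 1:]     # columns j = d, ..., N
    # diff has N - d + 1 columns, matching the normalization in E.
    return float(np.sqrt(np.sum(diff**2) / diff.shape[1]))

# Each column differs by the vector (1, 1), whose squared norm is 2.
x_true = np.zeros((2, 10))
x_hat = np.ones((2, 10))
E = reconstruction_error(x_true, x_hat, d=3)
```

Starting the sum at $j = d$ excludes the first snapshots, for which a model with maximum lag $d$ has no complete history to predict from.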

6. Interpretive Context and Implications

ERDMD synthesizes DMD's linear modeling capacity with an information-theoretic approach to lag selection. By prioritizing delays with maximal conditional information gain, the methodology efficiently identifies and encodes the most significant temporal dependencies without resorting to uniform lag sweeps. This suggests improved interpretability and robustness, as evidenced by reduced overfitting outside the training window in chaotic benchmarks. Furthermore, the observation that nonuniform (potentially widely separated) delays are frequently selected indicates sensitivity to multiscale and intermittent dependencies in complex dynamical systems (Curtis et al., 2024). A plausible implication is that ERDMD not only streamlines delay-structure discovery but may facilitate spectral analyses aligned with distinct dynamical timescales, as observed in the ring structures of eigenvalues.

7. Conclusion

ERDMD combines multistep DMD with entropic regression to autonomously discover a compact, informative set of nonuniform time delays for lagged dynamical modeling. Each variable selection step is grounded in Frobenius-norm least-squares augmented by conditional mutual information criteria. The resulting models achieve robust, interpretable, and spectrally enriched delay-DMD representations, validated across diverse nonlinear systems with significant reduction in model complexity and maintained or improved reconstruction fidelity (Curtis et al., 2024).


References (1)

Curtis et al. (2024). Entropic Regression DMD (ERDMD) Discovers Informative Sparse and Nonuniformly Time Delayed Models.
