ERDMD: Entropic Regression for Delay Selection
- The paper’s main contribution is integrating entropic regression with higher-order DMD to autonomously select sparse delays, reducing overfitting.
- It employs conditional mutual information in iterative build and prune steps to identify the most informative lag structures for dynamic reconstruction.
- Empirical results on systems like Lorenz-63 and Rössler demonstrate enhanced multiscale feature extraction and high-fidelity model reconstruction.
ERDMD applies entropic regression to delay selection: it is a methodology for constructing dynamic mode decomposition (DMD) models that uses conditional mutual information to select a sparse, nonuniform set of time delays. By integrating entropic regression into the higher-order DMD (HODMD) paradigm, ERDMD discovers highly informative lag structures, supporting both model interpretability and efficient multiscale dynamical reconstruction. The technique alternates greedy build steps and pruning steps based on information flow, yielding high-fidelity models with little overfitting, especially for complex systems exhibiting multiscale behaviors (Curtis et al., 2024).
1. Foundations: DMD and Higher-Order DMD
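Standard DMD seeks the best-fit linear operator $A$ mapping each state snapshot $x_k$ to its successor $x_{k+1}$ in a time-ordered sequence $\{x_k\}_{k=1}^{m}$, sampled uniformly in time. With the data matrices $X = [x_1, \dots, x_{m-1}]$ and $X' = [x_2, \dots, x_m]$, the optimal $A$ minimizes the Frobenius norm:

$$A = \arg\min_{\tilde{A}} \lVert X' - \tilde{A} X \rVert_F = X' X^{\dagger},$$

where $X^{\dagger}$ is the Moore–Penrose pseudoinverse. DMD modes and spectral information are given by the eigen-decomposition $A = \Phi \Lambda \Phi^{-1}$.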
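As a minimal sketch of the least-squares fit and eigen-decomposition described above, the following NumPy snippet fits a DMD operator to toy data. The function name `dmd` and the rotation example are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def dmd(snapshots):
    """Fit x_{k+1} ~ A x_k; columns of `snapshots` are states in time order."""
    X, X_next = snapshots[:, :-1], snapshots[:, 1:]
    A = X_next @ np.linalg.pinv(X)        # Frobenius-norm least-squares solution
    eigvals, modes = np.linalg.eig(A)     # spectrum and DMD modes of A
    return A, eigvals, modes

# Toy data (hypothetical): a planar rotation by 0.1 rad per step.
theta = 0.1
A_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
states = [np.array([1.0, 0.0])]
for _ in range(50):
    states.append(A_true @ states[-1])
snapshots = np.stack(states, axis=1)      # shape (2, 51)

A, eigvals, modes = dmd(snapshots)
print(np.abs(eigvals))                    # moduli of the recovered eigenvalues
```

For a pure rotation the recovered eigenvalues should lie on the unit circle, which gives a quick sanity check of the fit.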
HODMD generalizes this framework by permitting prediction from a linear combination of multiple previous states (time lags $\ell_1 < \ell_2 < \dots < \ell_q$):

$$x_{k+1} \approx \sum_{j=1}^{q} A_{\ell_j}\, x_{k+1-\ell_j},$$

where each $A_{\ell_j}$ is found by solving a block least-squares problem over lag-selected matrices. This allows for richer, lagged models but typically requires selection among many possible lags up to some maximum $d$.
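The block least-squares problem for a given lag set can be sketched as follows. The helper `fit_lagged`, its indexing convention (lag 1 means the immediately preceding snapshot), and the scalar toy sequence are assumptions made for illustration only.

```python
import numpy as np

def fit_lagged(snapshots, lags):
    """Least-squares fit of x_t ~ sum_l A_l x_{t-l} over the given lags."""
    n, m = snapshots.shape
    lmax = max(lags)
    Y = snapshots[:, lmax:]                                        # targets x_t, t = lmax..m-1
    Z = np.vstack([snapshots[:, lmax - l:m - l] for l in lags])    # stacked x_{t-l}
    K = Y @ np.linalg.pinv(Z)                                      # K = [A_l1, ..., A_lq]
    return {l: K[:, i * n:(i + 1) * n] for i, l in enumerate(lags)}

# Toy data (hypothetical): scalar sequence x_t = 0.5 x_{t-1} + 0.3 x_{t-3}.
x = [1.0, -0.5, 0.25]
for t in range(3, 40):
    x.append(0.5 * x[t - 1] + 0.3 * x[t - 3])
snapshots = np.array(x)[None, :]                                   # shape (1, 40)

blocks = fit_lagged(snapshots, lags=[1, 3])
print(blocks[1], blocks[3])                                        # fitted lag-1 and lag-3 blocks
```

Because the toy sequence is generated exactly by lags 1 and 3, the fitted blocks should reproduce the generating coefficients; a full HODMD sweep would instead pass every lag up to the maximum.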
2. Information-Theoretic Variable Selection via Entropic Regression
Entropic regression (ER) introduces an information-theoretic approach for model variable selection. For discrete random variables $X$ and $Y$ with entropy $H(X) = -\sum_x p(x)\log p(x)$ and joint entropy $H(X,Y)$, the mutual information $I(X;Y) = H(X) + H(Y) - H(X,Y)$ quantifies shared information, and the conditional mutual information $I(X;Y \mid Z) = H(X \mid Z) - H(X \mid Y, Z)$ measures information flow from $Y$ to $X$ given knowledge of $Z$.
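These quantities can be computed directly from empirical frequencies when the variables are discrete. The sketch below uses simple plug-in estimates; the function names and the XOR example are illustrative assumptions, and practical ER implementations for continuous data typically rely on other estimators.

```python
from collections import Counter
from math import log2

def entropy(*columns):
    """Joint Shannon entropy (bits) of one or more equally long discrete sequences."""
    joint = list(zip(*columns))
    n = len(joint)
    return -sum(c / n * log2(c / n) for c in Counter(joint).values())

def mutual_information(x, y):
    # I(X;Y) = H(X) + H(Y) - H(X,Y)
    return entropy(x) + entropy(y) - entropy(x, y)

def conditional_mutual_information(x, y, z):
    # I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)

# XOR example: Y = X xor Z shares nothing with X alone, but one full bit given Z.
x = [0, 0, 1, 1]
z = [0, 1, 0, 1]
y = [a ^ b for a, b in zip(x, z)]

mi = mutual_information(x, y)
cmi = conditional_mutual_information(x, y, z)
print(mi, cmi)   # 0.0 1.0
```

The XOR case shows why conditioning matters for variable selection: a predictor can look uninformative on its own yet be fully informative once the existing model variables are known.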
In ER, the BUILD step identifies candidate predictors that offer the largest information gain:

$$c^{*} = \arg\max_{c}\; I(Y; c \mid S),$$

where $Y$ is the target (e.g., future snapshots $x_{k+1}$), $S$ is the current model, and $c$ is a candidate addition. A predictor is retained only if $I(Y; c^{*} \mid S)$ is significantly positive, as validated via a shuffle test. The PRUNE step iteratively removes predictors $s \in S$ whose loss in mutual information is negligible:

$$I(Y; s \mid S \setminus \{s\}) \approx 0.$$
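One way to decide whether an information gain is "significantly positive" is a permutation (shuffle) test. The sketch below is an assumption-laden illustration: it uses a joint-Gaussian estimator of conditional mutual information and a 99% quantile of the shuffled values as the threshold, neither of which is confirmed as the paper's choice. All names (`gaussian_cmi`, `shuffle_threshold`) are hypothetical.

```python
import numpy as np

def gaussian_cmi(x, y, z):
    """I(X;Y|Z) in nats under a joint-Gaussian assumption; inputs are (samples, features)."""
    def logdet(*parts):
        cov = np.atleast_2d(np.cov(np.hstack(parts), rowvar=False))
        return np.linalg.slogdet(cov)[1]
    return 0.5 * (logdet(x, z) + logdet(y, z) - logdet(z) - logdet(x, y, z))

def shuffle_threshold(x, y, z, n_shuffles=200, quantile=0.99, seed=0):
    """Null threshold: CMI values after randomly permuting the candidate y."""
    rng = np.random.default_rng(seed)
    null = [gaussian_cmi(x, y[rng.permutation(len(y))], z) for _ in range(n_shuffles)]
    return np.quantile(null, quantile)

# Toy data (hypothetical): x depends on z and y, but not on w.
rng = np.random.default_rng(1)
N = 500
z = rng.standard_normal((N, 1))
y = rng.standard_normal((N, 1))
w = rng.standard_normal((N, 1))
x = z + y + 0.1 * rng.standard_normal((N, 1))

gain_y = gaussian_cmi(x, y, z)            # informative candidate
gain_w = gaussian_cmi(x, w, z)            # uninformative candidate
thr_y = shuffle_threshold(x, y, z)
print(gain_y > thr_y, gain_y, gain_w)
```

Shuffling destroys the temporal pairing between the candidate and the target while preserving the candidate's marginal distribution, so the shuffled values approximate what the estimator reports when no real dependence exists.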
3. ERDMD Algorithm: Delay Discovery and Model Construction
ERDMD operationalizes ER for optimal delay selection in lagged DMD models. The methodology proceeds as follows:
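- Initialization: Start with the delay set $\mathcal{L} = \{1\}$ (the minimal, first-delay model) and candidate pool $\mathcal{C} = \{2, \dots, d\}$, where $d$ is the maximum lag considered.
- Build Step: For each candidate $c \in \mathcal{C}$:
  - Add $c$ to the existing delay set, forming $\mathcal{L}_c = \mathcal{L} \cup \{c\}$.
  - Solve the least-squares problem for the blocks $A_\ell$, $\ell \in \mathcal{L}_c$: $\min_{\{A_\ell\}} \sum_k \lVert x_{k+1} - \sum_{\ell \in \mathcal{L}_c} A_\ell\, x_{k+1-\ell} \rVert_2^2$, where $x_k$ are the state snapshots.
  - Compute the information gain $\Delta I_c = I(Y; c \mid \mathcal{L})$. The candidate with maximal $\Delta I_c$ is added if its value passes the threshold $\tau_{\mathrm{build}}$.
- Prune Step: For each delay in the current set (except the initial delay), remove it if the loss in information is less than $\tau_{\mathrm{prune}}$.

This iterative build/prune procedure yields a sparse, nonuniform set of lags $\mathcal{L} \subseteq \{1, \dots, d\}$, resulting in compact, highly informative models.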
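The build/prune loop can be sketched end to end under simplifying assumptions. This is not the authors' implementation: the information gain is replaced by a Gaussian surrogate (half the drop in the log-determinant of the least-squares residual covariance), and a fixed tolerance `tol` stands in for the shuffle-test thresholds. All function names and the toy autoregressive signal are hypothetical.

```python
import numpy as np

def residual_logdet(snapshots, lags, d, eps=1e-12):
    """log det of the residual covariance of the lagged least-squares fit."""
    n, m = snapshots.shape
    Y = snapshots[:, d:]                                      # common targets x_t, t >= d
    Z = np.vstack([snapshots[:, d - l:m - l] for l in sorted(lags)])
    R = Y - (Y @ np.linalg.pinv(Z)) @ Z                       # least-squares residuals
    cov = R @ R.T / R.shape[1] + eps * np.eye(n)              # small jitter avoids log(0)
    return np.linalg.slogdet(cov)[1]

def info_gain(snapshots, base, extra, d):
    """Gaussian surrogate (nats) for the information lag `extra` adds to lags `base`."""
    return 0.5 * (residual_logdet(snapshots, base, d)
                  - residual_logdet(snapshots, base | {extra}, d))

def erdmd_delays(snapshots, d, tol=0.05):
    selected, pool = {1}, set(range(2, d + 1))
    while pool:                                               # BUILD: greedy additions
        gains = {c: info_gain(snapshots, selected, c, d) for c in pool}
        best = max(gains, key=gains.get)
        if gains[best] <= tol:
            break
        selected.add(best)
        pool.remove(best)
    pruned = True
    while pruned:                                             # PRUNE: drop redundant lags
        pruned = False
        for l in sorted(selected - {1}):                      # the initial delay is kept
            if info_gain(snapshots, selected - {l}, l, d) < tol:
                selected.remove(l)
                pruned = True
                break
    return sorted(selected)

# Toy data (hypothetical): noisy scalar signal driven by lags 1 and 5 only.
rng = np.random.default_rng(0)
m = 2000
noise = rng.standard_normal(m)
x = np.zeros(m)
for t in range(5, m):
    x[t] = 0.5 * x[t - 1] - 0.4 * x[t - 5] + noise[t]

delays = erdmd_delays(x[None, :], d=8)
print(delays)
```

All candidate fits share the same target window (snapshots from index `d` onward) so that residual covariances are compared on identical samples. On this toy signal the loop is expected to keep only the two generating lags.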
4. Computational Aspects and Sparsity Enforcement
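At each selection or pruning operation, the block matrix $K = [A_{\ell_1}, \dots, A_{\ell_q}]$ is computed by standard least-squares regression:

$$K = \arg\min_{\tilde{K}} \lVert Y - \tilde{K} Z \rVert_F^2 = Y Z^{\dagger},$$

where the columns of $Y$ are the target snapshots and $Z$ stacks the correspondingly lagged snapshots for the selected delays. Sparsity is strictly maintained by restricting the model to the currently selected delays, and no additional blocks $A_\ell$ are introduced outside this set. Regularization can be employed by adding an $L_2$ (ridge) penalty:

$$K_\lambda = \arg\min_{\tilde{K}} \lVert Y - \tilde{K} Z \rVert_F^2 + \lambda \lVert \tilde{K} \rVert_F^2 = Y Z^{\top} (Z Z^{\top} + \lambda I)^{-1},$$

or by thresholding blocks in $K$ below a set Frobenius norm.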
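Both regularization options can be illustrated with a short NumPy sketch. The closed-form ridge solution is standard; the function names, the value of `lam`, and the cutoff `min_norm` are illustrative assumptions rather than settings from the paper.

```python
import numpy as np

def ridge_blocks(Y, Z, lam=1e-3):
    """Solve min_K ||Y - K Z||_F^2 + lam ||K||_F^2 and split K into n x n lag blocks."""
    n = Y.shape[0]
    G = Z @ Z.T + lam * np.eye(Z.shape[0])
    K = np.linalg.solve(G, Z @ Y.T).T            # K = Y Z^T (Z Z^T + lam I)^{-1}
    return [K[:, i * n:(i + 1) * n] for i in range(Z.shape[0] // n)]

def threshold_blocks(blocks, min_norm):
    """Zero out any lag block whose Frobenius norm falls below `min_norm`."""
    return [B if np.linalg.norm(B, 'fro') >= min_norm else np.zeros_like(B)
            for B in blocks]

# Toy data (hypothetical): two lag blocks, the second nearly negligible.
rng = np.random.default_rng(0)
A1 = np.array([[0.9, -0.2], [0.1, 0.8]])
A2 = 1e-4 * np.ones((2, 2))
Z = rng.standard_normal((4, 200))                # stacked lagged snapshots
Y = np.hstack([A1, A2]) @ Z                      # targets generated by [A1, A2]

blocks = ridge_blocks(Y, Z, lam=1e-8)
kept = threshold_blocks(blocks, min_norm=1e-2)
print([float(np.linalg.norm(B)) for B in kept])
```

Solving the normal equations with `np.linalg.solve` avoids forming an explicit inverse; larger values of `lam` shrink all blocks toward zero, whereas thresholding removes weak blocks outright.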
5. Empirical Evaluation and Benchmark Scenarios
ERDMD has been benchmarked on a range of canonical nonlinear dynamical systems, highlighting its compactness and multiscale feature extraction:
| System | ERDMD Selected Delays | Full HODMD Delays | Notes |
|---|---|---|---|
| Lorenz-63 (dt=0.01, d=150) | 2 delays | | Slight increase in error, less overfitting |
| Rössler (dt=0.01, d=1000) | | | Captures fast/slow scales; spectral analysis reveals timescale separation |
| Kuramoto–Sivashinsky (d=200) | 4 delays | | Sufficient for high-fidelity reconstruction of the 12D POD trajectory |
Model quality is assessed by a reconstruction error between the reference and model trajectories. In all cases, this error decays rapidly as the number of selected delays increases, indicating that only a small subset of delays is necessary for high-accuracy reconstruction.
6. Interpretive Context and Implications
ERDMD synthesizes DMD's linear modeling capacity with an information-theoretic approach to lag selection. By prioritizing delays with maximal conditional information gain, the methodology efficiently identifies and encodes the most significant temporal dependencies without resorting to uniform lag sweeps. This suggests improved interpretability and robustness, as evidenced by reduced overfitting outside the training window in chaotic benchmarks. Furthermore, the observation that nonuniform (potentially widely separated) delays are frequently selected indicates sensitivity to multiscale and intermittent dependencies in complex dynamical systems (Curtis et al., 2024). A plausible implication is that ERDMD not only streamlines delay-structure discovery but may facilitate spectral analyses aligned with distinct dynamical timescales, as observed in the ring structures of eigenvalues.
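For readers who want to inspect the spectrum of a sparse-lag model, one standard construction is the block companion matrix, whose eigenvalues govern the lagged linear dynamics. The sketch below is an illustration under assumed names (`companion`, the scalar two-lag example); it is not claimed to be how the paper computes its spectra.

```python
import numpy as np

def companion(blocks, d):
    """Block companion matrix for x_{k+1} = sum_l A_l x_{k+1-l}.

    blocks: dict mapping lag -> (n x n) matrix; d: maximum lag.
    Acts on the stacked state [x_k, x_{k-1}, ..., x_{k-d+1}].
    """
    n = next(iter(blocks.values())).shape[0]
    C = np.zeros((n * d, n * d))
    for lag, A in blocks.items():
        C[:n, (lag - 1) * n:lag * n] = A         # top block row holds the lag matrices
    C[n:, :-n] = np.eye(n * (d - 1))             # shift older states down one slot
    return C

# Toy model (hypothetical): scalar dynamics with sparse lags 1 and 5.
blocks = {1: np.array([[0.5]]), 5: np.array([[-0.4]])}
C = companion(blocks, d=5)
eigs = np.linalg.eigvals(C)
print(np.sort(np.abs(eigs)))                     # moduli of the model's eigenvalues
```

In the scalar case the eigenvalues are the roots of the characteristic polynomial $z^5 - 0.5 z^4 + 0.4$, so a sparse, widely separated lag set still produces a full set of $d$ eigenvalues.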
7. Conclusion
ERDMD combines multistep DMD with entropic regression to autonomously discover a compact, informative set of nonuniform time delays for lagged dynamical modeling. Each variable selection step is grounded in Frobenius-norm least-squares augmented by conditional mutual information criteria. The resulting models achieve robust, interpretable, and spectrally enriched delay-DMD representations, validated across diverse nonlinear systems with significant reduction in model complexity and maintained or improved reconstruction fidelity (Curtis et al., 2024).