MDMLP-EIA: Energy Invariant MLP for Forecasting
- The paper introduces a decomposition and fusion approach that enforces energy invariance to enhance multivariate forecasting.
- It employs adaptive zero-initialized channel fusion and dynamic capacity adjustment, reducing parameter counts while boosting performance over Transformer models.
- Empirical results across benchmarks like ETTh1 and Solar demonstrate superior accuracy, stability, and noise robustness.
MDMLP-EIA (Multi-domain Dynamic MLPs with Energy Invariant Attention) is a neural architecture for multivariate time series forecasting that addresses critical deficiencies of prior MLP-based models—specifically, the loss of weak seasonal signals, inflexible capacity scaling across feature channels, and insufficient channel fusion. MDMLP-EIA introduces innovations across signal decomposition, feature fusion, attention design, and network scaling to deliver comparable or superior performance to Transformer-based models with reduced parameter count and computational demands (Zhang et al., 13 Nov 2025).
1. Architectural Innovations in MDMLP-EIA
MDMLP-EIA is characterized by three components: (i) an adaptive dual-domain seasonal MLP with zero-initialized channel fusion, (ii) an energy invariant attention (EIA) fusion mechanism, and (iii) a dynamic capacity adjustment (DCA) strategy that scales hidden dimensions with task complexity. These advances enable precise modeling of heterogeneous seasonal patterns, strict control of signal amplification, and adaptive model capacity proportional to channel count.
2. Decomposition and Dual-Domain MLP Module
The starting point of MDMLP-EIA is a decomposition of the input series $X \in \mathbb{R}^{L \times C}$, where $L$ is the sequence length and $C$ is the number of channels. Normalization via Reversible Instance Normalization (RevIN) is applied, followed by exponential moving average (EMA) decomposition, which yields the trend component $X_{\mathrm{tre}}$ and the raw seasonal-plus-noise component $X_{\mathrm{sea}}$.
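A minimal sketch of this stage, assuming a simple recursive EMA with smoothing factor `alpha` (the paper's exact smoothing parameterization is not reproduced here) and a RevIN variant without learnable affine parameters:

```python
import torch

def revin_normalize(x: torch.Tensor):
    """Instance-normalize each channel of x (B, L, C); the statistics are
    returned so predictions can be de-normalized at the end."""
    mean = x.mean(dim=1, keepdim=True)
    std = x.std(dim=1, keepdim=True) + 1e-5
    return (x - mean) / std, mean, std

def ema_decompose(x: torch.Tensor, alpha: float = 0.3):
    """Split a normalized series (B, L, C) into a smooth trend and the
    residual seasonal-plus-noise component via a recursive EMA."""
    trend = torch.empty_like(x)
    trend[:, 0] = x[:, 0]
    for t in range(1, x.size(1)):
        trend[:, t] = alpha * x[:, t] + (1 - alpha) * trend[:, t - 1]
    return trend, x - trend  # (trend, seasonal), both (B, L, C)
```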
Subsequently, a channel-major format is obtained by permutation for channel-independent MLP processing: $X_{\mathrm{tre}}, X_{\mathrm{sea}} \in \mathbb{R}^{L \times C} \rightarrow \mathbb{R}^{C \times L}$. The seasonal branch $X_{\mathrm{sea}}$ is split into two parallel learning paths:
- Strong Seasonal Path (Frequency Domain): $X_{\mathrm{sea}}$ is embedded, then processed via a real FFT; the spectrum passes through a frequency-domain MLP (FreMLP), is reconstructed by inverse FFT, and predicted via a channel-independent MLP.
- Weak Seasonal Path (Time Domain): $X_{\mathrm{sea}}$ is fed directly to a weak seasonal MLP.
The resulting strong ($\hat{Y}_{\mathrm{str}}$) and weak ($\hat{Y}_{\mathrm{weak}}$) seasonal predictions are fused additively using per-channel weights; a sketch of both paths follows.
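The sketch below makes simplifying assumptions: the embedding step is omitted, the FreMLP mixes concatenated real/imaginary spectrum parts, and the hidden width is a placeholder; the released architecture may differ.

```python
import torch
import torch.nn as nn

class DualDomainSeasonal(nn.Module):
    """Strong (frequency-domain) and weak (time-domain) seasonal paths."""
    def __init__(self, seq_len: int, pred_len: int, hidden: int):
        super().__init__()
        freq_len = seq_len // 2 + 1             # rfft output length
        self.fre_mlp = nn.Sequential(
            nn.Linear(2 * freq_len, hidden), nn.GELU(),
            nn.Linear(hidden, 2 * freq_len))
        self.strong_head = nn.Linear(seq_len, pred_len)
        self.weak_mlp = nn.Sequential(
            nn.Linear(seq_len, hidden), nn.GELU(),
            nn.Linear(hidden, pred_len))

    def forward(self, x_sea: torch.Tensor):
        # x_sea: (B, C, L), channel-major seasonal component
        spec = torch.fft.rfft(x_sea, dim=-1)
        z = self.fre_mlp(torch.cat([spec.real, spec.imag], dim=-1))
        real, imag = z.chunk(2, dim=-1)
        rec = torch.fft.irfft(torch.complex(real, imag),
                              n=x_sea.size(-1), dim=-1)
        y_strong = self.strong_head(rec)        # strong seasonal forecast
        y_weak = self.weak_mlp(x_sea)           # weak seasonal forecast
        return y_strong, y_weak
```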
3. Adaptive Zero-Initialized Channel Fusion (AZCF)
AZCF fuses the strong and weak seasonal predictions using a per-channel fusion coefficient $\alpha \in \mathbb{R}^{C}$. The fusion operation is $\hat{Y}_{\mathrm{sea}} = \hat{Y}_{\mathrm{str}} + \alpha \odot \hat{Y}_{\mathrm{weak}}$, with $\alpha$ initialized at zero. This design ensures $\hat{Y}_{\mathrm{sea}} = \hat{Y}_{\mathrm{str}}$ initially, and as training progresses, $\alpha_c$ grows only if the weak path of channel $c$ improves the training objective (see the proposition in Appendix E of (Zhang et al., 13 Nov 2025)). This yields strict error reduction and robustly suppresses noisy or spurious channel contributions thanks to the initial zero weighting.
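A minimal sketch of AZCF under the elementwise form reconstructed above (predictions assumed to have shape (B, C, T)):

```python
import torch
import torch.nn as nn

class AZCF(nn.Module):
    """Adaptive zero-initialized channel fusion: the per-channel weight
    alpha starts at zero, so the fused output initially equals the strong
    seasonal prediction and the weak path is admitted only where useful."""
    def __init__(self, n_channels: int):
        super().__init__()
        self.alpha = nn.Parameter(torch.zeros(n_channels))

    def forward(self, y_strong: torch.Tensor, y_weak: torch.Tensor):
        # y_strong, y_weak: (B, C, T); broadcast alpha over batch and time
        return y_strong + self.alpha.view(1, -1, 1) * y_weak
```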
4. Energy Invariant Attention (EIA) Fusion Mechanism
EIA merges the trend ($\hat{Y}_{\mathrm{tre}}$) and fused seasonal ($\hat{Y}_{\mathrm{sea}}$) predictions into the final prediction $\hat{Y}$ while ensuring the overall energy (signal power) of the prediction matches that of the original normalized sequence. Given a prediction $Z \in \mathbb{R}^{C \times T}$, the total energy is defined as $E(Z) = \sum_{c,t} Z_{c,t}^2$. The attention vector $A \in (0,1)^{C \times T}$ is computed from the concatenated MLP outputs ($\hat{Y}_{\mathrm{tre}}$, $\hat{Y}_{\mathrm{sea}}$) by a stack of two linear layers with GeLU, dropout, and a sigmoid activation.
The fusion is performed via
$$\hat{Y} = 2\left(A \odot \hat{Y}_{\mathrm{tre}} + (1 - A) \odot \hat{Y}_{\mathrm{sea}}\right).$$
This operation ensures convex weighting (the coefficients $A$ and $1 - A$ sum to one before rescaling) and restores the correct energy magnitude even as the channel mixture varies adaptively per prediction step. When $A = \tfrac{1}{2}$, the fusion reduces to the direct summation $\hat{Y} = \hat{Y}_{\mathrm{tre}} + \hat{Y}_{\mathrm{sea}}$; for other values of $A$, the factor of 2 preserves the total weight mass, and hence the energy scale, of direct summation. Theoretical analysis guarantees that EIA is non-inferior to direct summation and may strictly outperform it (Appendix F of (Zhang et al., 13 Nov 2025)).
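A sketch of EIA consistent with the equation above; the hidden width and dropout rate are assumed placeholders:

```python
import torch
import torch.nn as nn

class EIA(nn.Module):
    """Energy invariant attention: a two-layer MLP with GELU, dropout and a
    sigmoid produces convex weights A; the factor 2 keeps the weight mass
    equal to that of direct summation."""
    def __init__(self, pred_len: int, hidden: int = 64, p_drop: float = 0.1):
        super().__init__()
        self.attn = nn.Sequential(
            nn.Linear(2 * pred_len, hidden), nn.GELU(), nn.Dropout(p_drop),
            nn.Linear(hidden, pred_len), nn.Sigmoid())

    def forward(self, y_trend: torch.Tensor, y_sea: torch.Tensor):
        # y_trend, y_sea: (B, C, T)
        a = self.attn(torch.cat([y_trend, y_sea], dim=-1))
        return 2.0 * (a * y_trend + (1.0 - a) * y_sea)  # a = 0.5 -> plain sum
```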
5. Dynamic Capacity Adjustment (DCA) for Channel-Independent MLPs
DCA directly links the hidden neuron count in the channel-independent MLPs to the channel dimension $C$ through a scaling coefficient, assigning the trend, strong seasonal, and weak seasonal MLP branches hidden widths that grow sublinearly with $C$; a hypothetical instantiation is sketched below.
This sublinear scaling ensures adequate capacity on high-dimensional data while maintaining parameter efficiency and minimizing overfitting on small-channel tasks. Ablation studies on datasets such as ETTh1, Solar, Traffic, and Weather confirm improved performance over fixed hidden-size baselines.
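The paper's exact assignment rule is not reproduced here; the following sqrt-based rule only illustrates how a sublinear width schedule with a floor might look (`gamma` and `floor` are invented parameters):

```python
import math

def dca_hidden_size(n_channels: int, gamma: float = 16.0, floor: int = 32) -> int:
    """Hypothetical DCA rule: hidden width grows sublinearly with the channel
    count C. The sqrt form, gamma, and floor are illustrative assumptions,
    not the paper's exact assignment."""
    return max(floor, int(gamma * math.sqrt(n_channels)))

# Few channels (ETTh1, C = 7) get a small width; many (Traffic, C = 862) a larger one.
print(dca_hidden_size(7), dca_hidden_size(862))   # 42 469
```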
6. Training Strategies and Empirical Evaluation
The end-to-end model normalizes the input, decomposes it, permutes it to channel-major form, sizes its hidden layers via DCA, and performs the described sequence of trend, seasonal, and attention-based fusion computations, followed by an inverse RevIN to revert the normalization (see the pipeline sketch below). The training objective is the arctangent loss (as in xPatch), with the AdamW optimizer, a sigmoid learning-rate schedule, and dropout regularization.
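Putting the pieces together, a forward-pass sketch using the illustrative components defined above (not the authors' released code); `trend_mlp` is an assumed channel-independent MLP mapping length $L$ to horizon $T$:

```python
import torch

def forecast(model, x: torch.Tensor) -> torch.Tensor:
    """End-to-end pipeline sketch: x is a raw input batch of shape (B, L, C);
    `model` bundles the sketched modules (trend_mlp, seasonal, azcf, eia)."""
    x_norm, mean, std = revin_normalize(x)      # RevIN, statistics retained
    trend, seasonal = ema_decompose(x_norm)     # (B, L, C) each
    trend = trend.permute(0, 2, 1)              # channel-major (B, C, L)
    seasonal = seasonal.permute(0, 2, 1)
    y_trend = model.trend_mlp(trend)            # channel-independent trend MLP
    y_strong, y_weak = model.seasonal(seasonal) # DualDomainSeasonal paths
    y_sea = model.azcf(y_strong, y_weak)        # zero-initialized fusion
    y_hat = model.eia(y_trend, y_sea)           # energy invariant attention
    y_hat = y_hat.permute(0, 2, 1)              # back to (B, T, C)
    return y_hat * std + mean                   # inverse RevIN
```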
MDMLP-EIA exhibits state-of-the-art performance on nine multivariate time series benchmarks. For example, averaged across forecast lengths with a unified lookback length, MDMLP-EIA reduces MSE by 2.91% and MAE by 1.37% relative to xPatch, and reduces MSE by 4.14% and MAE by 2.29% relative to Amplifier, achieving the best MSE on 5/9 datasets and the best MAE on 7/9. With per-model hyperparameter tuning, the gains increase further. The parameter and memory footprints are lower than those of comparable Transformer models (e.g., on the Exchange dataset, 128K parameters and 336MB memory versus iTransformer's 224K and 346MB; on the Electricity dataset, 1.9M parameters and 0.44GB versus FreTS's 3.2M and 6.7GB).
Noise robustness is demonstrated empirically: the energy preservation in EIA and the zero-initialized weighting in AZCF limit amplification of noisy channels, yielding stable performance under synthetic disturbances.
7. Theoretical Properties and Significance
MDMLP-EIA is supported by mathematical proofs that AZCF strictly reduces error and EIA is non-inferior to additive fusion, with potential for strict accuracy improvements. The design facilitates recovery of both strong and weak seasonality, avoidance of over- or under-parameterization, and channel-adaptive weighting. A plausible implication is that these innovations could generalize to other domains where robust, energy-preserving aggregation and structural scaling in MLPs are critical.
In summary, the MDMLP-EIA framework constitutes a theoretically grounded and computationally efficient solution for multivariate time series forecasting, achieving consistent high performance with reduced computational demands and increased stability across diverse channel scales (Zhang et al., 13 Nov 2025).