U-Mixer: An Unet-Mixer Architecture with Stationarity Correction for Time Series Forecasting (2401.02236v1)
Abstract: Time series forecasting is a crucial task in various domains. Driven by factors such as trends, seasonality, and irregular fluctuations, time series often exhibit non-stationarity, which obstructs stable feature propagation through deep layers, disrupts feature distributions, and complicates learning of changes in the data distribution. As a result, many existing models struggle to capture the underlying patterns, leading to degraded forecasting performance. In this study, we tackle the challenge of non-stationarity in time series forecasting with our proposed framework, U-Mixer. By combining Unet and Mixer, U-Mixer captures local temporal dependencies between patches and channels separately to avoid the influence of distribution variations among channels, and merges low- and high-level features to obtain comprehensive data representations. The key contribution is a novel stationarity correction method that explicitly restores the data distribution: it constrains the difference in stationarity between the data before and after model processing to recover the non-stationarity information, while ensuring that temporal dependencies are preserved. Through extensive experiments on various real-world time series datasets, U-Mixer demonstrates its effectiveness and robustness, achieving 14.5% and 7.7% improvements over state-of-the-art (SOTA) methods.
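The abstract's exact formulation of the stationarity correction is not spelled out here, but the general idea it describes (stationarize the input for the model, then restore the non-stationary statistics to the output so the forecast matches the original distribution) can be illustrated with a minimal sketch. The RevIN-style normalize/denormalize pair below is an assumption for illustration, and `backbone` is a hypothetical stand-in for the Unet-Mixer network:

```python
# Minimal sketch (assumed, not the paper's exact method): remove the
# input window's statistics before the backbone runs, then restore them
# to the forecast so the non-stationarity information is recovered.
import torch


def forecast_with_stationarity_correction(x: torch.Tensor, backbone) -> torch.Tensor:
    """x: input window of shape (batch, length, channels);
    backbone: any model mapping (batch, L, C) -> (batch, horizon, C)."""
    # Per-instance, per-channel statistics of the input window.
    mean = x.mean(dim=1, keepdim=True)                # (batch, 1, channels)
    std = x.std(dim=1, keepdim=True).clamp_min(1e-5)  # avoid division by zero

    # Stationarize: the backbone sees zero-mean, unit-variance inputs,
    # sidestepping the distribution variations mentioned in the abstract.
    x_stat = (x - mean) / std

    y_stat = backbone(x_stat)

    # Correction step: put the removed non-stationary statistics back,
    # leaving the temporal dependencies learned by the backbone intact.
    return y_stat * std + mean
```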
- Xiang Ma
- Xuemei Li
- Lexin Fang
- Tianlong Zhao
- Caiming Zhang