Long-term Forecasting with TiDE: Time-series Dense Encoder (2304.08424v5)
Abstract: Recent work has shown that simple linear models can outperform several Transformer-based approaches in long-term time-series forecasting. Motivated by this, we propose a Multi-layer Perceptron (MLP) based encoder-decoder model, Time-series Dense Encoder (TiDE), for long-term time-series forecasting that retains the simplicity and speed of linear models while also handling covariates and non-linear dependencies. Theoretically, we prove that the simplest linear analogue of our model can achieve a near-optimal error rate for linear dynamical systems (LDS) under some assumptions. Empirically, we show that our method can match or outperform prior approaches on popular long-term time-series forecasting benchmarks while being 5-10x faster than the best Transformer-based model.
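As a rough illustration of the idea in the abstract, the following is a minimal NumPy sketch of an MLP-style encoder-decoder forecaster: it flattens the look-back window together with covariates, passes the result through a residual dense block, and decodes the whole horizon in one shot. It is not the TiDE architecture from the paper. The layer sizes, the `residual_block`, `init_block`, and `forecast` helpers, and the random untrained weights are all assumptions made for illustration.

```python
import numpy as np


def residual_block(x, w1, b1, w2, b2, w_skip):
    """Dense -> ReLU -> Dense, plus a linear skip connection (illustrative)."""
    hidden = np.maximum(x @ w1 + b1, 0.0)
    return hidden @ w2 + b2 + x @ w_skip


def init_block(rng, in_dim, hidden_dim, out_dim, scale=0.1):
    """Random, untrained parameters for one residual block (hypothetical helper)."""
    return (
        rng.normal(0.0, scale, (in_dim, hidden_dim)),
        np.zeros(hidden_dim),
        rng.normal(0.0, scale, (hidden_dim, out_dim)),
        np.zeros(out_dim),
        rng.normal(0.0, scale, (in_dim, out_dim)),
    )


def forecast(past, covariates, enc_params, dec_params):
    """Map a look-back window plus covariates to a full-horizon forecast.

    past:       (batch, lookback) past values of the series
    covariates: (batch, lookback + horizon, n_cov) known covariates
    """
    batch = past.shape[0]
    # Flatten covariates and stack them next to the past values.
    features = np.concatenate([past, covariates.reshape(batch, -1)], axis=1)
    encoding = residual_block(features, *enc_params)   # dense encoder
    return residual_block(encoding, *dec_params)       # dense decoder


# Assumed sizes, chosen only for the example.
rng = np.random.default_rng(0)
lookback, horizon, n_cov, hidden = 96, 24, 3, 32
in_dim = lookback + (lookback + horizon) * n_cov

enc_params = init_block(rng, in_dim, hidden, hidden)
dec_params = init_block(rng, hidden, hidden, horizon)

past = rng.normal(size=(4, lookback))
covariates = rng.normal(size=(4, lookback + horizon, n_cov))
prediction = forecast(past, covariates, enc_params, dec_params)
print(prediction.shape)  # (4, 24)
```

Because the decoder emits all horizon steps in a single forward pass, inference cost here is a handful of matrix multiplications and involves no attention over the sequence. This is one plausible reading of why dense models of this kind can be fast, not a reproduction of the paper's measurements.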