Changes by Butterflies: Farsighted Forecasting with Group Reservoir Transformer
Abstract: In a chaotic system, a minor divergence between two initial conditions is amplified exponentially over time and leads to vastly different outcomes, a phenomenon known as the butterfly effect. The distant future is therefore highly uncertain and hard to forecast. We introduce the Group Reservoir Transformer to predict long-term events more accurately and robustly by overcoming two challenges of chaos: (1) the extensive historical sequences and (2) the sensitivity to initial conditions. A reservoir is attached to a Transformer to efficiently handle arbitrarily long histories, and this is extended to a group of reservoirs to reduce the sensitivity to initialization variations. Our architecture consistently outperforms state-of-the-art models for multivariate time series, including TimeLLM, GPT2TS, PatchTST, DLinear, TimesNet, and the baseline Transformer, with an error reduction of up to 59% on datasets from various fields such as ETTh, ETTm, and air quality, demonstrating that an ensemble of butterfly learners can improve the accuracy and certainty of event prediction, regardless of how far the forecast travels into the unknown future.
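To make the idea concrete, below is a minimal PyTorch sketch of the architecture as the abstract describes it: a group of fixed, randomly initialized echo-state reservoirs compresses an arbitrarily long history into fixed-size state sequences, a shared Transformer encoder reads those states, and the forecasts are averaged over the group to damp sensitivity to initialization. All class names, hyperparameters (state size, spectral radius, leak rate, group size, horizon), and the leaky-integrator update rule are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class Reservoir(nn.Module):
    """A fixed random echo-state reservoir (weights are never trained):
    compresses an arbitrarily long history into a sequence of fixed-size
    states via a leaky-integrator update. Update rule is an assumption."""
    def __init__(self, input_dim, state_dim, spectral_radius=0.9, leak=0.5):
        super().__init__()
        w_in = 0.1 * torch.randn(input_dim, state_dim)
        w = torch.randn(state_dim, state_dim)
        # Rescale so the largest eigenvalue magnitude equals spectral_radius,
        # the standard heuristic for the echo state property.
        w = w * (spectral_radius / torch.linalg.eigvals(w).abs().max())
        self.register_buffer("w_in", w_in)    # buffers: fixed, not optimized
        self.register_buffer("w", w)
        self.leak = leak

    @torch.no_grad()                          # reservoir stays untrained
    def forward(self, x):                     # x: (batch, time, input_dim)
        h = x.new_zeros(x.size(0), self.w.size(0))
        states = []
        for t in range(x.size(1)):
            h = (1 - self.leak) * h + self.leak * torch.tanh(
                x[:, t] @ self.w_in + h @ self.w)
            states.append(h)
        return torch.stack(states, dim=1)     # (batch, time, state_dim)


class GroupReservoirTransformer(nn.Module):
    """Hypothetical sketch: several independently seeded reservoirs feed one
    shared Transformer encoder; forecasts are averaged over the group."""
    def __init__(self, input_dim, state_dim=128, n_reservoirs=4,
                 horizon=96, out_dim=7):
        super().__init__()
        self.group = nn.ModuleList(
            Reservoir(input_dim, state_dim) for _ in range(n_reservoirs))
        layer = nn.TransformerEncoderLayer(
            d_model=state_dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(state_dim, horizon * out_dim)
        self.horizon, self.out_dim = horizon, out_dim

    def forward(self, x):                     # x: (batch, time, input_dim)
        preds = []
        for reservoir in self.group:
            z = self.encoder(reservoir(x))    # attend over reservoir states
            y = self.head(z[:, -1])           # last state -> flat forecast
            preds.append(y.view(-1, self.horizon, self.out_dim))
        return torch.stack(preds).mean(dim=0) # ensemble mean over the group


model = GroupReservoirTransformer(input_dim=7)   # e.g. 7 ETT channels
forecast = model(torch.randn(8, 512, 7))         # -> (8, 96, 7)
```

Two properties of this sketch track the abstract's two challenges: because the reservoir weights are frozen, each extra history step costs only a fixed-size state update, so arbitrarily long inputs never grow the trainable model; and averaging over independently seeded reservoirs is a plain ensemble over initializations, which reduces the variance that chaotic sensitivity would otherwise amplify.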