UmambaTSF: A U-shaped Multi-Scale Long-Term Time Series Forecasting Method Using Mamba (2410.11278v1)

Published 15 Oct 2024 in cs.LG

Abstract: Multivariate time series forecasting is crucial in domains such as transportation, meteorology, and finance, especially for predicting extreme weather events. State-of-the-art methods predominantly rely on Transformer architectures, which utilize attention mechanisms to capture temporal dependencies. However, these methods are hindered by quadratic time complexity, limiting the model's scalability with respect to input sequence length. This significantly restricts their practicality in the real world. Mamba, based on state space models (SSM), provides a solution with linear time complexity, increasing the potential for efficient forecasting of sequential data. In this study, we propose UmambaTSF, a novel long-term time series forecasting framework that integrates multi-scale feature extraction capabilities of U-shaped encoder-decoder multilayer perceptrons (MLP) with Mamba's long sequence representation. To improve performance and efficiency, the Mamba blocks introduced in the framework adopt a refined residual structure and adaptable design, enabling the capture of unique temporal signals and flexible channel processing. In the experiments, UmambaTSF achieves state-of-the-art performance and excellent generality on widely used benchmark datasets while maintaining linear time complexity and low memory consumption.
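
To make the architecture described in the abstract more concrete, below is a minimal, hypothetical PyTorch sketch of a UmambaTSF-style forecaster: a U-shaped encoder-decoder MLP that downsamples and upsamples the time axis with skip connections, with a residual gated-convolution block standing in for the paper's Mamba block (a true selective state space layer, e.g. from the mamba_ssm package, could be substituted). All class names, layer sizes, and the stand-in block itself are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F


class MambaLikeBlock(nn.Module):
    # Stand-in for a Mamba block: pre-norm, causal depthwise conv, SiLU gating,
    # and a residual connection. A real implementation would use a selective SSM.
    def __init__(self, d_model: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.in_proj = nn.Linear(d_model, 2 * d_model)
        self.conv = nn.Conv1d(d_model, d_model, kernel_size=4, padding=3, groups=d_model)
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x):                                    # x: (batch, length, d_model)
        residual = x
        h, gate = self.in_proj(self.norm(x)).chunk(2, dim=-1)
        h = self.conv(h.transpose(1, 2))[..., : x.size(1)].transpose(1, 2)  # causal conv
        h = F.silu(h) * F.silu(gate)                          # gated activation
        return residual + self.out_proj(h)                    # residual structure


class UShapedForecaster(nn.Module):
    # U-shaped MLP over the time axis: the encoder halves the sequence length at
    # each level, the decoder restores it and adds skip connections (multi-scale).
    def __init__(self, seq_len: int, pred_len: int, d_model: int = 64, levels: int = 2):
        super().__init__()
        lens = [seq_len // (2 ** i) for i in range(levels + 1)]
        self.embed = nn.Linear(1, d_model)
        self.down = nn.ModuleList(nn.Linear(lens[i], lens[i + 1]) for i in range(levels))
        self.up = nn.ModuleList(nn.Linear(lens[i + 1], lens[i]) for i in range(levels))
        self.blocks = nn.ModuleList(MambaLikeBlock(d_model) for _ in range(levels + 1))
        self.head = nn.Linear(seq_len * d_model, pred_len)

    def forward(self, x):                                     # x: (batch, seq_len, n_vars)
        b, t, n = x.shape
        h = self.embed(x.permute(0, 2, 1).reshape(b * n, t, 1))  # channel-independent
        skips = []
        for i, down in enumerate(self.down):                  # encoder: coarser scales
            h = self.blocks[i](h)
            skips.append(h)
            h = down(h.transpose(1, 2)).transpose(1, 2)
        h = self.blocks[-1](h)                                # bottleneck
        for i, up in reversed(list(enumerate(self.up))):      # decoder with skips
            h = up(h.transpose(1, 2)).transpose(1, 2) + skips[i]
        y = self.head(h.reshape(b * n, -1))                   # (batch * n_vars, pred_len)
        return y.reshape(b, n, -1).permute(0, 2, 1)           # (batch, pred_len, n_vars)


if __name__ == "__main__":
    model = UShapedForecaster(seq_len=96, pred_len=24)
    print(model(torch.randn(8, 96, 7)).shape)                 # torch.Size([8, 24, 7])

The only structural claims taken from the abstract are the U-shaped multi-scale MLP encoder-decoder, the residual Mamba-style blocks, and channel-wise processing; everything else (layer sizes, the gated-convolution stand-in, the flattened prediction head) is filled in purely for illustration, and the input length is assumed to be divisible by 2**levels.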

