Take an Irregular Route: Enhance the Decoder of Time-Series Forecasting Transformer (2312.05792v1)

Published 10 Dec 2023 in cs.LG and cs.AI

Abstract: With the development of Internet of Things (IoT) systems, precise long-term forecasting methods are essential for decision makers to evaluate current conditions and formulate future policies. Transformers and MLPs are currently the two main paradigms for deep time-series forecasting, with the former more prevalent owing to its attention mechanism and encoder-decoder architecture. However, researchers have focused mainly on the encoder while leaving the decoder largely unexamined; some even replace the decoder with linear projections to reduce complexity. We argue that extracting features from the input sequence and relating the input sequence to the prediction sequence, the respective roles of the encoder and decoder, are both of paramount importance. Motivated by the success of FPN in computer vision, we propose FPPformer, which uses a bottom-up architecture in the encoder and a top-down architecture in the decoder to build a full and rational hierarchy. State-of-the-art patch-wise attention is exploited and further developed by combining it with a revamped element-wise attention, in a form that differs between the encoder and the decoder. Extensive experiments against six state-of-the-art baselines on twelve benchmarks verify the promising performance of FPPformer and the importance of carefully designing the decoder of a time-series forecasting Transformer. The source code is available at https://github.com/OrigamiSL/FPPformer.
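
The abstract's core architectural idea, a bottom-up encoder that merges patches stage by stage paired with a top-down decoder that splits them back while cross-attending to the matching encoder stage, can be illustrated with a short PyTorch sketch. Everything below (module names, stage counts, and the use of plain multi-head attention over patch tokens) is an illustrative assumption rather than the released implementation; the actual code, including the revamped element-wise attention, is in the repository linked above.

```python
import torch
import torch.nn as nn


class PatchAttentionBlock(nn.Module):
    """Self-attention over patch tokens followed by a feed-forward layer."""

    def __init__(self, d_model: int, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):  # x: (batch, n_patches, d_model)
        attn_out, _ = self.attn(x, x, x)
        x = self.norm1(x + attn_out)
        return self.norm2(x + self.ff(x))


class BottomUpEncoder(nn.Module):
    """Each stage attends over patches, then halves their number by
    merging adjacent pairs (the 'bottom-up' direction)."""

    def __init__(self, d_model: int, n_stages: int = 3):
        super().__init__()
        self.blocks = nn.ModuleList(
            [PatchAttentionBlock(d_model) for _ in range(n_stages)])
        self.merges = nn.ModuleList(
            [nn.Linear(2 * d_model, d_model) for _ in range(n_stages)])

    def forward(self, x):
        feats = []  # per-stage features, fine to coarse
        for block, merge in zip(self.blocks, self.merges):
            x = block(x)
            feats.append(x)
            b, n, d = x.shape  # n must be even at every stage
            x = merge(x.reshape(b, n // 2, 2 * d))
        return feats


class TopDownDecoder(nn.Module):
    """Each stage attends over prediction patches, cross-attends to the
    matching encoder stage (relating input and prediction sequences),
    then doubles the patch count (the 'top-down' direction)."""

    def __init__(self, d_model: int, n_stages: int = 3, n_heads: int = 4):
        super().__init__()
        self.self_blocks = nn.ModuleList(
            [PatchAttentionBlock(d_model) for _ in range(n_stages)])
        self.cross_attns = nn.ModuleList(
            [nn.MultiheadAttention(d_model, n_heads, batch_first=True)
             for _ in range(n_stages)])
        self.splits = nn.ModuleList(
            [nn.Linear(d_model, 2 * d_model) for _ in range(n_stages)])

    def forward(self, q, enc_feats):
        # Consume encoder features coarse-to-fine while refining the queries.
        for block, cross, split, mem in zip(
                self.self_blocks, self.cross_attns, self.splits,
                reversed(enc_feats)):
            q = block(q)
            cross_out, _ = cross(q, mem, mem)
            q = q + cross_out
            b, n, d = q.shape
            q = split(q).reshape(b, 2 * n, d)  # split each patch in two
        return q


# Toy shapes: batch of 8, 32 embedded input patches, 3 stages each way.
d_model = 64
x = torch.randn(8, 32, d_model)              # embedded input patches
enc_feats = BottomUpEncoder(d_model)(x)      # stages with 32, 16, 8 patches
q0 = torch.randn(8, 4, d_model)              # coarse prediction queries
y = TopDownDecoder(d_model)(q0, enc_feats)   # refined: 4 -> 8 -> 16 -> 32
print(y.shape)  # torch.Size([8, 32, 64])
```

A final linear head mapping the refined patch tokens back to time-series values would complete the forecaster; the paper's actual attention variants, patch embedding, and hierarchy depths differ from this sketch.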
