Foundation Models for Time Series Analysis: A Tutorial and Survey (2403.14735v3)

Published 21 Mar 2024 in cs.LG

Abstract: Time series analysis stands as a focal point within the data mining community, serving as a cornerstone for extracting valuable insights crucial to a myriad of real-world applications. Recent advances in Foundation Models (FMs) have fundamentally reshaped the paradigm of model design for time series analysis, boosting various downstream tasks in practice. These innovative approaches often leverage pre-trained or fine-tuned FMs to harness generalized knowledge tailored for time series analysis. This survey aims to furnish a comprehensive and up-to-date overview of FMs for time series analysis. While prior surveys have predominantly focused on either application or pipeline aspects of FMs in time series analysis, they have often lacked an in-depth understanding of the underlying mechanisms that elucidate why and how FMs benefit time series analysis. To address this gap, our survey adopts a methodology-centric classification, delineating various pivotal elements of time-series FMs, including model architectures, pre-training techniques, adaptation methods, and data modalities. Overall, this survey serves to consolidate the latest advancements in FMs pertinent to time series analysis, accentuating their theoretical underpinnings, recent strides in development, and avenues for future exploration.
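
To make the survey's adaptation axis concrete, below is a minimal, hypothetical PyTorch sketch (not from the paper) of the common "freeze the pretrained backbone, train lightweight adapters" pattern that several of the surveyed LLM-based forecasters follow. The class name, patching scheme, and all hyperparameters are illustrative assumptions; a real system would load pretrained backbone weights rather than the randomly initialized stand-in used here.

```python
# Illustrative sketch of parameter-efficient FM adaptation for forecasting:
# raw series are split into patches, projected into the backbone's token
# space, passed through a frozen Transformer, and decoded by a small head.
import torch
import torch.nn as nn

class PatchedFMForecaster(nn.Module):
    def __init__(self, patch_len=16, d_model=128, horizon=96, n_layers=4):
        super().__init__()
        self.patch_len = patch_len
        # Trainable adapter: map raw patches into the backbone's embedding space.
        self.input_proj = nn.Linear(patch_len, d_model)
        # Stand-in for a pretrained backbone; in practice, weights would be
        # loaded from a pretrained checkpoint and kept frozen.
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=n_layers)
        for p in self.backbone.parameters():
            p.requires_grad = False  # freeze: only the adapters are trained
        # Trainable head: decode the last token embedding into the forecast.
        self.head = nn.Linear(d_model, horizon)

    def forward(self, x):  # x: (batch, context_len)
        # Non-overlapping patches: (batch, n_patches, patch_len)
        patches = x.unfold(1, self.patch_len, self.patch_len)
        tokens = self.input_proj(patches)   # (batch, n_patches, d_model)
        hidden = self.backbone(tokens)      # frozen foundation-model pass
        return self.head(hidden[:, -1])     # (batch, horizon)

model = PatchedFMForecaster()
y_hat = model(torch.randn(8, 96))  # 8 series with a 96-step context window
print(y_hat.shape)                 # torch.Size([8, 96])
```

Only `input_proj` and `head` receive gradients here, which mirrors the parameter-efficient adaptation methods the survey catalogs: the generalized knowledge sits in the frozen backbone, while a small number of task-specific parameters bridge it to the time-series domain.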

Authors (8)
  1. Yuxuan Liang (126 papers)
  2. Haomin Wen (33 papers)
  3. Yuqi Nie (11 papers)
  4. Yushan Jiang (14 papers)
  5. Ming Jin (130 papers)
  6. Dongjin Song (42 papers)
  7. Shirui Pan (198 papers)
  8. Qingsong Wen (139 papers)
Citations (51)
