Large Language Models Are Zero-Shot Time Series Forecasters (2310.07820v3)

Published 11 Oct 2023 in cs.LG

Abstract: By encoding time series as a string of numerical digits, we can frame time series forecasting as next-token prediction in text. Developing this approach, we find that LLMs such as GPT-3 and LLaMA-2 can surprisingly zero-shot extrapolate time series at a level comparable to or exceeding the performance of purpose-built time series models trained on the downstream tasks. To facilitate this performance, we propose procedures for effectively tokenizing time series data and converting discrete distributions over tokens into highly flexible densities over continuous values. We argue the success of LLMs for time series stems from their ability to naturally represent multimodal distributions, in conjunction with biases for simplicity, and repetition, which align with the salient features in many time series, such as repeated seasonal trends. We also show how LLMs can naturally handle missing data without imputation through non-numerical text, accommodate textual side information, and answer questions to help explain predictions. While we find that increasing model size generally improves performance on time series, we show GPT-4 can perform worse than GPT-3 because of how it tokenizes numbers, and poor uncertainty calibration, which is likely the result of alignment interventions such as RLHF.

Leveraging LLMs for Zero-Shot Time Series Forecasting

Introduction

Time series forecasting presents challenges distinct from those in other machine learning domains such as audio or video processing. The heterogeneity of time series data sources and the need to extrapolate accurately from sparse observations make building robust forecasting models difficult. Traditional methods, while sometimes outperforming more complex deep learning approaches, do not benefit from the rich representations learned through large-scale pretraining. This paper introduces LLMTime, a method that applies LLMs such as GPT-3 and LLaMA-2 to zero-shot time series forecasting by framing the forecasting task as next-token prediction over text. The findings suggest that LLMs can match or exceed the predictive performance of specialized time series models without any fine-tuning on the downstream tasks.

LLMTime Methodology

LLMTime operationalizes time series forecasting through a surprisingly simple yet effective procedure. By encoding time series data as strings of numerical digits and treating forecasting as text generation, LLMTime exploits the aptitude of LLMs for recognizing and extrapolating patterns in sequences. Its key components, sketched in code after the list below, include:

  • Effective Encoding: A tokenization strategy that converts time series into digit strings so that pretrained LLMs can be applied to continuous forecasting problems.
  • Adapting Distributions: Converting the discrete distributions over tokens produced by an LLM into flexible continuous densities, capable of capturing the multimodal distributions common in time series data.
  • Probabilistic Capabilities: Exploiting the probabilistic nature of LLM sampling, which aligns with salient features of time series such as seasonality, and allows missing values to be handled without explicit imputation.
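As a concrete illustration, the sketch below shows the kind of digit-level string encoding the paper describes: values are rescaled, truncated to a fixed precision, and written digit by digit with a separator between timesteps so that BPE tokenizers split them consistently. The function names, default precision, and scaling rule here are illustrative choices, not the authors' exact implementation.

```python
import numpy as np

def encode_series(values, precision=2, scale_percentile=0.9, sep=" ,", digit_space=True):
    """Encode a 1-D series as a digit string in the spirit of LLMTime.

    Rescale so a chosen percentile maps to roughly 1, keep a fixed number
    of decimal places, drop the decimal point (redundant at fixed
    precision), and optionally space out digits so BPE tokenizers such as
    GPT-3's see one token per digit.
    """
    scale = float(np.percentile(np.abs(values), 100 * scale_percentile)) or 1.0
    tokens = []
    for v in values:
        digits = f"{abs(v) / scale:.{precision}f}".replace(".", "")
        if digit_space:
            digits = " ".join(digits)
        tokens.append(("-" if v < 0 else "") + digits)
    return sep.join(tokens), scale

def decode_series(text, scale, precision=2, sep=" ,"):
    """Invert encode_series: strip digit spacing, reinsert the decimal point."""
    values = []
    for tok in text.split(sep):
        tok = tok.replace(" ", "").strip()
        sign = -1.0 if tok.startswith("-") else 1.0
        tok = tok.lstrip("-")
        values.append(sign * float(tok[:-precision] + "." + tok[-precision:]) * scale)
    return np.array(values)

# Encode the observed history, let the LLM continue the string, then
# decode the sampled continuation with the same scale.
history = [0.64, 0.70, 0.81, 0.95, 1.02]
prompt, scale = encode_series(history)
print(prompt)                        # digit-spaced, comma-separated history
print(decode_series(prompt, scale))  # round-trips back to (rounded) values
```

Forecasts are then obtained by sampling continuations of the encoded prompt from the LLM and decoding them back to numbers with the inverse mapping; repeating this sampling many times yields a full predictive distribution rather than a single point forecast.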

Empirical Results

Empirical evaluation of LLMTime across multiple datasets confirms its efficacy in zero-shot time series forecasting. LLMTime not only generates plausible future values but also achieves better likelihoods and Continuous Ranked Probability Scores (CRPS) than many purpose-built forecasting models. Importantly, its performance consistently improves with the scale of the underlying LLM, suggesting further gains as base models advance. A notable exception is GPT-4, which can perform worse than GPT-3 because of how it tokenizes numbers and because of poor uncertainty calibration, likely a side effect of alignment interventions such as Reinforcement Learning from Human Feedback (RLHF).
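For reference, CRPS can be estimated directly from forecast samples via the standard identity CRPS(F, y) = E|X − y| − ½ E|X − X′|, where X and X′ are independent draws from the forecast distribution F. The snippet below is a minimal sketch of that sample-based estimator, not the paper's evaluation code; the toy forecast and array shapes are illustrative.

```python
import numpy as np

def crps_from_samples(samples, y):
    """Sample-based CRPS estimate for a single observation y.

    Implements CRPS(F, y) = E|X - y| - 0.5 * E|X - X'| with Monte Carlo
    draws from the forecast distribution F; lower is better.
    """
    samples = np.asarray(samples, dtype=float)
    term1 = np.mean(np.abs(samples - y))
    term2 = 0.5 * np.mean(np.abs(samples[:, None] - samples[None, :]))
    return term1 - term2

# Toy example: score a probabilistic forecast at each horizon step and
# average over the horizon, as is common in forecast evaluation.
rng = np.random.default_rng(0)
forecast_samples = rng.normal(loc=1.0, scale=0.2, size=(100, 5))  # (n_samples, horizon)
observed = np.array([0.9, 1.1, 1.0, 1.2, 0.95])
scores = [crps_from_samples(forecast_samples[:, t], observed[t]) for t in range(5)]
print(float(np.mean(scores)))
```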

Theoretical Insights and Practical Implications

The paper explores why LLMTime works. It attributes the efficacy of LLMs in time series forecasting to their connection with compression and their inductive biases toward simplicity and repetition, which mirror structural characteristics of time series data such as repeated seasonal patterns. The ability of LLMs to handle missing data without imputation, incorporate textual side information, and answer questions about their predictions is a notable advance over traditional methods. These capabilities suggest a broader applicability of LLMs beyond natural language tasks and make a compelling case for their use in complex time series forecasting problems.

Future Directions

Looking ahead, the paper points to several avenues for future research: extending LLMs' context windows to handle longer time series, improving their arithmetic and recursive reasoning capabilities, and developing effective fine-tuning procedures for LLMs on time series data. Integrating LLMs into time series forecasting pipelines opens promising prospects for improved performance and functionality.

Conclusion

This paper establishes LLMTime as a simple yet powerful method that harnesses the generalization capabilities of LLMs for zero-shot time series forecasting. By bridging text sequence modeling and time series prediction, LLMTime shows how advances in natural language processing can be brought to bear on the intricate challenges of time series forecasting, marking a significant step toward unified model capabilities across diverse domains.

Authors (4)
  1. Nate Gruver
  2. Marc Finzi
  3. Shikai Qiu
  4. Andrew Gordon Wilson
Citations (214)