Large Language Models Are Zero-Shot Time Series Forecasters (2310.07820v3)

Published 11 Oct 2023 in cs.LG

Abstract: By encoding time series as a string of numerical digits, we can frame time series forecasting as next-token prediction in text. Developing this approach, we find that LLMs such as GPT-3 and LLaMA-2 can surprisingly zero-shot extrapolate time series at a level comparable to or exceeding the performance of purpose-built time series models trained on the downstream tasks. To facilitate this performance, we propose procedures for effectively tokenizing time series data and converting discrete distributions over tokens into highly flexible densities over continuous values. We argue the success of LLMs for time series stems from their ability to naturally represent multimodal distributions, in conjunction with biases for simplicity, and repetition, which align with the salient features in many time series, such as repeated seasonal trends. We also show how LLMs can naturally handle missing data without imputation through non-numerical text, accommodate textual side information, and answer questions to help explain predictions. While we find that increasing model size generally improves performance on time series, we show GPT-4 can perform worse than GPT-3 because of how it tokenizes numbers, and poor uncertainty calibration, which is likely the result of alignment interventions such as RLHF.

Leveraging LLMs for Zero-Shot Time Series Forecasting

Introduction

Time series forecasting presents challenges distinct from those in other machine learning domains such as audio or video processing. The heterogeneity of time series data sources and the need to extrapolate accurately from sparse observations make building robust forecasting models difficult. Traditional methods, while sometimes outperforming more complex deep learning approaches, do not benefit from the rich representations learned through large-scale pretraining. This paper introduces LLMTime, a method that applies LLMs such as GPT-3 and LLaMA-2 to zero-shot time series forecasting by framing the forecasting task as next-token prediction over text. The findings suggest that LLMs can match or exceed the predictive performance of specialized time series models without any fine-tuning on the downstream tasks.

LLMTime Methodology

LLMTime operationalizes time series forecasting through a surprisingly simple yet effective procedure. By encoding time series data as strings of numerical digits and treating forecasting as text generation, LLMTime exploits the aptitude of LLMs for recognizing and extrapolating patterns in sequences. Its key components, sketched in code after the list below, include:

  • Effective Encoding: A tokenization strategy that converts time series into digit strings so that pretrained LLMs can be applied to continuous forecasting problems.
  • Adapting Distributions: Converting the discrete distributions over tokens produced by an LLM into flexible continuous densities, capable of capturing the multimodal distributions common in time series data.
  • Probabilistic Capabilities: Exploiting the probabilistic nature of LLM sampling, which aligns with salient features of time series such as seasonality, and allows missing values to be handled without explicit imputation.
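As a concrete illustration, the sketch below shows the kind of digit-level string encoding the paper describes: values are rescaled, truncated to a fixed precision, and written digit by digit with a separator between timesteps so that BPE tokenizers split them consistently. The function names, default precision, and scaling rule here are illustrative choices, not the authors' exact implementation.

```python
import numpy as np

def encode_series(values, precision=2, scale_percentile=0.9, sep=" ,", digit_space=True):
    """Encode a 1-D series as a digit string in the spirit of LLMTime.

    Rescale so a chosen percentile maps to roughly 1, keep a fixed number
    of decimal places, drop the decimal point (redundant at fixed
    precision), and optionally space out digits so BPE tokenizers such as
    GPT-3's see one token per digit.
    """
    scale = float(np.percentile(np.abs(values), 100 * scale_percentile)) or 1.0
    tokens = []
    for v in values:
        digits = f"{abs(v) / scale:.{precision}f}".replace(".", "")
        if digit_space:
            digits = " ".join(digits)
        tokens.append(("-" if v < 0 else "") + digits)
    return sep.join(tokens), scale

def decode_series(text, scale, precision=2, sep=" ,"):
    """Invert encode_series: strip digit spacing, reinsert the decimal point."""
    values = []
    for tok in text.split(sep):
        tok = tok.replace(" ", "").strip()
        sign = -1.0 if tok.startswith("-") else 1.0
        tok = tok.lstrip("-")
        values.append(sign * float(tok[:-precision] + "." + tok[-precision:]) * scale)
    return np.array(values)

# Encode the observed history, let the LLM continue the string, then
# decode the sampled continuation with the same scale.
history = [0.64, 0.70, 0.81, 0.95, 1.02]
prompt, scale = encode_series(history)
print(prompt)                        # digit-spaced, comma-separated history
print(decode_series(prompt, scale))  # round-trips back to (rounded) values
```

Forecasts are then obtained by sampling continuations of the encoded prompt from the LLM and decoding them back to numbers with the inverse mapping; repeating this sampling many times yields a full predictive distribution rather than a single point forecast.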

Empirical Results

Empirical evaluation of LLMTime across multiple datasets confirms its efficacy in zero-shot time series forecasting. LLMTime not only generates plausible future values but also achieves better likelihoods and Continuous Ranked Probability Scores (CRPS) than many purpose-built forecasting models. Importantly, its performance consistently improves with the scale of the underlying LLM, suggesting further gains as base models advance. A notable exception is GPT-4, which can perform worse than GPT-3 because of how it tokenizes numbers and because of poor uncertainty calibration, likely a side effect of alignment interventions such as Reinforcement Learning from Human Feedback (RLHF).
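For reference, CRPS can be estimated directly from forecast samples via the standard identity CRPS(F, y) = E|X − y| − ½ E|X − X′|, where X and X′ are independent draws from the forecast distribution F. The snippet below is a minimal sketch of that sample-based estimator, not the paper's evaluation code; the toy forecast and array shapes are illustrative.

```python
import numpy as np

def crps_from_samples(samples, y):
    """Sample-based CRPS estimate for a single observation y.

    Implements CRPS(F, y) = E|X - y| - 0.5 * E|X - X'| with Monte Carlo
    draws from the forecast distribution F; lower is better.
    """
    samples = np.asarray(samples, dtype=float)
    term1 = np.mean(np.abs(samples - y))
    term2 = 0.5 * np.mean(np.abs(samples[:, None] - samples[None, :]))
    return term1 - term2

# Toy example: score a probabilistic forecast at each horizon step and
# average over the horizon, as is common in forecast evaluation.
rng = np.random.default_rng(0)
forecast_samples = rng.normal(loc=1.0, scale=0.2, size=(100, 5))  # (n_samples, horizon)
observed = np.array([0.9, 1.1, 1.0, 1.2, 0.95])
scores = [crps_from_samples(forecast_samples[:, t], observed[t]) for t in range(5)]
print(float(np.mean(scores)))
```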

Theoretical Insights and Practical Implications

The paper explores why LLMTime works. It attributes the efficacy of LLMs in time series forecasting to their connection with compression and their inductive biases toward simplicity and repetition, which mirror structural characteristics of time series data such as repeated seasonal patterns. The ability of LLMs to handle missing data without imputation, incorporate textual side information, and answer questions about their predictions is a notable advance over traditional methods. These capabilities suggest a broader applicability of LLMs beyond natural language tasks and make a compelling case for their use in complex time series forecasting problems.

Future Directions

Looking ahead, the paper points to several avenues for future research: extending LLMs' context windows to handle longer time series, improving their arithmetic and recursive reasoning capabilities, and developing effective fine-tuning procedures for LLMs on time series data. Integrating LLMs into time series forecasting pipelines opens promising prospects for improved performance and functionality.

Conclusion

This paper establishes LLMTime as a simple yet powerful method that harnesses the generalization capabilities of LLMs for zero-shot time series forecasting. By bridging text sequence modeling and time series prediction, LLMTime shows how advances in natural language processing can be brought to bear on the intricate challenges of time series forecasting, marking a significant step toward unified model capabilities across diverse domains.

Authors (4)
  1. Nate Gruver
  2. Marc Finzi
  3. Shikai Qiu
  4. Andrew Gordon Wilson
Citations (214)