LLM4TS: Aligning Pre-Trained LLMs as Data-Efficient Time-Series Forecasters (2308.08469v5)
Abstract: Multivariate time-series forecasting is vital in numerous domains, e.g., economic planning and weather prediction. Deep models trained from scratch have shown strong performance but require large amounts of data, which limits real-world applicability. Recently, researchers have leveraged the transferable representation learning of pre-trained LLMs to handle limited non-linguistic datasets effectively. However, applying LLMs to time-series data poses two challenges: limited adaptation, because time-series and linguistic data differ in composition, and an inability to process multi-scale temporal information. To tackle these challenges, we propose LLM4TS, a framework for time-series forecasting with pre-trained LLMs. LLM4TS employs a two-stage fine-tuning strategy: a time-series alignment stage that aligns LLMs with the nuances of time-series data, and a forecasting fine-tuning stage for downstream forecasting tasks. Furthermore, our framework features a novel two-level aggregation method that integrates multi-scale temporal data within pre-trained LLMs, enhancing their ability to interpret time-specific information. In experiments on 7 time-series forecasting datasets, LLM4TS outperforms state-of-the-art trained-from-scratch models in full-shot scenarios and achieves an average MSE improvement of 6.84% in few-shot scenarios. In addition, comparisons with different self-supervised learning approaches highlight the effectiveness of LLM4TS's representation learning for forecasting tasks.
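The abstract describes a two-stage strategy: first aligning a frozen pre-trained LLM backbone with patched time-series inputs, then fine-tuning a lightweight head for forecasting. The sketch below illustrates one plausible reading of that pipeline; the GPT-2 backbone, patch size, layer-norm-only tuning, and module names are assumptions made for illustration, not the authors' implementation.

```python
# Hedged sketch (not the authors' code): (1) adapt a mostly frozen pre-trained
# LLM backbone to time-series patches via an autoregressive alignment objective,
# then (2) fine-tune a small head for forecasting. GPT-2, patch_len=16, and
# layer-norm-only tuning are assumptions for illustration.
import torch
import torch.nn as nn
from transformers import GPT2Model

class TimeSeriesLLM(nn.Module):
    def __init__(self, patch_len=16, d_model=768, n_patches=24, horizon=96):
        super().__init__()
        self.backbone = GPT2Model.from_pretrained("gpt2")
        # Freeze most backbone weights; only layer norms stay trainable, so the
        # LLM's pre-trained knowledge is preserved and few parameters are updated.
        for name, p in self.backbone.named_parameters():
            p.requires_grad = "ln" in name
        self.patch_embed = nn.Linear(patch_len, d_model)    # patch -> token embedding
        self.align_head = nn.Linear(d_model, patch_len)     # stage 1: predict next patch
        self.forecast_head = nn.Linear(n_patches * d_model, horizon)  # stage 2

    def encode(self, patches):                # patches: (batch, n_patches, patch_len)
        tokens = self.patch_embed(patches)
        return self.backbone(inputs_embeds=tokens).last_hidden_state

    def alignment_loss(self, patches):
        # Stage 1 (time-series alignment): autoregressively predict each next patch.
        hidden = self.encode(patches[:, :-1])
        pred = self.align_head(hidden)
        return nn.functional.mse_loss(pred, patches[:, 1:])

    def forecast(self, patches):
        # Stage 2 (forecasting fine-tuning): map patch representations to the horizon.
        hidden = self.encode(patches)
        return self.forecast_head(hidden.flatten(start_dim=1))
```

Keeping most backbone weights frozen during both stages is one way to read the data-efficiency claim: only the patch embedding, heads, and a small set of normalization parameters need to be learned from the limited target data.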
- Ching Chang
- Wei-Yao Wang
- Wen-Chih Peng
- Tien-Fu Chen