LLM4TS: Aligning Pre-Trained LLMs as Data-Efficient Time-Series Forecasters (2308.08469v5)

Published 16 Aug 2023 in cs.LG

Abstract: Multivariate time-series forecasting is vital in many domains, e.g., economic planning and weather prediction. Deep models trained from scratch have exhibited strong performance but require large amounts of data, which limits real-world applicability. Recently, researchers have leveraged the transferable representation learning of pre-trained LLMs to handle limited non-linguistic datasets effectively. However, adapting LLMs to time-series data is challenging: the composition of time-series data differs from that of linguistic data, and LLMs cannot natively process multi-scale temporal information. To tackle these challenges, we propose LLM4TS, a framework for time-series forecasting with pre-trained LLMs. LLM4TS consists of a two-stage fine-tuning strategy: the time-series alignment stage, which aligns LLMs with the nuances of time-series data, and the forecasting fine-tuning stage for downstream time-series forecasting tasks. Furthermore, our framework features a novel two-level aggregation method that integrates multi-scale temporal data within pre-trained LLMs, enhancing their ability to interpret time-specific information. In experiments across 7 time-series forecasting datasets, LLM4TS outperforms existing state-of-the-art trained-from-scratch models in full-shot scenarios and achieves an average improvement of 6.84% in MSE in few-shot scenarios. In addition, comparisons with different self-supervised learning approaches highlight LLM4TS's effectiveness at representation learning for forecasting tasks.
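The two-stage recipe described in the abstract (time-series alignment, then forecasting fine-tuning on top of a frozen pre-trained backbone) can be illustrated with a short sketch. The snippet below is an assumption-laden illustration, not the authors' implementation: it uses GPT-2 as the backbone, a simple next-patch objective for the alignment stage, a flatten-and-project head for forecasting, and trains only the layer norms plus the added projections. The patch length, learning rates, univariate toy data, and the omission of the paper's two-level temporal aggregation are all simplifications.

```python
# Illustrative sketch (not the authors' code): a two-stage fine-tuning loop in the
# spirit of LLM4TS. Backbone: pre-trained GPT-2 from Hugging Face Transformers.
# Channel independence is assumed: each variate is treated as a univariate series.
import torch
import torch.nn as nn
from transformers import GPT2Model

class TSBackbone(nn.Module):
    """Pre-trained LLM backbone with a trainable patch-embedding projection."""
    def __init__(self, patch_len=16, d_model=768):
        super().__init__()
        self.patch_len = patch_len
        self.gpt2 = GPT2Model.from_pretrained("gpt2")
        # Freeze most pre-trained weights; keep only layer norms trainable
        # (a common parameter-efficient choice, assumed here for illustration).
        for name, p in self.gpt2.named_parameters():
            p.requires_grad = "ln" in name
        self.patch_embed = nn.Linear(patch_len, d_model)  # trainable input projection

    def forward(self, x):                                      # x: (batch, seq_len)
        patches = x.unfold(-1, self.patch_len, self.patch_len)  # (batch, n_patch, patch_len)
        h = self.patch_embed(patches)                            # (batch, n_patch, d_model)
        return self.gpt2(inputs_embeds=h).last_hidden_state

def stage1_alignment(backbone, series, steps=100, lr=1e-4):
    """Stage 1: align the backbone with time-series data via next-patch prediction."""
    head = nn.Linear(768, backbone.patch_len)
    opt = torch.optim.Adam([p for p in backbone.parameters() if p.requires_grad]
                           + list(head.parameters()), lr=lr)
    for _ in range(steps):
        h = backbone(series)                                     # (batch, n_patch, 768)
        pred = head(h[:, :-1])                                   # predict patch t+1 from patch t
        target = series.unfold(-1, backbone.patch_len, backbone.patch_len)[:, 1:]
        loss = nn.functional.mse_loss(pred, target)
        opt.zero_grad(); loss.backward(); opt.step()

def stage2_forecasting(backbone, series, horizon=96, steps=100, lr=1e-4):
    """Stage 2: fine-tune with a flatten-and-project head for the forecasting task."""
    context, future = series[:, :-horizon], series[:, -horizon:]
    n_patch = context.shape[-1] // backbone.patch_len
    head = nn.Linear(768 * n_patch, horizon)
    opt = torch.optim.Adam([p for p in backbone.parameters() if p.requires_grad]
                           + list(head.parameters()), lr=lr)
    for _ in range(steps):
        h = backbone(context)                                    # (batch, n_patch, 768)
        pred = head(h.flatten(1))                                # (batch, horizon)
        loss = nn.functional.mse_loss(pred, future)
        opt.zero_grad(); loss.backward(); opt.step()
    return head

if __name__ == "__main__":
    torch.manual_seed(0)
    series = torch.randn(8, 512 + 96)   # toy batch: 512-step context + 96-step horizon
    backbone = TSBackbone()
    stage1_alignment(backbone, series[:, :512], steps=5)
    stage2_forecasting(backbone, series, horizon=96, steps=5)
```

In this sketch, stage 1 reuses the causal structure of the backbone to predict the next patch of raw values, while stage 2 swaps in a task head for a fixed forecasting horizon. Keeping most pre-trained parameters frozen in both stages is one way to retain the transferable representations the abstract refers to, which is the motivation for the few-shot setting.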

Authors (4)
  1. Ching Chang (10 papers)
  2. Wei-Yao Wang (27 papers)
  3. Wen-Chih Peng (47 papers)
  4. Tien-Fu Chen (5 papers)
Citations (24)