$\textbf{S}^2$IP-LLM: Semantic Space Informed Prompt Learning with LLM for Time Series Forecasting (2403.05798v2)
Abstract: Recently, there has been a growing interest in leveraging pre-trained LLMs for various time series applications. However, the semantic space of LLMs, established through pre-training, is still underexplored and may help yield more distinctive and informative representations to facilitate time series forecasting. To this end, we propose Semantic Space Informed Prompt learning with LLM ($S^2$IP-LLM) to align the pre-trained semantic space with the time series embedding space and perform time series forecasting based on learned prompts from the joint space. We first design a tokenization module tailored for cross-modality alignment, which explicitly concatenates patches of decomposed time series components to create embeddings that effectively encode the temporal dynamics. Next, we leverage the pre-trained word token embeddings to derive semantic anchors and align selected anchors with time series embeddings by maximizing the cosine similarity in the joint space. This way, $S^2$IP-LLM can retrieve relevant semantic anchors as prompts to provide strong indicators (context) for time series that exhibit different temporal dynamics. With thorough empirical studies on multiple benchmark datasets, we demonstrate that the proposed $S^2$IP-LLM can achieve superior forecasting performance over state-of-the-art baselines. Furthermore, our ablation studies and visualizations verify the necessity of prompt learning informed by semantic space.
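To make the pipeline in the abstract concrete, below is a minimal PyTorch sketch of the described flow: decompose each series, patch and embed the concatenated components, score the patch tokens against semantic anchors derived from word-token embeddings via cosine similarity, and prepend the retrieved anchors as a prompt before a frozen LLM backbone. Every name, shape, and hyperparameter here (the `S2IPSketch` class, the moving-average decomposition, the random GPT-2-sized embedding table, the linear anchor derivation, top-k mean pooling) is an illustrative assumption, not the authors' released implementation; a real model would reuse a pre-trained LLM's actual embedding table and transformer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class S2IPSketch(nn.Module):
    """Toy rendition of the abstract's pipeline: decompose -> patch -> embed ->
    align with semantic anchors -> prepend retrieved anchors as a prompt."""

    def __init__(self, patch_len=16, d_model=768, vocab_size=50257,
                 num_anchors=100, top_k=8):
        super().__init__()
        self.patch_len = patch_len
        self.top_k = top_k
        # Shared linear embedding over concatenated trend/seasonal/residual patches.
        self.patch_embed = nn.Linear(3 * patch_len, d_model)
        # Frozen stand-in for the LLM's word-token embedding table (assumption:
        # in practice this would be the pre-trained table, e.g. from GPT-2).
        self.register_buffer("word_emb", torch.randn(vocab_size, d_model))
        # Semantic anchors as a learnable mixture of word-token embeddings
        # (the anchor-derivation scheme here is an assumption).
        self.anchor_weights = nn.Parameter(torch.randn(num_anchors, vocab_size) * 0.01)

    def decompose(self, x, period=24):
        # Crude moving-average trend / periodic seasonal split; a stand-in for
        # the paper's decomposition module, not its actual implementation.
        trend = F.avg_pool1d(x.unsqueeze(1), kernel_size=25, stride=1,
                             padding=12).squeeze(1)
        detrended = x - trend
        b, length = x.shape
        pad = (-length) % period
        folded = F.pad(detrended, (0, pad)).view(b, -1, period)
        seasonal = folded.mean(dim=1).repeat(1, folded.size(1))[:, :length]
        residual = detrended - seasonal
        return trend, seasonal, residual

    def forward(self, x):
        # x: (batch, seq_len) univariate look-back window.
        comps = self.decompose(x)
        # Non-overlapping patches per component, concatenated feature-wise.
        patches = torch.cat(
            [c.unfold(-1, self.patch_len, self.patch_len) for c in comps], dim=-1)
        z = self.patch_embed(patches)                    # (B, N, d_model) series tokens
        anchors = self.anchor_weights @ self.word_emb    # (num_anchors, d_model)
        # Cosine similarity between time-series tokens and semantic anchors.
        sim = F.normalize(z, dim=-1) @ F.normalize(anchors, dim=-1).T
        topk = sim.topk(self.top_k, dim=-1).indices      # (B, N, top_k) anchor ids
        prompt = anchors[topk].mean(dim=2)               # pooled retrieved anchors
        # Prompt tokens are prepended to the series tokens before the frozen LLM.
        return torch.cat([prompt, z], dim=1)


# Example: 4 series with a look-back of 96 -> (4, 2 * 6, 768) prompt-augmented tokens,
# ready to be fed into a frozen LLM backbone followed by a forecasting head.
tokens = S2IPSketch()(torch.randn(4, 96))
```

Mean-pooling the top-k anchors is just one simple way to turn retrieved anchors into prompt tokens; the key idea carried over from the abstract is that alignment is driven by cosine similarity in the joint space, so patches with different temporal dynamics retrieve different semantic context.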
Authors: Zijie Pan, Yushan Jiang, Sahil Garg, Anderson Schneider, Yuriy Nevmyvaka, Dongjin Song