Temporal Knowledge Graph Forecasting Without Knowledge Using In-Context Learning (2305.10613v3)
Abstract: Temporal knowledge graph (TKG) forecasting benchmarks challenge models to predict future facts using knowledge of past facts. In this paper, we apply LLMs to these benchmarks using in-context learning (ICL). We investigate whether and to what extent LLMs can be used for TKG forecasting, especially without any fine-tuning or explicit modules for capturing structural and temporal information. For our experiments, we present a framework that converts relevant historical facts into prompts and generates ranked predictions using token probabilities. Surprisingly, we observe that LLMs, out-of-the-box, perform on par with state-of-the-art TKG models carefully designed and trained for TKG forecasting. Our extensive evaluation reports performance across several models and datasets with different characteristics, compares alternative heuristics for preparing contextual information, and contrasts with prominent TKG methods as well as simple frequency and recency baselines. We also discover that using numerical indices instead of entity/relation names, i.e., hiding semantic information, does not significantly affect the performance ($\pm$0.4\% Hit@1). This shows that prior semantic knowledge is unnecessary; instead, LLMs can leverage the existing patterns in the context to achieve such performance. Our analysis also reveals that ICL enables LLMs to learn irregular patterns from the historical context, going beyond simple predictions based on common or recent information.
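A minimal sketch of the framework the abstract describes: historical quadruples are verbalized into a prompt, and candidate objects for the query are ranked by the LLM's next-token probabilities. The prompt format, the small model, and the helper names below are illustrative assumptions, not the paper's exact implementation.

```python
# Illustrative sketch of TKG forecasting via in-context learning.
# Assumptions: any causal LM works (the paper evaluates larger models);
# the exact verbalization of facts may differ from the paper's prompts.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model for a runnable example
tok = AutoTokenizer.from_pretrained(model_name)
lm = AutoModelForCausalLM.from_pretrained(model_name)
lm.eval()

def build_prompt(history, query):
    """Verbalize historical (time, subject, relation, object) facts,
    then pose the query with the object left for the model to continue."""
    lines = [f"{t}: [{s}, {r}, {o}]" for (t, s, r, o) in history]
    t, s, r = query
    lines.append(f"{t}: [{s}, {r},")
    return "\n".join(lines)

@torch.no_grad()
def rank_candidates(prompt, candidates):
    """Score each candidate object by the log-probability of its first
    token given the prompt; return candidates ranked best-first."""
    ids = tok(prompt, return_tensors="pt").input_ids
    logits = lm(ids).logits[0, -1]              # next-token distribution
    logprobs = torch.log_softmax(logits, dim=-1)
    scores = {}
    for cand in candidates:
        first_id = tok(" " + cand).input_ids[0]
        scores[cand] = logprobs[first_id].item()
    return sorted(candidates, key=scores.get, reverse=True)

history = [(0, "A", "met_with", "B"), (1, "A", "met_with", "C"),
           (2, "A", "met_with", "B")]
print(rank_candidates(build_prompt(history, (3, "A", "met_with")),
                      ["B", "C", "D"]))
```

Swapping the entity/relation strings for numerical indices in `build_prompt` corresponds to the "hiding semantic information" ablation the abstract mentions; the token-probability ranking works the same way.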
Authors: Dong-Ho Lee, Kian Ahrabian, Woojeong Jin, Fred Morstatter, Jay Pujara