Future Language Modeling from Temporal Document History (2404.10297v1)
Abstract: Predicting the future is of great interest across many aspects of human activity. Businesses are interested in future trends, traders are interested in future stock prices, and companies are highly interested in future technological breakthroughs. While there are many automated systems for predicting future numerical data, such as weather, stock prices, and demand for products, there is relatively little work on automatically predicting textual data. Humans are interested in textual data predictions because it is a natural format for our consumption, and experts routinely make predictions in a textual format (Christensen et al., 2004; Tetlock & Gardner, 2015; Frick, 2015). However, there has been relatively little formalization of this general problem in the machine learning or natural language processing communities. To address this gap, we introduce the task of future language modeling: probabilistic modeling of texts in the future based on a temporal history of texts. To our knowledge, our work is the first to formalize the task of predicting the future in this way. We show that it is indeed possible to build future language models that improve upon strong non-temporal language model baselines, opening the door to working on this important and widely applicable problem.
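To make the task definition concrete, one natural way to write the objective is sketched below. The notation is ours for illustration, not taken from the paper: $\mathcal{D}_{t'}$ denotes the collection of texts from time period $t'$, $k$ is the size of the history window, and $w_1, \dots, w_N$ is a text from the future period $t$.

$$p\left(w_1, \dots, w_N \mid \mathcal{D}_{t-k}, \dots, \mathcal{D}_{t-1}\right) \;=\; \prod_{i=1}^{N} p\left(w_i \mid w_{<i},\, \mathcal{D}_{t-k}, \dots, \mathcal{D}_{t-1}\right)$$

Under this view, a standard non-temporal language model is the special case that drops the conditioning on the document history, which is why improvements over such baselines indicate that the temporal history carries usable predictive signal.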
- Kaushik Acharya. WNUT 2020 shared task-1: Conditional random field(CRF) based named entity recognition(NER) for wet lab protocols. In Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020), pp. 286–289, Online, November 2020. Association for Computational Linguistics. doi: 10.18653/v1/2020.wnut-1.37. URL https://aclanthology.org/2020.wnut-1.37.
- Oshin Agarwal and Ani Nenkova. Temporal Effects on Pre-trained Models for Language Processing Tasks. Transactions of the Association for Computational Linguistics, 10:904–921, 09 2022. ISSN 2307-387X. doi: 10.1162/tacl_a_00497. URL https://doi.org/10.1162/tacl_a_00497.
- Abhik Bhattacharjee et al. BanglaBERT: Language model pretraining and benchmarks for low-resource language understanding evaluation in Bangla. In Findings of the Association for Computational Linguistics: NAACL 2022, pp. 1318–1327, Seattle, United States, July 2022. Association for Computational Linguistics. doi: 10.18653/v1/2022.findings-naacl.98. URL https://aclanthology.org/2022.findings-naacl.98.
- Zewen Chi et al. XLM-E: Cross-lingual language model pre-training via ELECTRA. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 6170–6182, Dublin, Ireland, May 2022. Association for Computational Linguistics. doi: 10.18653/v1/2022.acl-long.427. URL https://aclanthology.org/2022.acl-long.427.
- Clayton M. Christensen, Scott D. Anthony, and Erik A. Roth. Seeing What’s Next: Using Theories of Innovation to Predict Industry Change. Harvard Business Press, 2004.
- Bhuwan Dhingra, Jeremy R. Cole, Julian Martin Eisenschlos, Daniel Gillick, Jacob Eisenstein, and William W. Cohen. Time-Aware Language Models as Temporal Knowledge Bases. Transactions of the Association for Computational Linguistics, 10:257–273, 03 2022. ISSN 2307-387X. doi: 10.1162/tacl_a_00459. URL https://doi.org/10.1162/tacl_a_00459.
- W. J. Dixon and A. M. Mood. The statistical sign test. Journal of the American Statistical Association, 41(236):557–566, 1946. ISSN 01621459. URL http://www.jstor.org/stable/2280577.
- Zhengxiao Du, Yujie Qian, Xiao Liu, Ming Ding, Jiezhong Qiu, Zhilin Yang, and Jie Tang. GLM: General language model pretraining with autoregressive blank infilling. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 320–335, Dublin, Ireland, May 2022. Association for Computational Linguistics. doi: 10.18653/v1/2022.acl-long.26. URL https://aclanthology.org/2022.acl-long.26.
- Walter Frick. What research tells us about making accurate predictions. Harvard Business Review, 2015. URL https://hbr.org/2015/02/what-research-tells-us-about-making-accurate-predictions.
- Valentin Hofmann, Janet Pierrehumbert, and Hinrich Schütze. Dynamic contextualized word embeddings. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp. 6970–6984, Online, August 2021. Association for Computational Linguistics. doi: 10.18653/v1/2021.acl-long.542. URL https://aclanthology.org/2021.acl-long.542.
- Xiaolei Huang and Michael J. Paul. Neural temporality adaptation for document classification: Diachronic word embeddings and domain adaptation models. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 4113–4123, Florence, Italy, July 2019. Association for Computational Linguistics. doi: 10.18653/v1/P19-1403. URL https://aclanthology.org/P19-1403.
- Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. In Yoshua Bengio and Yann LeCun (eds.), 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, 2015. URL http://arxiv.org/abs/1412.6980.
- Angeliki Lazaridou et al. Mind the gap: Assessing temporal generalization in neural language models. Advances in Neural Information Processing Systems, 34:29348–29363, 2021.
- Neural language modeling for named entity recognition. In Proceedings of the 28th International Conference on Computational Linguistics, pp. 6937–6941, Barcelona, Spain (Online), December 2020. International Committee on Computational Linguistics. doi: 10.18653/v1/2020.coling-main.612. URL https://aclanthology.org/2020.coling-main.612.
- Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. RoBERTa: A robustly optimized BERT pretraining approach. ArXiv, abs/1907.11692, 2019.
- Daniel Loureiro, Francesco Barbieri, Leonardo Neves, Luis Espinosa Anke, and Jose Camacho-Collados. TimeLMs: Diachronic language models from Twitter. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 251–260, Dublin, Ireland, May 2022. Association for Computational Linguistics. doi: 10.18653/v1/2022.acl-demo.25. URL https://aclanthology.org/2022.acl-demo.25.
- Fuzzy logic for vagueness management in referring expression generation. In Proceedings of the Workshop on Intelligent Information Processing and Natural Language Generation, pp. 71–76, Santiago de Compostela, Spain, September 2020. Association for Computational Linguistics. URL https://aclanthology.org/2020.intellang-1.8.
- Word sense disambiguation for Kashmiri language using supervised machine learning. In Proceedings of the 17th International Conference on Natural Language Processing (ICON), pp. 243–245, Indian Institute of Technology Patna, Patna, India, December 2020. NLP Association of India (NLPAI). URL https://aclanthology.org/2020.icon-main.32.
- Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. Language models are unsupervised multitask learners. OpenAI blog, 2019.
- Guy D. Rosin, Ido Guy, and Kira Radinsky. Time masking for temporal language models. In Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining, WSDM ’22, pp. 833–841, New York, NY, USA, 2022. Association for Computing Machinery. ISBN 9781450391320. doi: 10.1145/3488560.3498529. URL https://doi.org/10.1145/3488560.3498529.
- Paul Röttger and Janet Pierrehumbert. Temporal adaptation of BERT and performance on downstream document classification: Insights from social media. In Findings of the Association for Computational Linguistics: EMNLP 2021, pp. 2400–2412, Punta Cana, Dominican Republic, November 2021. Association for Computational Linguistics. doi: 10.18653/v1/2021.findings-emnlp.206. URL https://aclanthology.org/2021.findings-emnlp.206.
- Philip E. Tetlock and Dan Gardner. Superforecasting: The Art and Science of Prediction. Crown Publishers, 2015.
- Thomas Wolf et al. Transformers: State-of-the-art natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pp. 38–45, Online, October 2020. Association for Computational Linguistics. URL https://www.aclweb.org/anthology/2020.emnlp-demos.6.