Language Models Represent Space and Time (arXiv:2310.02207v3)
Abstract: The capabilities of large language models (LLMs) have sparked debate over whether such systems just learn an enormous collection of superficial statistics or a set of more coherent and grounded representations that reflect the real world. We find evidence for the latter by analyzing the learned representations of three spatial datasets (world, US, NYC places) and three temporal datasets (historical figures, artworks, news headlines) in the Llama-2 family of models. We discover that LLMs learn linear representations of space and time across multiple scales. These representations are robust to prompting variations and unified across different entity types (e.g., cities and landmarks). In addition, we identify individual "space neurons" and "time neurons" that reliably encode spatial and temporal coordinates. While further investigation is needed, our results suggest modern LLMs learn rich spatiotemporal representations of the real world and possess basic ingredients of a world model.
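The linear-probing setup behind these findings can be illustrated with a minimal sketch: fit a regularized linear map from a model's hidden activations to geographic coordinates and measure held-out fit. The activation matrix and coordinates below are random placeholders standing in for Llama-2 activations over place names (4096 matches Llama-2-7b's hidden size), so this demonstrates the setup rather than the paper's exact pipeline or results.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in data: n entities, d-dimensional hidden states, 2 targets (lat, lon).
# In the actual experiments these would be residual-stream activations taken
# at some layer for prompts naming each place, paired with true coordinates.
n, d = 5000, 4096                       # 4096 = Llama-2-7b hidden size
activations = rng.normal(size=(n, d))   # placeholder for layer activations
coords = rng.normal(size=(n, 2))        # placeholder for (latitude, longitude)

X_train, X_test, y_train, y_test = train_test_split(
    activations, coords, test_size=0.2, random_state=0
)

# A linear probe is just regularized multi-output linear regression
# on the hidden states; high held-out R^2 indicates the coordinates
# are linearly decodable from the representation.
probe = Ridge(alpha=1.0)
probe.fit(X_train, y_train)
print(f"held-out R^2: {probe.score(X_test, y_test):.3f}")
```

On real activations, a high held-out score at some layer is what licenses the claim that the model encodes spatial (or, with dates as targets, temporal) coordinates linearly; on the random placeholders above the score will be near zero, as expected.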
Authors: Wes Gurnee, Max Tegmark