More than Correlation: Do Large Language Models Learn Causal Representations of Space? (2312.16257v1)
Abstract: Recent work found high mutual information between the learned representations of LLMs and the geospatial properties of their inputs, hinting at an emergent internal model of space. However, that work did not answer whether this internal space model has any causal effect on the LLMs' behavior, which led to criticism of the findings as mere statistical correlation. Our study focused on uncovering the causality of the spatial representations in LLMs. In particular, we identified potential spatial representations in DeBERTa and GPT-Neo using representational similarity analysis and linear and non-linear probing. Our causal intervention experiments showed that the spatial representations influenced the models' performance on next-word prediction and on a downstream task that relies on geospatial information. Our experiments suggested that LLMs learn and use an internal model of space when solving geospatially related tasks.
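To make the probing setup concrete, here is a minimal sketch (not the authors' released code) of the linear-probing idea the abstract describes: regress geographic coordinates from a frozen model's hidden states. The checkpoint, probed layer, pooling choice, and the toy place-name data below are all illustrative assumptions.

```python
# Hedged sketch of a linear probe for geospatial information in a frozen LM.
# Assumptions (not from the paper): the checkpoint, the probed layer, mean
# pooling over tokens, and the toy city -> (lat, lon) data below.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import Ridge

MODEL = "microsoft/deberta-base"  # DeBERTa is one of the models named above
LAYER = 8                         # arbitrary middle layer; a real study sweeps layers

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL, output_hidden_states=True).eval()

# Toy illustration data; a real probe would use thousands of place names
# with held-out evaluation (e.g., cross-validation).
places = {
    "Paris": (48.86, 2.35),
    "Tokyo": (35.68, 139.69),
    "Cairo": (30.04, 31.24),
    "Lima": (-12.05, -77.04),
    "Sydney": (-33.87, 151.21),
    "Toronto": (43.65, -79.38),
}

features, coords = [], []
with torch.no_grad():
    for name, latlon in places.items():
        enc = tokenizer(name, return_tensors="pt")
        hidden = model(**enc).hidden_states[LAYER]              # (1, seq_len, dim)
        features.append(hidden.mean(dim=1).squeeze(0).numpy())  # mean-pool tokens
        coords.append(latlon)

# Ridge-regularized linear probe: high held-out R^2 would indicate that
# latitude/longitude are linearly decodable from the representations.
probe = Ridge(alpha=1.0).fit(features, coords)
print("in-sample R^2:", probe.score(features, coords))
```

A high probe score on its own is exactly the "mere statistical correlation" the paper pushes past; the causal intervention experiments go further by editing the probed representations and measuring the effect on next-word prediction.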
- Mostafa Abdou, Artur Kulmizev, Daniel Hershcovich, Stella Frank, Ellie Pavlick, and Anders Søgaard. 2021. Can language models encode perceptual structure without grounding? A case study in color. arXiv preprint arXiv:2109.06129.
- Guillaume Alain and Yoshua Bengio. 2016. Understanding intermediate layers using linear classifier probes. arXiv preprint arXiv:1610.01644.
- Emily M. Bender and Alexander Koller. 2020. Climbing towards NLU: On meaning, form, and understanding in the age of data. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 5185–5198.
- Yoshua Bengio. 2007. Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2:1–127.
- Yoshua Bengio, Aaron Courville, and Pascal Vincent. 2012. Unsupervised feature learning and deep learning: A review and new perspectives. arXiv preprint arXiv:1206.5538.
- Yonatan Bisk et al. 2020. Experience grounds language. arXiv preprint arXiv:2004.10151.
- Sid Black, Leo Gao, Phil Wang, Connor Leahy, and Stella Biderman. 2021. GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow.
- Tom Brown et al. 2020. Language models are few-shot learners. Advances in Neural Information Processing Systems, 33:1877–1901.
- Soham Dan, Hangfeng He, and Dan Roth. 2020. Understanding spatial relations through multiple modalities. arXiv preprint arXiv:2007.09551.
- Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
- Yanai Elazar, Shauli Ravfogel, Alon Jacovi, and Yoav Goldberg. 2021. Amnesic probing: Behavioral explanation with amnesic counterfactuals. Transactions of the Association for Computational Linguistics, 9:160–175.
- Amir Feder, Nadav Oved, Uri Shalit, and Roi Reichart. 2021. CausaLM: Causal model explanation through counterfactual language models. Computational Linguistics, 47(2):333–386.
- Kenneth Gade. 2010. A Non-singular Horizontal Position Representation. Journal of Navigation, 63(3):395–417.
- Mario Giulianelli, Jack Harding, Florian Mohnert, Dieuwke Hupkes, and Willem Zuidema. 2018. Under the hood: Using diagnostic classifiers to investigate and improve how language models track agreement information. arXiv preprint arXiv:1808.08079.
- Wes Gurnee and Max Tegmark. 2023. Language models represent space and time. arXiv preprint arXiv:2310.02207.
- Pengcheng He, Xiaodong Liu, Jianfeng Gao, and Weizhu Chen. 2020. DeBERTa: Decoding-enhanced BERT with disentangled attention. arXiv preprint arXiv:2006.03654.
- Evan Hernandez et al. 2023. Linearity of relation decoding in transformer language models. arXiv preprint arXiv:2308.09124.
- C.F.F. Karney and R.E. Deakin. 2010. F.W. Bessel (1825): The calculation of longitude and latitude from geodesic measurements. Astronomische Nachrichten, 331(8):852–861.
- Nikolaus Kriegeskorte, Marieke Mur, and Peter Bandettini. 2008. Representational similarity analysis - connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience, 2.
- Yann LeCun. 2022. A path towards autonomous machine intelligence, version 0.9.2, 2022-06-27. Open Review, 62.
- Belinda Z. Li, Maxwell Nye, and Jacob Andreas. 2021. Implicit representations of meaning in neural language models. arXiv preprint arXiv:2106.00737.
- Geoffrey J. McLachlan, Kim-Anh Do, and Christophe Ambroise. 2004. Analyzing microarray gene expression data. Wiley.
- William Merrill, Yoav Goldberg, Roy Schwartz, and Noah A. Smith. 2021. Provable limitations of acquiring meaning from ungrounded form: What will future language models understand? Transactions of the Association for Computational Linguistics, 9:1047–1060.
- OpenAI. 2023. GPT-4 technical report.
- Fabio Petroni et al. 2019. Language models as knowledge bases? arXiv preprint arXiv:1909.01066.
- Tiago Pimentel et al. 2020. Information-theoretic probing for linguistic structure. arXiv preprint arXiv:2004.03061.
- Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. 2018. Improving language understanding by generative pre-training.
- Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. Language models are unsupervised multitask learners.
- Abhilasha Ravichander, Yonatan Belinkov, and Eduard Hovy. 2020. Probing the probing paradigm: Does probing accuracy entail task relevance? arXiv preprint arXiv:2005.00719.
- David Rolnick and Max Tegmark. 2017. The power of deeper networks for expressing natural functions. arXiv preprint arXiv:1705.05502.
- Victor Sanh, Lysandre Debut, Julien Chaumond, and Thomas Wolf. 2019. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108.
- BERT-based spatial information extraction. In Proceedings of the Third International Workshop on Spatial Language Understanding, pages 10–17.
- M. Stone. 1974. Cross-validatory choice and assessment of statistical predictions. Journal of the Royal Statistical Society: Series B (Methodological), 36(2):111–133.
- Hugo Touvron et al. 2023. Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288.
- Mycal Tucker, Peng Qian, and Roger Levy. 2021. What if this modified that? Syntactic interventions via counterfactual embeddings. arXiv preprint arXiv:2105.14002.
Authors: Yida Chen, Yixian Gan, Sijia Li, Li Yao, Xiaohan Zhao