Do Large Language Models Latently Perform Multi-Hop Reasoning?
Abstract: We study whether large language models (LLMs) latently perform multi-hop reasoning with complex prompts such as "The mother of the singer of 'Superstition' is". We look for evidence of a latent reasoning pathway in which an LLM (1) latently identifies "the singer of 'Superstition'" as Stevie Wonder, the bridge entity, and (2) uses its knowledge of Stevie Wonder's mother to complete the prompt. We analyze these two hops individually and treat their co-occurrence as indicative of latent multi-hop reasoning. For the first hop, we test whether changing the prompt to indirectly mention the bridge entity, rather than any other entity, increases the LLM's internal recall of the bridge entity. For the second hop, we test whether increasing this recall causes the LLM to better utilize what it knows about the bridge entity. We find strong evidence of latent multi-hop reasoning for prompts of certain relation types, with the reasoning pathway used in more than 80% of such prompts. However, this utilization is highly contextual, varying across different types of prompts, and on average the evidence is substantial only for the first hop; for the second hop and the full multi-hop traversal it is rather moderate. Moreover, we find a clear scaling trend with increasing model size for the first hop of reasoning but not for the second. These findings suggest potential challenges and opportunities for the future development and application of LLMs.
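To make the first-hop test concrete, below is a minimal sketch, assuming the TransformerLens library with GPT-2 as a stand-in model; the paper's exact internal-recall metric, models, and prompt set differ, and the prompts here are illustrative. The sketch applies the logit lens to the residual stream after each layer and reads off how strongly the bridge entity's first token is promoted at the last prompt position, comparing a prompt that indirectly mentions the bridge entity against a control prompt that mentions a different entity.

```python
# A hedged sketch of a first-hop probe, NOT the paper's exact metric:
# measure how strongly intermediate layers promote the bridge entity
# ("Stevie Wonder") inside a two-hop prompt, using the logit lens.
import torch
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")  # stand-in for the larger LLMs studied

# Two-hop prompt whose descriptive mention should resolve to the bridge
# entity, plus a control prompt that describes a different entity.
prompts = {
    "bridge":  "The mother of the singer of 'Superstition' is",
    "control": "The mother of the singer of 'Thriller' is",
}

# First token of the bridge entity's name (leading space matches BPE tokenization).
bridge_tok = model.to_tokens(" Stevie Wonder", prepend_bos=False)[0, 0]

for name, prompt in prompts.items():
    _, cache = model.run_with_cache(model.to_tokens(prompt))
    print(name)
    # Logit lens: project the residual stream after each layer (at the final
    # prompt position) through the final LayerNorm and unembedding, and record
    # the probability assigned to the bridge entity's first token.
    for layer in range(model.cfg.n_layers):
        resid = cache["resid_post", layer][:, -1:, :]      # [1, 1, d_model]
        logits = model.unembed(model.ln_final(resid))[0, 0]
        p = torch.softmax(logits, dim=-1)[bridge_tok].item()
        print(f"  layer {layer:2d}: P(first bridge token) = {p:.4f}")
```

Evidence for the first hop would appear as higher bridge-entity probability in intermediate layers for the "bridge" prompt than for the control; the paper's second-hop test then intervenes on such internal representations to check whether stronger recall improves the model's use of its knowledge about the bridge entity.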