Memory Injections: Correcting Multi-Hop Reasoning Failures during Inference in Transformer-Based Language Models (2309.05605v3)
Abstract: Answering multi-hop reasoning questions requires retrieving and synthesizing information from diverse sources, and large language models (LLMs) struggle to perform such reasoning consistently. Here we propose an approach to pinpoint and rectify multi-hop reasoning failures through targeted memory injections on LLM attention heads. First, we analyze the per-layer activations of GPT-2 models in response to single- and multi-hop prompts. We then propose a mechanism that allows users to inject pertinent, prompt-specific information, which we refer to as "memories," at critical LLM locations during inference. By enabling the LLM to incorporate additional relevant information during inference, we improve the quality of multi-hop prompt completions. We show empirically that a simple, efficient, and targeted memory injection into a key attention layer can often increase the probability of the desired next token in multi-hop tasks by up to 424%.
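To make the mechanism concrete, here is a minimal sketch of the memory-injection idea in PyTorch with Hugging Face `transformers`: during inference, the embedding of a prompt-relevant "memory" token is added to the output of one GPT-2 attention block via a forward hook. The layer index, scaling factor, memory string, and injection at the final token position are all illustrative assumptions, not the paper's tuned procedure, which identifies critical locations by analyzing per-layer activations.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

LAYER = 6    # hypothetical injection layer; the paper locates key layers empirically
SCALE = 4.0  # hypothetical injection magnitude

# Encode the "memory" (here, the implicit intermediate answer "France")
# with the model's own token embeddings and pool it into one vector.
memory_ids = tok(" France", return_tensors="pt").input_ids
memory_vec = model.transformer.wte(memory_ids).sum(dim=1)  # shape: (1, hidden)

def inject_memory(module, inputs, output):
    # GPT2Attention returns a tuple whose first element is the attention
    # output of shape (batch, seq_len, hidden). Add the memory vector to
    # the representation at the final token position.
    hidden = output[0]
    hidden[:, -1, :] = hidden[:, -1, :] + SCALE * memory_vec.to(hidden.dtype)
    return (hidden,) + tuple(output[1:])

handle = model.transformer.h[LAYER].attn.register_forward_hook(inject_memory)

prompt = "The capital of the country where the Eiffel Tower is located is"
input_ids = tok(prompt, return_tensors="pt").input_ids
with torch.no_grad():
    logits = model(input_ids).logits[0, -1]
handle.remove()

print(tok.decode([logits.argmax().item()]))  # ideally " Paris"
```

The hook pattern leaves the base model untouched: removing the handle restores ordinary inference, so the same model can be probed with and without the injected memory to compare next-token probabilities.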
- Mansi Sakarvadia
- Aswathy Ajith
- Arham Khan
- Daniel Grzenda
- Nathaniel Hudson
- André Bauer
- Kyle Chard
- Ian Foster