Can't Remember Details in Long Documents? You Need Some R&R
Abstract: Long-context LLMs hold promise for tasks such as question answering (QA) over long documents, but they tend to miss important information in the middle of the context (arXiv:2307.03172v3). Here we introduce $\textit{R&R}$ -- a combination of two novel prompt-based methods, $\textit{reprompting}$ and $\textit{in-context retrieval}$ (ICR) -- to alleviate this effect in document-based QA. In reprompting, we repeat the prompt instructions periodically throughout the context document to remind the LLM of its original task. In ICR, rather than instructing the LLM to answer the question directly, we instruct it to retrieve the top-$k$ passage numbers most relevant to the question; these passages then serve as an abbreviated context in a second QA prompt. We test R&R with GPT-4 Turbo and Claude-2.1 on documents up to 80k tokens long and observe an average 16-point boost in QA accuracy. Our further analysis suggests that R&R improves long document-based QA because it reduces the distance between the relevant context and the instructions. Finally, we show that, compared with short-context chunkwise methods, R&R enables the use of larger chunks that require fewer LLM calls and output tokens, while minimizing the drop in accuracy.
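Both methods are simple prompt-level transformations, so they can be illustrated directly in code. Below is a minimal Python sketch of how they might fit together, assuming a hypothetical `llm(prompt)` completion function standing in for a GPT-4 Turbo or Claude-2.1 API call; the numbered-passage chunking, the reminder frequency, and the prompt wording are illustrative assumptions, not the paper's exact templates.

```python
import re

def build_reprompted_context(passages, instructions, every_n=10):
    """Number the passages and interleave the task instructions
    periodically (reprompting), reminding the model of its original
    task deep inside a long context."""
    lines = []
    for i, passage in enumerate(passages, start=1):
        if i % every_n == 1:  # repeat the instructions every `every_n` passages
            lines.append(f"[Reminder] {instructions}")
        lines.append(f"Passage {i}: {passage}")
    return "\n\n".join(lines)

def in_context_retrieval(llm, passages, question, k=3):
    """Two-stage in-context retrieval (ICR): first ask for the k most
    relevant passage numbers, then answer over only those passages."""
    instructions = (
        f"List the numbers of the {k} passages most relevant to this "
        f"question: {question}"
    )
    stage1 = llm(instructions + "\n\n" +
                 build_reprompted_context(passages, instructions))
    # Naive parse: assumes the reply lists bare passage numbers.
    picked = [int(n) for n in re.findall(r"\d+", stage1)][:k]
    shortlist = "\n\n".join(
        f"Passage {n}: {passages[n - 1]}"
        for n in picked if 1 <= n <= len(passages)
    )
    return llm(
        f"Answer the question using only the passages below.\n"
        f"Question: {question}\n\n{shortlist}"
    )
```

In this sketch, only the retrieval call scans the full reprompted document, while the second QA call sees just the short retrieved context -- consistent with the paper's analysis that the gains come from reducing the distance between the relevant context and the instructions.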
References
- Extending context window of large language models via positional interpolation. arXiv preprint arXiv:2306.15595.
- LongLoRA: Efficient fine-tuning of long-context large language models. arXiv preprint arXiv:2309.12307.
- The curious case of neural text degeneration. In International Conference on Learning Representations.
- Natural questions: A benchmark for question answering research. Transactions of the Association for Computational Linguistics, 7:452–466.
- Retrieval-augmented generation for knowledge-intensive NLP tasks. In NeurIPS.
- Pretrained Transformers for Text Ranking: BERT and Beyond. Springer Nature.
- Lost in the middle: How language models use long contexts. arXiv preprint arXiv:2307.03172.
- Amirkeivan Mohtashami and Martin Jaggi. 2023. Landmark attention: Random-access infinite context length for transformers. arXiv preprint arXiv:2305.16300.
- Know what you don’t know: Unanswerable questions for SQuAD. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 784–789, Melbourne, Australia. Association for Computational Linguistics.
- Parallel context windows for large language models. In ACL.
- On position bias in summarization with large language models. arXiv preprint arXiv:2310.10570.
- Recipes for building an open-domain chatbot. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 300–325, Online. Association for Computational Linguistics.
- RoFormer: Enhanced transformer with rotary position embedding. arXiv preprint arXiv:2104.09864.
- Found in the middle: Permutation self-consistency improves listwise ranking in large language models. arXiv preprint arXiv:2310.07712.
- Focused transformer: Contrastive training for context scaling. arXiv preprint arXiv:2307.03170.
- Neural text generation with unlikelihood training. In International Conference on Learning Representations.
- Jason Weston and Sainbayar Sukhbaatar. 2023. System 2 attention (is something you might need too). arXiv preprint arXiv:2311.11829.
- Retrieval meets long context large language models. In The Twelfth International Conference on Learning Representations.
- Re-reading improves reasoning in language models. In International Conference on Learning Representations.
- HotpotQA: A dataset for diverse, explainable multi-hop question answering. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 2369–2380, Brussels, Belgium. Association for Computational Linguistics.