Context Tuning for Retrieval Augmented Generation (2312.05708v1)

Published 9 Dec 2023 in cs.IR, cs.AI, and cs.LG

Abstract: LLMs have the remarkable ability to solve new tasks with just a few examples, but they need access to the right tools. Retrieval Augmented Generation (RAG) addresses this problem by retrieving a list of relevant tools for a given task. However, RAG's tool retrieval step requires all necessary information to be explicitly present in the query. This is a limitation, as semantic search, the widely adopted tool retrieval method, can fail when the query is incomplete or lacks context. To address this limitation, we propose Context Tuning for RAG, which employs a smart context retrieval system to fetch relevant information that improves both tool retrieval and plan generation. Our lightweight context retrieval model uses numerical, categorical, and habitual usage signals to retrieve and rank context items. Our empirical results demonstrate that context tuning significantly enhances semantic search, achieving a 3.5-fold and 1.5-fold improvement in Recall@K for context retrieval and tool retrieval tasks, respectively, and resulting in an 11.6% increase in LLM-based planner accuracy. Additionally, we show that our proposed lightweight model using Reciprocal Rank Fusion (RRF) with LambdaMART outperforms GPT-4 based retrieval. Moreover, we observe that context augmentation at plan generation, even after tool retrieval, reduces hallucination.
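A minimal sketch (not the authors' implementation) of the two pieces named in the abstract: Reciprocal Rank Fusion to combine a semantic-search ranking with a LambdaMART ranking of candidate context items, and the Recall@K metric used to evaluate retrieval. The item names, the k=60 smoothing constant, and the relevance labels are illustrative assumptions.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of item ids into a single ranking.

    Each input list is ordered best-first; RRF scores item d as
    sum over rankers of 1 / (k + rank of d in that ranker).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant items that appear in the top-k retrieved items."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & set(relevant)) / len(relevant)

# Hypothetical candidate context items: one ranking from semantic search,
# one from a LambdaMART-style ranker over contextual signals.
semantic_ranking = ["photo_album", "calendar_event", "contact_card", "recent_note"]
lambdamart_ranking = ["contact_card", "calendar_event", "recent_note", "photo_album"]

fused = reciprocal_rank_fusion([semantic_ranking, lambdamart_ranking])
print(fused)  # fused ranking, best first
print(recall_at_k(fused, relevant={"contact_card", "calendar_event"}, k=2))
```

In practice the LambdaMART side could be trained with, for example, LightGBM's lambdarank objective over the numerical, categorical, and habitual-usage features the abstract describes; that training step is assumed here and not shown.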

Authors (4)
  1. Raviteja Anantha (13 papers)
  2. Tharun Bethi (1 paper)
  3. Danil Vodianik (1 paper)
  4. Srinivas Chappidi (7 papers)
Citations (8)