Query Rewriting for Retrieval-Augmented Large Language Models (2305.14283v3)

Published 23 May 2023 in cs.CL

Abstract: LLMs serve as powerful, black-box readers in the retrieve-then-read pipeline, making remarkable progress in knowledge-intensive tasks. This work introduces Rewrite-Retrieve-Read, a new framework that replaces the previous retrieve-then-read pipeline for retrieval-augmented LLMs from the perspective of query rewriting. Unlike prior studies that focus on adapting either the retriever or the reader, our approach attends to the adaptation of the search query itself, for there is inevitably a gap between the input text and the knowledge needed for retrieval. We first prompt an LLM to generate the query, then use a web search engine to retrieve contexts. Furthermore, to better align the query to the frozen modules, we propose a trainable scheme for our pipeline: a small LLM is adopted as a trainable rewriter to cater to the black-box LLM reader. The rewriter is trained using the feedback of the LLM reader by reinforcement learning. Evaluation is conducted on downstream tasks: open-domain QA and multiple-choice QA. Experimental results show consistent performance improvements, indicating that our framework is effective and scalable and offers a new paradigm for retrieval-augmented LLMs.

Query Rewriting for Retrieval-Augmented LLMs: An Expert Overview

Introduction and Motivation

The paper presents a novel framework called Rewrite-Retrieve-Read designed to enhance retrieval-augmented LLMs. Traditional retrieval-augmented architectures follow a retrieve-then-read paradigm, where a retriever first selects relevant knowledge contexts that are subsequently processed by the reader. The paper identifies a significant limitation in this approach: the rigidity in query formulation often creates a mismatch between the input query and the requisite knowledge for accurate retrieval. This research introduces query rewriting as an intermediary step in the pipeline, facilitating more effective context retrieval and subsequent comprehension by the LLM reader.

Methodology

The proposed Rewrite-Retrieve-Read framework is structured in three primary phases (see the code sketch after this list):

  1. Query Rewriting: An LLM first generates a reformulated query that better aligns with the retriever's input needs, bridging the gap between the user input and the knowledge required for retrieval.
  2. Context Retrieval: The rewritten query is used to retrieve documents from an external knowledge source. This paper leverages a real-time web search engine, such as Bing, to fetch relevant data, circumventing the need to maintain a search index and allowing access to contemporary world knowledge.
  3. Reading and Comprehension: The collected contexts are then input into an LLM for reading comprehension and response generation.
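
To make the pipeline concrete, here is a minimal Python sketch of the three phases. The helper names (call_llm, web_search) and the rewriting prompt are illustrative placeholders, not APIs or exact prompts from the paper (which prompts a frozen LLM such as ChatGPT and queries Bing):

```python
# Minimal sketch of Rewrite-Retrieve-Read. call_llm and web_search are
# hypothetical placeholders for a black-box LLM API and a web search retriever.

REWRITE_PROMPT = (
    "Provide a better search query for a web search engine to answer the "
    "given question.\nQuestion: {question}\nQuery:"
)  # illustrative wording, not the paper's exact prompt

def call_llm(prompt: str) -> str:
    """Placeholder for a call to a black-box LLM (e.g. ChatGPT)."""
    raise NotImplementedError

def web_search(query: str, k: int = 5) -> list[str]:
    """Placeholder for a web search retriever (the paper uses Bing)."""
    raise NotImplementedError

def rewrite_retrieve_read(question: str) -> str:
    # 1. Query rewriting: reformulate the input into a search-friendly query.
    rewritten = call_llm(REWRITE_PROMPT.format(question=question))
    # 2. Retrieval: fetch contexts for the rewritten query from the web.
    contexts = web_search(rewritten)
    # 3. Reading: condition the reader LLM on the retrieved contexts.
    reader_prompt = "\n\n".join(contexts) + f"\n\nQuestion: {question}\nAnswer:"
    return call_llm(reader_prompt)
```

In the frozen setting, both calls go to the same black-box LLM; in the trainable setting described next, a small fine-tuned model takes over the rewriting step.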

The research further proposes a trainable scheme: a small language model serves as an adaptive rewriter, fine-tuned with reinforcement learning. Training uses the frozen LLM reader's feedback as the reward signal, so query formulations that lead to better downstream task performance are reinforced, aligning the rewriter with the static retriever-reader setup.
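
As a rough illustration of that feedback loop, the sketch below shows a single REINFORCE-style policy-gradient step. The paper itself trains with PPO and a richer reward (including a KL regularization term against the initial rewriter); rewriter.sample and retrieve_and_read are assumed interfaces here:

```python
import torch

def reward_fn(prediction: str, gold: str) -> float:
    """Hypothetical task reward: exact match on the reader's final answer.
    The paper's actual reward combines task metrics with a KL penalty."""
    return 1.0 if prediction.strip().lower() == gold.strip().lower() else 0.0

def reinforce_step(rewriter, optimizer, question, gold, retrieve_and_read):
    # Sample a rewritten query from the small trainable rewriter (the policy);
    # sample() is an assumed interface returning the text and its log-probability.
    query, log_prob = rewriter.sample(question)
    # Run the frozen retriever + reader and score the resulting answer.
    answer = retrieve_and_read(query, question)
    reward = reward_fn(answer, gold)
    # Policy gradient: increase the likelihood of queries that earn high reward.
    loss = -log_prob * reward
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The key design point is that gradients never flow through the retriever or the reader; only the small rewriter is updated, which keeps the scheme cheap and compatible with black-box readers.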

Experimental Results and Evaluation

The framework was evaluated on several knowledge-intensive tasks, specifically open-domain QA and multiple-choice QA. Experiments covered both frozen configurations (with LLM readers such as ChatGPT and Vicuna-13B) and the adaptive trainable rewriter.

Key findings indicate substantial performance improvements when query rewriting is integrated into the retrieval pipeline. Notably, on datasets such as HotpotQA and AmbigNQ, query rewriting provided a consistent boost in metrics such as Exact Match (EM) and F1 over the traditional retrieve-then-read scheme. The results substantiate the efficacy of query rewriting in enhancing the LLM's information retrieval, giving the reader richer context for more accurate answers.
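
For reference, EM and F1 here are the standard open-domain QA metrics; a minimal implementation (with simplified answer normalization; standard QA evaluation also strips articles and punctuation) looks like this:

```python
from collections import Counter

def normalize(text: str) -> str:
    """Simplified normalization: lowercase and collapse whitespace."""
    return " ".join(text.lower().split())

def exact_match(pred: str, gold: str) -> float:
    """1.0 if the normalized prediction equals the normalized gold answer."""
    return float(normalize(pred) == normalize(gold))

def f1_score(pred: str, gold: str) -> float:
    """Token-level F1 between prediction and gold answer."""
    pred_toks = normalize(pred).split()
    gold_toks = normalize(gold).split()
    common = Counter(pred_toks) & Counter(gold_toks)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_toks)
    recall = num_same / len(gold_toks)
    return 2 * precision * recall / (precision + recall)
```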

Theoretical and Practical Implications

From a theoretical perspective, the introduction of a query rewriting stage marks a shift in how LLMs can be enhanced through external augmentation. By recasting the task as a multi-step process with aligned intermediary adaptations, the paper sets a precedent for future work that seeks to bridge operational gaps within AI systems, particularly those involving black-box components.

Practically, the pipeline's inclusion of real-time web search engines as retrievers is notable for its operational scalability and flexibility, given the ever-changing nature of world knowledge. This application broadens the horizon of how LLMs could operate in dynamic contexts, offering potential areas for exploration where real-time data acquisition and processing are critical.

Future Prospects

Looking forward, the methodology opens exploratory avenues in several domains. For instance, optimizing similar rewriter models for interactive AI agents, or developing strategies that cater to different tool-based augmentations, could significantly expand their applicability. Additionally, investigating how different retrieval models might synergize with this approach could unlock further advances in AI-driven knowledge systems, enhancing both the depth and reliability of LLM outputs.

Overall, the research provides an empirically supported, theoretically sound method for improving the efficiency and accuracy of retrieval-augmented LLMs. The proposed framework not only highlights the need for innovation in query processing but also lays groundwork for aligning multiple models within modular AI architectures.

Authors (5)
  1. Xinbei Ma (19 papers)
  2. Yeyun Gong (78 papers)
  3. Pengcheng He (60 papers)
  4. Hai Zhao (227 papers)
  5. Nan Duan (172 papers)
Citations (62)