Query Expansion by Prompting LLMs
The paper "Query Expansion by Prompting LLMs" authored by Rolf Jagerman et al., presents an exploration into leveraging the generative capabilities of LLMs for the task of query expansion in Information Retrieval (IR). Unlike traditional methods that rely on Pseudo-Relevance Feedback (PRF), this work utilizes LLMs to generate query expansions by examining their inherent knowledge and generative prowess. In particular, the paper emphasizes the efficacy of Chain-of-Thought (CoT) prompts, which guide LLMs to break down queries sequentially, leading to enriched query expansions.
Methodological Contributions
The authors propose several prompting strategies to enhance query expansion using LLMs, including zero-shot, few-shot, and CoT prompts. These approaches exploit the LLM's intrinsic abilities and require no task-specific training or fine-tuning:
- Prompt Variations: The paper studies several prompt types, including Query-to-Document (Q2D), Query-to-Expansion (Q2E), their PRF-augmented variants, and CoT. Each prompt elicits a different kind of structured output, from pseudo-documents to keyword lists to step-by-step rationales, that can be folded back into the query (see the sketch after this list).
- PRF Incorporation: Whereas classical query expansion depends heavily on the quality of the PRF documents, supplying PRF documents as context in the prompt lets the model distill informative expansions from them even when the initial retrieval is imperfect.
- Model Size Influence: The paper evaluates models of different sizes to assess how performance scales, offering insight into cost-effective model choices for deployment.
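To make the prompt styles concrete, here is a minimal Python sketch of the main templates and of folding the generated text back into the query for a sparse retriever such as BM25. The prompt wordings are paraphrases rather than the paper's exact templates, `generate_text` stands in for any LLM completion call, and the query-repetition factor is an assumption.

```python
# Illustrative paraphrases of the prompt styles described above; not the
# paper's exact templates. `generate_text` is a placeholder for any LLM
# completion call (prompt string in, generated string out).

def q2d_prompt(query: str) -> str:
    # Query-to-Document: ask the model for a passage that answers the query.
    return f"Write a passage that answers the following query: {query}"

def q2e_prompt(query: str) -> str:
    # Query-to-Expansion: ask the model for related keywords directly.
    return f"Write a list of keywords related to the following query: {query}"

def cot_prompt(query: str) -> str:
    # Chain-of-Thought: ask for step-by-step reasoning before the answer;
    # the verbose rationale tends to contain many useful expansion terms.
    return f"Answer the following query: {query}\nGive the rationale before answering."

def q2d_prf_prompt(query: str, prf_docs: list[str]) -> str:
    # PRF-augmented variant: include the top initially retrieved documents
    # as extra context for the model to draw expansion terms from.
    context = "\n".join(prf_docs)
    return f"Using the following passages:\n{context}\nWrite a passage that answers the query: {query}"

def expand_query(query: str, generate_text, prompt_fn=cot_prompt, repeats: int = 5) -> str:
    # Concatenate the original query (repeated so its terms keep a higher
    # weight in BM25-style retrieval; the factor of 5 is an assumption)
    # with the model's generated expansion text.
    expansion = generate_text(prompt_fn(query))
    return " ".join([query] * repeats + [expansion])
```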
Experimental Evaluation
The authors conducted extensive experiments on the MS-MARCO passage-ranking dataset and the BEIR benchmark to compare the proposed methods against classical query expansion techniques. The core metric was Recall@1K, with MRR@10 and NDCG@10 providing additional perspectives on retrieval quality (a brief sketch of the first two metrics follows the list of findings below). The key findings include:
- Effectiveness of CoT Prompts: Across experiments, CoT prompts consistently delivered superior performance, particularly in enhancing recall while maintaining or improving precision.
- Balanced Metrics: Unlike some traditional methods that improve recall at the cost of precision, LLM-based query expansions, especially with CoT, managed to enhance retrieval effectiveness across both dimensions.
- Scalability with Model Size: Larger models generally performed better, but CoT/PRF prompts offered an effective balance for medium-sized models, pointing to a practical cost/quality trade-off for deployment.
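As a point of reference for the metrics above, the following sketch shows how Recall@1K and MRR@10 are typically computed for a single query; `ranked_ids` and `relevant_ids` are assumed inputs (the ranked list of retrieved document IDs and the set of judged-relevant IDs), and NDCG@10 is omitted for brevity.

```python
def recall_at_k(ranked_ids, relevant_ids, k=1000):
    # Recall@K: fraction of relevant documents that appear in the top-K results.
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)

def mrr_at_k(ranked_ids, relevant_ids, k=10):
    # MRR@K: reciprocal rank of the first relevant document in the top-K;
    # the reported score is the mean of this value over all queries.
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0
```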
Implications and Future Directions
The implications of this work are significant for IR systems that seek to improve recall without sacrificing precision. By harnessing LLMs, the approach avoids the dependency on high-quality initial retrieval results or exhaustive domain-specific knowledge bases. This could be transformative for domains where comprehensive knowledge bases are unavailable or costly to build.
Future research could examine LLM-based expansions in dense retrieval settings, where the vocabulary mismatch that query expansion targets is less pronounced. Exploring other LLM architectures and fine-tuning might also reveal more efficient strategies. Integrating such a query expansion mechanism into production-scale systems is another promising direction, particularly through model distillation to reduce computational overhead.
Overall, the paper contributes significantly to the IR field by showing how modern advances in large language modeling can address the classical challenge of query expansion with an innovative, general solution framework.