Pseudo-Relevance Feedback in Zero-Shot Dense Retrieval Using LLMs
This paper investigates the effectiveness of pseudo-relevance feedback (PRF) for zero-shot dense retrieval with LLMs. The authors propose an approach named "PromptPRF," which builds on the PromptReps method to enrich query representations and improve retrieval performance.
Methodology Overview
The core of the approach is to use LLMs to extract salient features from the top-ranked documents returned by an initial retrieval pass. These features range from keywords and summaries to richer constructs such as entities and essays. The extracted features are then used to refine the query representation for dense retrieval, all within a zero-shot paradigm. Because feature extraction runs offline over the corpus, it adds no query-time latency, which is a significant advantage for resource-constrained deployments.
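To make the feature-extraction step concrete, here is a minimal sketch of what prompt templates for passage-level features might look like. The template wording and the `build_prompt` helper are hypothetical illustrations, not the paper's actual templates.

```python
# Hypothetical prompt templates for LLM-based passage feature extraction.
# The paper defines its own templates; these are illustrative stand-ins.
FEATURE_PROMPTS = {
    "keywords": (
        "List the most important keywords in the following passage.\n\n"
        "Passage: {passage}\n\nKeywords:"
    ),
    "summary": (
        "Summarize the following passage in one sentence.\n\n"
        "Passage: {passage}\n\nSummary:"
    ),
    "entities": (
        "List the named entities mentioned in the following passage.\n\n"
        "Passage: {passage}\n\nEntities:"
    ),
}

def build_prompt(feature_type: str, passage: str) -> str:
    """Fill the template for the requested feature type."""
    return FEATURE_PROMPTS[feature_type].format(passage=passage)
```

In practice, each top-ranked passage would be run through one or more of these prompts offline, and the LLM's completions stored as the passage's features.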
PromptPRF integrates the following critical components into its framework:
- Initial Retrieval: Queries are embedded with an LLM and used for dense retrieval, with no additional training.
- Feature Extraction: LLMs generate passage-level features based on pre-defined prompt templates, enhancing context without introducing excessive noise.
- Query Refinement: The refined query incorporates the features from pseudo-relevant documents, thus improving the retrieval accuracy in subsequent stages.
- Second-Stage Retrieval: The refined query representation is used to re-rank passages, leveraging the contextualized information from PRF.
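The four steps above can be sketched end to end as follows. This is a toy illustration under loose assumptions: `embed` and `extract_features` are placeholders standing in for LLM calls, and the refined query is formed by simple text concatenation, which is only one possible way to fold PRF features into the query representation.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder for an LLM-derived dense embedding (toy hash-seeded vector)."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(8)
    return v / np.linalg.norm(v)

def extract_features(passage: str) -> str:
    """Placeholder for offline LLM feature extraction (keywords, summary, ...)."""
    return " ".join(passage.split()[:5])  # toy 'keyword' feature

def prompt_prf(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank the corpus for the query using a two-stage PRF loop."""
    doc_embs = np.stack([embed(d) for d in corpus])

    # 1) Initial retrieval with the plain query embedding.
    q = embed(query)
    first_pass = np.argsort(doc_embs @ q)[::-1]

    # 2)-3) Extract features from the top-k pseudo-relevant documents
    #        and refine the query representation with them.
    feats = " ".join(extract_features(corpus[i]) for i in first_pass[:k])
    q_refined = embed(query + " " + feats)

    # 4) Second-stage retrieval with the refined query.
    second_pass = np.argsort(doc_embs @ q_refined)[::-1]
    return [corpus[i] for i in second_pass]
```

Since the feature strings are precomputed offline, only the cheap query-side embedding and two similarity scans happen at query time.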
Experimental Findings
The experiments use the TREC Deep Learning 2019 and 2020 benchmarks. Key observations include:
- Incorporating PRF significantly improves retrieval effectiveness, particularly for smaller dense retrievers: a small model with PRF can match the effectiveness of a larger model without it.
- On TREC DL'19, PromptPRF improves nDCG@10 from 0.3695 to 0.5013 for the Llama3.2-3B dense retriever, nearly achieving parity with larger models.
- Smaller models benefit notably from larger feature extractors, indicating the importance of context-rich feature generation. However, diminishing returns are apparent when scaling extractor size for already large dense retrieval models.
Implications and Future Directions
The practical implications are substantial in scenarios where computational resources are constrained. PRF allows smaller models, and therefore reduced hardware requirements, in production, which benefits real-time applications such as conversational search. Because PRF feature extraction runs offline, the approach also suits latency-sensitive deployments.
Theoretically, the paper challenges the common scaling assumption that dense retrieval effectiveness is driven primarily by model size. By using retrieval strategies more intelligently, this research outlines a path for smaller models to bridge the gap traditionally occupied by larger, more resource-intensive configurations.
Future work proposed by the authors includes tuning various aspects of the approach, such as examining the optimal PRF depth and combining multiple PRF feature types to further improve effectiveness.
Overall, this research advances dense retrieval by harnessing LLM capabilities to deliver richer query representations through strategic use of pseudo-relevance feedback.