Zero-Shot Dense Retrieval with Embeddings from Relevance Feedback (2410.21242v1)

Published 28 Oct 2024 in cs.IR, cs.AI, cs.CL, and cs.LG

Abstract: Building effective dense retrieval systems remains difficult when relevance supervision is not available. Recent work has looked to overcome this challenge by using an LLM to generate hypothetical documents that can be used to find the closest real document. However, this approach relies solely on the LLM to have domain-specific knowledge relevant to the query, which may not be practical. Furthermore, generating hypothetical documents can be inefficient as it requires the LLM to generate a large number of tokens for each query. To address these challenges, we introduce Real Document Embeddings from Relevance Feedback (ReDE-RF). Inspired by relevance feedback, ReDE-RF proposes to re-frame hypothetical document generation as a relevance estimation task, using an LLM to select which documents should be used for nearest neighbor search. Through this re-framing, the LLM no longer needs domain-specific knowledge but only needs to judge what is relevant. Additionally, relevance estimation only requires the LLM to output a single token, thereby improving search latency. Our experiments show that ReDE-RF consistently surpasses state-of-the-art zero-shot dense retrieval methods across a wide range of low-resource retrieval datasets while also making significant improvements in latency per-query.

Summary

  • The paper introduces ReDE-RF, which replaces hypothetical document generation with LLM-driven relevance estimation for improved dense retrieval.
  • It combines a hybrid sparse-dense retrieval model with single-token relevance feedback, significantly reducing latency and resource dependency.
  • Experimental results show up to 14% performance improvement and enhanced efficiency in low-resource settings, highlighting its practical scalability.

Zero-Shot Dense Retrieval with Embeddings from Relevance Feedback

The paper "Zero-Shot Dense Retrieval with Embeddings from Relevance Feedback" presents a novel approach to enhancing dense retrieval systems in the absence of relevance supervision. The authors introduce Real Document Embeddings from Relevance Feedback (ReDE-RF), a method that optimizes dense retrieval by reframing hypothetical document generation as a task of relevance estimation.

Background and Problem Statement

Dense retrieval has shown superior performance over traditional exact-term matching methods such as BM25, especially with the advent of transformer-based models. However, constructing effective dense retrieval systems without substantial labeled data remains challenging. Prior work such as HyDE addressed zero-shot retrieval by generating hypothetical documents with an LLM, but this approach depends heavily on the LLM's parametric knowledge of the target domain and is inefficient, since every query requires generating a lengthy hypothetical document.

Proposed Method: ReDE-RF

ReDE-RF pivots from generating hypothetical documents to estimating relevance using LLMs:

  1. Initial Retrieval: It begins by leveraging a hybrid sparse-dense retrieval model to retrieve a set of initial candidate documents.
  2. Relevance Feedback: An LLM is employed to judge these documents' relevance, eliminating the necessity for domain-specific document generation.
  3. Query Representation Update: Based on the relevance feedback, the query representation is updated using the embeddings of real documents judged relevant. Because relevance estimation requires the LLM to output only a single token per document, this step is far cheaper than generating a lengthy hypothetical document.
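
The three steps above can be sketched in a few lines. The following is an illustrative Python sketch, not the authors' implementation: all names are hypothetical, `is_relevant` stands in for the single-token LLM relevance judgment, plain inner-product ranking stands in for the hybrid sparse-dense first stage, and a Rocchio-style mean over the relevant documents' embeddings is assumed as the query-update rule.

```python
# Hypothetical sketch of the ReDE-RF loop. `is_relevant` stands in for the
# single-token LLM relevance judgment; averaging relevant-document
# embeddings is an assumed instantiation of the query-update step.

def dot(a, b):
    """Inner-product similarity between two embedding vectors."""
    return sum(x * y for x, y in zip(a, b))

def mean(vectors):
    """Component-wise average of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def rede_rf_query_embedding(query_emb, corpus, is_relevant, top_k=3):
    """corpus: list of (doc_id, embedding) pairs.
    is_relevant: doc_id -> bool, a stand-in for the LLM judge."""
    # 1. Initial retrieval: rank candidates by similarity to the query.
    ranked = sorted(corpus, key=lambda d: dot(query_emb, d[1]), reverse=True)
    candidates = ranked[:top_k]
    # 2. Relevance feedback: keep only documents the judge accepts.
    relevant = [emb for doc_id, emb in candidates if is_relevant(doc_id)]
    # 3. Query update: mean of the relevant real-document embeddings;
    #    fall back to the original query embedding if none pass.
    return mean(relevant) if relevant else query_emb
```

The updated embedding would then drive a second nearest-neighbor search over the corpus; the fallback to the original query keeps the method well-defined when the judge rejects every candidate.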

Experimental Results

ReDE-RF demonstrates substantial improvements over existing zero-shot dense retrieval methods across various datasets, particularly in low-resource environments. In specific benchmarks, ReDE-RF surpasses methods like HyDE by up to 14% without the need for relevance supervision. Furthermore, latency per query is significantly reduced, reinforcing the practical benefits of the approach in real-time applications.

Implications and Future Directions

ReDE-RF addresses a key limitation of LLM-reliant dense retrieval methods: their dependence on the LLM's parametric knowledge, which is especially problematic for out-of-domain corpora. Its practical appeal lies in improved efficiency and adaptability across domains, since no lengthy LLM generation is needed per query.

Future directions could include reducing ReDE-RF's dependency on the quality of the initial retrieval, for example by refining the relevance feedback mechanism or integrating stronger first-stage retrieval strategies. Furthermore, distilling ReDE-RF into smaller, more efficient models, as demonstrated, offers a path to minimizing LLM requirements at inference time, making widespread adoption more feasible.

ReDE-RF provides a compelling alternative to existing zero-shot retrieval strategies, promising enhanced performance and efficiency by leveraging real document embeddings informed by relevance feedback rather than hypothetical document generation.
