
Guiding Retrieval using LLM-based Listwise Rankers (2501.09186v1)

Published 15 Jan 2025 in cs.IR and cs.AI

Abstract: LLMs have shown strong promise as rerankers, especially in "listwise" settings where an LLM is prompted to rerank several search results at once. However, this "cascading" retrieve-and-rerank approach is limited by the bounded recall problem: relevant documents not retrieved initially are permanently excluded from the final ranking. Adaptive retrieval techniques address this problem, but do not work with listwise rerankers because they assume a document's score is computed independently from other documents. In this paper, we propose an adaptation of an existing adaptive retrieval method that supports the listwise setting and helps guide the retrieval process itself (thereby overcoming the bounded recall problem for LLM rerankers). Specifically, our proposed algorithm merges results both from the initial ranking and feedback documents provided by the most relevant documents seen up to that point. Through extensive experiments across diverse LLM rerankers, first stage retrievers, and feedback sources, we demonstrate that our method can improve nDCG@10 by up to 13.23% and recall by 28.02%--all while keeping the total number of LLM inferences constant and overheads due to the adaptive process minimal. The work opens the door to leveraging LLM-based search in settings where the initial pool of results is limited, e.g., by legacy systems, or by the cost of deploying a semantic first-stage.

An Analysis of "Guiding Retrieval Using LLM-based Listwise Rankers"

The paper "Guiding Retrieval Using LLM-based Listwise Rankers" addresses a notable challenge in document retrieval systems: the bounded recall problem inherent in traditional two-stage retrieve-and-rerank pipelines. The primary assertion of this research is the development of a novel adaptive retrieval methodology tailored for listwise rerankers, particularly those leveraging LLMs, to ameliorate this pivotal retrieval constraint.

Key Contributions

The authors propose an adaptation of existing adaptive retrieval methods tailored to the listwise setting used by LLM rerankers such as RankZephyr and RankVicuna. The paper introduces the SlideGar algorithm, which merges results from the initial retrieval with feedback documents drawn from the most relevant documents seen so far. By dynamically guiding the retrieval process in this way, the method mitigates the bounded recall problem, allowing relevant documents missed by the first stage to still surface in the final reranked list.
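To make the mechanism concrete, the sketch below shows one way such an adaptive listwise loop can be organized in Python. It is an illustration of the merge-and-feedback idea rather than the authors' exact SlideGar implementation: rerank_window (a single listwise LLM call) and neighbors (a feedback source such as a precomputed corpus graph) are hypothetical placeholders.

```python
from collections import deque
from typing import Callable, Iterable, List

def adaptive_listwise_rerank(
    initial_ranking: List[str],
    rerank_window: Callable[[List[str]], List[str]],
    neighbors: Callable[[str], Iterable[str]],
    window_size: int = 20,
    budget: int = 100,
) -> List[str]:
    """Illustrative sketch of guiding retrieval with a listwise reranker.

    Candidates for each LLM window are drawn alternately from the initial
    ranking and from a feedback pool seeded by the best documents seen so
    far (e.g. neighbours in a precomputed corpus graph). The number of LLM
    calls is fixed at budget // window_size, mirroring the paper's goal of
    keeping inference cost constant.
    """
    frontier = deque(initial_ranking)   # first-stage results, best first
    feedback: deque = deque()           # documents pulled in via feedback
    seen: set = set()
    results: List[str] = []

    for _ in range(budget // window_size):        # fixed LLM-call budget
        window: List[str] = []
        take_feedback = False
        while len(window) < window_size and (frontier or feedback):
            # Alternate between the two candidate sources when both are available.
            source = feedback if (take_feedback and feedback) else (frontier or feedback)
            take_feedback = not take_feedback
            doc = source.popleft()
            if doc not in seen:
                seen.add(doc)
                window.append(doc)
        if not window:
            break

        ordered = rerank_window(window)           # one listwise LLM inference
        results.extend(ordered)

        # Use the window's top documents as feedback to expand the candidate pool.
        for doc in ordered[: max(1, window_size // 4)]:
            feedback.extend(n for n in neighbors(doc) if n not in seen)

    return results
```

In full sliding-window listwise reranking, consecutive windows typically overlap so that strong documents can move up across window boundaries; the sketch omits this to keep the control flow visible.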

Experimental Insights

The empirical findings show that SlideGar can substantially improve retrieval effectiveness:

  • Improvements in nDCG@10 of up to 13.23%.
  • Increases in recall of up to 28.02%, recovering relevant documents that the first-stage retriever had missed.

The evaluation spans diverse LLM rerankers, first-stage retrievers such as BM25 and TCT, and multiple feedback sources, and covers the MS MARCO and TREC Deep Learning benchmarks. The results consistently indicate that integrating adaptive retrieval with LLM-based listwise rerankers yields substantial effectiveness gains while keeping the number of LLM inferences constant.
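For reference, the snippet below shows one common way to compute the headline metric, nDCG@10 (linear gain, log2 discount), from graded relevance judgments; the document ids and judgments are made up for illustration and are not drawn from the paper's experiments.

```python
import math
from typing import Dict, List

def ndcg_at_k(ranking: List[str], qrels: Dict[str, int], k: int = 10) -> float:
    """Normalised discounted cumulative gain at cutoff k (linear gain)."""
    def dcg(gains: List[int]) -> float:
        return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))

    gains = [qrels.get(doc, 0) for doc in ranking]
    ideal = sorted(qrels.values(), reverse=True)
    return dcg(gains) / dcg(ideal) if any(ideal) else 0.0

# Toy example with made-up graded judgments (0-3).
qrels = {"d1": 3, "d2": 2, "d3": 1}
print(round(ndcg_at_k(["d2", "d1", "d4", "d3"], qrels), 3))
```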

Theoretical and Practical Implications

Theoretically, this work extends the adaptive retrieval paradigm to listwise ranking models that produce a relative ordering of documents rather than independent scores. This departs from the Probability Ranking Principle (PRP) assumption underlying earlier adaptive retrieval methods: relevance is no longer estimated for each document in isolation but emerges from inter-document comparisons, which is precisely how listwise LLM rerankers process a window of results. The contrast is sketched below.
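A minimal sketch of the interface difference, using hypothetical type signatures rather than any particular library's API:

```python
from typing import List, Protocol

class PointwiseScorer(Protocol):
    """Scores each document independently, as the PRP-style assumption
    behind classic adaptive retrieval requires."""
    def score(self, query: str, doc: str) -> float: ...

class ListwiseReranker(Protocol):
    """Orders a whole candidate list at once; there is no standalone
    per-document score, so results from separate calls cannot simply be
    merged by score."""
    def rerank(self, query: str, docs: List[str]) -> List[str]: ...
```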

Practically, the SlideGar algorithm makes LLM-based reranking more robust in environments where the initial candidate pool is constrained, for example by legacy first-stage systems or by the cost of deploying a semantic retriever. This offers a pathway for deploying LLM rerankers more broadly, even when the first stage lacks the expressive power to surface all relevant documents on its own.

Future Directions

The paper's findings open several avenues for further research. Future work could investigate how SlideGar scales to larger corpora and different retrieval tasks. Exploring other feedback signals and ranking techniques that complement listwise reranking could further strengthen adaptive retrieval systems.

In summary, "Guiding Retrieval Using LLM-based Listwise Rankers" advances the utility and performance of retrieval systems by integrating adaptive retrieval techniques with LLM-based listwise reranking. Its gains in recall, and the shift away from independent per-document scoring that it entails, invite continued work on adaptive, LLM-guided retrieval.

Authors (3)
  1. Mandeep Rathee (8 papers)
  2. Sean MacAvaney (75 papers)
  3. Avishek Anand (81 papers)