
CoRanking: Collaborative Ranking with Small and Large Ranking Agents (2503.23427v2)

Published 30 Mar 2025 in cs.CL and cs.IR

Abstract: LLMs have demonstrated superior listwise ranking performance. However, this performance often relies on large-scale parameters (e.g., GPT-4) and a repetitive sliding window process, which introduces significant efficiency challenges. In this paper, we propose CoRanking, a novel collaborative ranking framework that combines small and large ranking models for efficient and effective ranking. CoRanking first employs a small-size reranker to pre-rank all the candidate passages, bringing relevant ones to the top of the list (e.g., top-20). Then, the LLM listwise reranker is applied to rerank only these top-ranked passages instead of the whole list, substantially enhancing overall ranking efficiency. Although more efficient, previous studies have revealed that LLM listwise rerankers exhibit significant positional biases with respect to the order of input passages, so directly feeding in the top-ranked passages from the small reranker may yield sub-optimal performance. To alleviate this problem, we introduce a passage order adjuster trained via reinforcement learning, which reorders the top passages from the small reranker to align with the LLM's preferred passage order. Extensive experiments on three IR benchmarks demonstrate that CoRanking significantly improves efficiency (reducing ranking latency by about 70%) while achieving even better effectiveness than using only the LLM listwise reranker.

Authors (7)
  1. Wenhan Liu (5 papers)
  2. Xinyu Ma (49 papers)
  3. Yutao Zhu (63 papers)
  4. Lixin Su (15 papers)
  5. Shuaiqiang Wang (68 papers)
  6. Dawei Yin (165 papers)
  7. Zhicheng Dou (113 papers)

Summary

CoRanking: Collaborative Ranking with Small and Large Ranking Agents

The authors present a framework, "CoRanking," that addresses the computational inefficiency of LLMs in passage ranking tasks. The framework integrates small and large ranking models to balance efficiency and effectiveness. The core idea is to reduce reliance on the LLM for exhaustive listwise reranking by dividing the work between a small reranker and the LLM reranker.

The impetus for this research is a recognized limitation of LLMs such as GPT-4: despite their high ranking quality, they are inefficient because of their large parameter counts and the iterative nature of the sliding window approach. The resulting computational burden becomes a bottleneck, particularly when real-time application is a priority.
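To make the sliding-window cost concrete, here is a minimal sketch of the back-to-front windowed reranking commonly used with LLM listwise rerankers; `llm_rerank`, `window`, and `stride` are hypothetical placeholders, not components from the paper:

```python
def sliding_window_rerank(passages, llm_rerank, window=20, stride=10):
    """Rerank `passages` with overlapping back-to-front LLM calls."""
    ranked = list(passages)
    end = len(ranked)
    while end > 0:
        start = max(0, end - window)
        # Each window is a full LLM listwise call: for 100 passages with
        # window=20 and stride=10, that is 9 sequential LLM calls.
        ranked[start:end] = llm_rerank(ranked[start:end])
        if start == 0:
            break
        end -= stride
    return ranked
```

Every query thus triggers many sequential LLM calls over overlapping windows, which is the latency cost CoRanking targets.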

The CoRanking framework is structured as a three-stage process (a code sketch follows the list):

  1. Small Listwise Reranker (SLR): Initially, a small-sized reranker pre-screens candidate passages to elevate the most relevant ones to the top of the candidate list. This pre-ranking limits the set of passages that need to be processed by the LLM, reducing computational overhead.
  2. Passage Order Adjuster (POA): Because the LLM's reranking effectiveness is known to depend on the order of its input passages, a reinforcement-learning-based passage order adjuster is introduced. It rearranges the top passages to align with the LLM's positional preferences, counteracting the sub-optimality of feeding the small reranker's order in directly.
  3. LLM Listwise Reranker (LLR): Finally, the LLM is applied exclusively to the top-k passages identified by the preceding stages. This targeted application not only improves efficiency but also maintains, or even improves, reranking effectiveness compared to full-scale LLM listwise reranking.
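A minimal sketch of how the three stages compose, assuming hypothetical `small_rerank`, `order_adjuster`, and `llm_listwise_rerank` callables standing in for the SLR, POA, and LLR components (`top_k=20` mirrors the abstract's top-20 example):

```python
def coranking(query, passages, small_rerank, order_adjuster,
              llm_listwise_rerank, top_k=20):
    # Stage 1 (SLR): cheap pre-ranking of the full candidate list.
    pre_ranked = small_rerank(query, passages)
    head, tail = pre_ranked[:top_k], pre_ranked[top_k:]
    # Stage 2 (POA): reorder the head to match the LLM's preferred order.
    adjusted = order_adjuster(query, head)
    # Stage 3 (LLR): a single LLM listwise call over the head only,
    # instead of sliding a window over the entire candidate list.
    reranked_head = llm_listwise_rerank(query, adjusted)
    return reranked_head + tail
```

The LLM sees only the `top_k` head once, rather than every window of the full list, which is where the latency savings come from.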

The researchers validate CoRanking's efficacy through extensive experiments on three information retrieval (IR) benchmarks: TREC, BEIR, and BRIGHT. The numerical results are compelling: the framework reportedly reduces ranking latency by approximately 70% without sacrificing effectiveness, as demonstrated by superior NDCG@10 compared to using only the LLM listwise reranker.
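For reference, a hedged sketch of how NDCG@10, the effectiveness metric cited above, can be computed; this uses one common linear-gain DCG formulation, assumes graded relevance labels, and is not code from the paper:

```python
import math

def ndcg_at_k(relevances, k=10):
    """`relevances`: graded labels of the ranked list, best rank first."""
    def dcg(rels):
        # Linear gain with log2 position discount; an exponential-gain
        # variant (2**rel - 1) is also common in TREC evaluation.
        return sum(rel / math.log2(i + 2) for i, rel in enumerate(rels[:k]))
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```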

From a theoretical perspective, CoRanking offers meaningful insights into the interplay between ranking models of different sizes. It highlights the potential of leveraging smaller, less computationally intensive models to offset the burdens associated with larger models, facilitating more scalable and sustainable AI solutions.

Practically, this research has immediate implications for deploying AI solutions in environments where computational resources are limited or costly. By reducing the computational load without compromising performance, CoRanking enables the deployment of advanced ranking systems on a more flexible range of hardware infrastructures.

Looking forward, the principles underpinning CoRanking could catalyze further advancements in AI resource management. Future work might explore broader applications of this collaborative framework to other context-heavy AI tasks, optimizations for even greater efficiency, or adaptations to more diverse LLM architectures. By bridging the efficiency gap of LLMs, CoRanking sets a precedent for harmonizing model performance with practical deployment constraints in AI research and beyond.