Found in the Middle: Permutation Self-Consistency Improves Listwise Ranking in Large Language Models (2310.07712v2)

Published 11 Oct 2023 in cs.CL and cs.LG

Abstract: LLMs exhibit positional bias in how they use context, which especially complicates listwise ranking. To address this, we propose permutation self-consistency, a form of self-consistency over ranking list outputs of black-box LLMs. Our key idea is to marginalize out different list orders in the prompt to produce an order-independent ranking with less positional bias. First, given some input prompt, we repeatedly shuffle the list in the prompt and pass it through the LLM while holding the instructions the same. Next, we aggregate the resulting sample of rankings by computing the central ranking closest in distance to all of them, marginalizing out prompt order biases in the process. Theoretically, we prove the robustness of our method, showing convergence to the true ranking in the presence of random perturbations. Empirically, on five list-ranking datasets in sorting and passage reranking, our approach improves scores from conventional inference by up to 7-18% for GPT-3.5 and 8-16% for LLaMA v2 (70B), surpassing the previous state of the art in passage reranking. Our code is at https://github.com/castorini/perm-sc.

Authors (5)
  1. Raphael Tang (32 papers)
  2. Xinyu Zhang (296 papers)
  3. Xueguang Ma (36 papers)
  4. Jimmy Lin (208 papers)
  5. Ferhan Ture (14 papers)
Citations (9)