
Recursive Tournament Voting (RTV)

Updated 23 April 2026
  • RTV is a recursive aggregation method that uses tournament-style elimination and pairwise or small-group comparisons to select a high-quality winner from complex candidate sets.
  • In its voting-tree form, a recursive construction guarantees a winner out-degree of at least Θ(√N); in its LLM form, it improves empirical pass@1 rates on agentic coding benchmarks.
  • RTV bridges theoretical social choice with practical LLM coding applications, demonstrating robustness against noise and manipulation through structured summary-based evaluations.

Recursive Tournament Voting (RTV) is a family of recursive aggregation procedures that select a single high-quality winner from a set of candidates or outputs by repeatedly applying tournament-style elimination. RTV is implemented both as a graph-theoretic device in social choice theory via voting trees on tournaments (Iglesias et al., 2012), and as a modern LLM inference-time method for ranking agentic coding rollouts based on summary representations (Kim et al., 16 Apr 2026). RTV combines recursive structure, pairwise or small-group comparisons, and summary-based selection to robustly extract winning candidates in domains where direct global comparison is infeasible due to combinatorial, contextual, or representational complexity.

1. Theoretical Foundations: Voting Trees on Tournaments

Let $T$ be a tournament on $N$ candidates, represented as a complete directed graph or as an adjacency matrix $A\in\{0,1\}^{N\times N}$. For $i\neq j$, $A_{ij}=1$ iff $i$ defeats $j$ in pairwise comparison, with $A_{ii}=0$. A voting tree $\Theta$ is a complete binary tree whose leaves are labeled (possibly with repetitions) from $\{1,\dots,N\}$. The evaluation is recursive:

  • A leaf labeled $i$ has value $i$.
  • For an internal node $v$ with children evaluating to $a$ and $b$, $v$'s label is the winner $w(a,b)$, defined as $a$ if $a=b$ or $A_{ab}=1$, and $b$ otherwise.
  • The root’s value $\Theta(T)$ is the tree’s selected candidate.

The function $T\mapsto\Theta(T)$ specifies the winner for each tournament $T$. Performance is quantified by the minimum out-degree, $\min_T \deg^{+}_T(\Theta(T))$, achieved by the winner across all tournaments.
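The recursive evaluation rule and the out-degree guarantee can be illustrated with a minimal Python sketch. It assumes candidates are numbered from 0 rather than 1, and the names `evaluate`, `all_tournaments`, and `guarantee` are hypothetical helpers, not taken from the cited work. The guarantee is computed by brute force over every tournament, which is only feasible for very small $N$.

```python
from itertools import product

def evaluate(tree, A):
    """Return the candidate selected by a voting tree on tournament A.

    A tree is either an int (a leaf label) or a pair (left, right).
    A[i][j] == 1 means candidate i defeats candidate j.
    """
    if isinstance(tree, int):
        return tree
    a = evaluate(tree[0], A)
    b = evaluate(tree[1], A)
    # Pairwise match: a advances if it meets itself or defeats b.
    return a if a == b or A[a][b] == 1 else b

def all_tournaments(n):
    """Yield every tournament on n candidates as an adjacency matrix."""
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    for bits in product((0, 1), repeat=len(pairs)):
        A = [[0] * n for _ in range(n)]
        for (i, j), bit in zip(pairs, bits):
            if bit:
                A[i][j] = 1
            else:
                A[j][i] = 1
        yield A

def guarantee(tree, n):
    """Minimum out-degree of the winner over all tournaments on n candidates."""
    return min(sum(A[evaluate(tree, A)]) for A in all_tournaments(n))

# A knockout bracket on four candidates and a 3-cycle tournament.
bracket = ((0, 1), (2, 3))
cycle = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]  # 0 beats 1, 1 beats 2, 2 beats 0

print(evaluate(((0, 1), 2), cycle))  # winner on the 3-cycle
print(guarantee((0, 1), 2))          # two-candidate tree
print(guarantee(bracket, 4))         # four-candidate bracket
```

The four-candidate bracket winner must defeat its first-round opponent and the other finalist, so its worst-case out-degree is 2 in this toy setting.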

2. Recursive Construction and Performance Guarantees

The canonical RTV structure is constructed inductively to optimize the out-degree guarantee:

  • Start from a binary tree for two candidates (guarantee 1).
  • Given a tree with a known guarantee, construct a tree for a larger candidate set by:
    • Using a gadget tree that matches one candidate against a set of others, with leaves labeled by the corresponding pairs of candidates.
    • For each leaf of the gadget, grafting a relabeled copy of the smaller tree, instantiated on a subset of the remaining candidates.
  • In each step, pigeonhole arguments lower-bound the number of candidates that reach the final bracket, and hence the winner's out-degree.
  • Solving the resulting recurrence shows that the performance guarantee is at least $\Theta(\sqrt{N})$ for general $N$ (Iglesias et al., 2012).

This surpasses the earlier $\log N$ lower bound and is currently the best known guarantee for winner out-degree in recursive tournament selection.

3. RTV in Test-Time LLM Coding: Summary-Based Population Selection

In the agentic coding context, each “candidate” is a rollout trajectory consisting of sequences of LLM thoughts, commands, and observations. Raw rollouts are high-dimensional, noisy, and long, which complicates direct ranking. RTV as instantiated by Kim et al. (16 Apr 2026) proceeds as follows:

  • Rollout Summarization: Each rollout is mapped to a structured summary by an LLM summarizer, retaining hypotheses, resolved/unresolved failures, progress, and suggested fixes.
  • Groupwise Comparison: Summaries are partitioned into groups of size $G$. Each group undergoes $V$ independent LLM-based votes to select the “most promising” summary.
  • Recursive Elimination: Winners from each group form the next round’s population; the process recurses until one summary (rollout) remains.
  • Voting Criterion: The LLM is prompted with the problem specification and the group's summaries, and votes for the summary most likely to reach a correct resolution.

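This process is formalized as follows, writing $P_0$ for the initial population of $N$ summaries, $G$ for the group size, and $\mathrm{Vote}(g)$ for the majority outcome of the independent votes within a group $g$:

$$P_{t+1} = \{\mathrm{Vote}(g) : g \in \mathrm{Partition}(P_t, G)\}, \qquad t = 0, 1, \dots$$

The process terminates in $\lceil \log_G N \rceil$ rounds.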
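The elimination loop described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the cited implementation: `vote` is a hypothetical stand-in for a single LLM call, `noisy_vote` is a toy judge that replaces the LLM, and the parameter names `group_size` and `num_votes` are chosen here for readability.

```python
import random
from collections import Counter

def rtv(summaries, vote, group_size=2, num_votes=3):
    """Recursive tournament over summaries; returns the single survivor.

    vote(group) is a hypothetical stand-in for one LLM call: it receives a
    list of summaries and returns the index of the one it prefers.
    """
    if not summaries:
        raise ValueError("need at least one summary")
    if group_size < 2:
        raise ValueError("group_size must be at least 2")
    population = list(summaries)
    while len(population) > 1:
        winners = []
        for start in range(0, len(population), group_size):
            group = population[start:start + group_size]
            if len(group) == 1:  # a leftover singleton advances unopposed
                winners.append(group[0])
                continue
            ballots = Counter(vote(group) for _ in range(num_votes))
            # Ties are broken in favour of the index that was voted for first.
            winners.append(group[ballots.most_common(1)[0][0]])
        population = winners
    return population[0]

RNG = random.Random(0)  # fixed seed so the toy judge is reproducible

def noisy_vote(group, accuracy=0.8):
    """Toy judge: picks the highest-scoring summary with probability `accuracy`."""
    best = max(range(len(group)), key=lambda i: group[i]["score"])
    return best if RNG.random() < accuracy else RNG.randrange(len(group))

rollouts = [{"id": i, "score": s} for i, s in enumerate([0.2, 0.9, 0.4, 0.1, 0.6])]
print(rtv(rollouts, noisy_vote)["id"])
```

In a real pipeline the `score` field would not exist; the judge would see only the problem statement and the structured summaries.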

4. Algorithmic Structure and Complexity

The high-level RTV algorithm is as follows:

  • Input: Population of $N$ candidates (either tournament entries or rollout summaries).
  • Step 1: Partition into groups of size $G$ (last group possibly smaller).
  • Step 2: For each group, conduct $V$ independent votes using a comparison function (pairwise match for voting trees, LLM majority for rollouts).
  • Step 3: Advance group winners to the next round. Repeat until a single winner is selected.

Complexity:

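  • For voting trees: Each internal node invokes one pairwise comparison, so evaluation cost is linear in the number of internal nodes.
  • For LLM-based RTV: Summarization requires $N$ LLM calls, one per rollout; voting uses $V$ calls per group, and the number of groups shrinks by roughly a factor of $G$ each round. Total complexity $O(NV)$.
  • With groups of size $G$, the number of rounds is $\lceil \log_G N \rceil$.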

Empirical settings in coding agents fix the group size and the number of votes per group to obtain robust performance gains (Kim et al., 16 Apr 2026).
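As a rough check on the call counts, the following sketch simulates the rounds and tallies LLM calls. It assumes one summarization call per rollout, one call per vote, and that a leftover group of size one advances without a vote; the function name `rtv_cost` and its parameters are hypothetical.

```python
import math

def rtv_cost(n, group_size, num_votes):
    """Count rounds and LLM calls for a population of n rollouts.

    Assumes one summarization call per rollout, one call per vote, and that
    a leftover group of size one advances without any vote.
    """
    if group_size < 2:
        raise ValueError("group_size must be at least 2")
    rounds, vote_calls, remaining = 0, 0, n
    while remaining > 1:
        full, leftover = divmod(remaining, group_size)
        contested = full + (1 if leftover > 1 else 0)  # groups that need a vote
        vote_calls += contested * num_votes
        remaining = math.ceil(remaining / group_size)  # one winner per group
        rounds += 1
    return {"rounds": rounds, "summary_calls": n, "vote_calls": vote_calls}

print(rtv_cost(16, 2, 3))
# {'rounds': 4, 'summary_calls': 16, 'vote_calls': 45}
```

With 16 rollouts and pairwise groups there are 8 + 4 + 2 + 1 = 15 contested groups, so three votes per group gives 45 voting calls on top of 16 summarization calls.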

5. Extension: Manipulation-Resistant and Arithmetic Voting Trees

Beyond basic winner selection, voting tree constructions exhibit notable expressiveness:

  • Manipulation Resistance: There exist trees, including explicit constructions for power-of-2 sizes, that are robust on any “perfect-manipulator tournament”, a tournament built from a distinguished candidate and classes of candidates linked by prescribed “beats” relations: on such tournaments the tree never elects the designated candidate as winner (Iglesias et al., 2012).
  • Arithmetic Circuits: For $N=3$ candidates, voting trees can implement arithmetic operations mod 3, such as negation, addition, squaring, and multiplication, via wiring of “gates” constructed from smaller voting trees.

This demonstrates that the recursive structure, even though each node performs only a basic pairwise match, enables combinatorially and algebraically rich global computation within the binary tree (Iglesias et al., 2012).
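A small experiment hints at why tree shape carries computational content. The sketch below assumes candidates 0, 1, 2 arranged in a 3-cycle where $i$ beats $i+1 \pmod 3$; this orientation and the helper names are choices made here for illustration, and the gates in the cited work are not reproduced. Under that assumption, a single match between distinct candidates $a$ and $b$ returns $(1 - a - b) \bmod 3$, and two trees with the same leaves in the same order can elect different winners.

```python
def match(a, b, A):
    """Single pairwise match: a advances if it meets itself or defeats b."""
    return a if a == b or A[a][b] == 1 else b

def evaluate(tree, A):
    """Evaluate a voting tree given as nested pairs with int leaves."""
    if isinstance(tree, int):
        return tree
    return match(evaluate(tree[0], A), evaluate(tree[1], A), A)

# Assumed orientation: 0 beats 1, 1 beats 2, 2 beats 0.
cycle = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]

# On distinct inputs, one match computes (1 - a - b) mod 3.
for a in range(3):
    for b in range(3):
        if a != b:
            assert match(a, b, cycle) == (1 - a - b) % 3

# Same leaves, same order, different bracketing: the match is not associative.
left_heavy = ((0, 1), 2)
right_heavy = (0, (1, 2))
print(evaluate(left_heavy, cycle), evaluate(right_heavy, cycle))  # 2 0
```

Because the match is not associative, the choice of bracketing and of repeated leaf labels gives a tree designer real control over the function that is computed.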

6. Empirical Performance and Theoretical Limitations

Empirical results on agentic coding benchmarks (SWE-Bench Verified, Terminal-Bench v2.0) indicate that summary-based RTV improves pass@1 rates across multiple LLMs (Claude-4.5-Opus, Gemini-3.1-Pro, GPT-5-0825) and consistently selects higher-quality rollouts than majority voting or best-of-$N$ (Kim et al., 16 Apr 2026). Gains derive from pruning noisy or overfit rollouts early and from focusing the comparison on high-value hypothesis and diagnostic content through structured summaries.

Limitations:

  • For voting trees, no construction exceeds a $3/4$-fraction of the Copeland (max-out-degree) guarantee; the current lower bound remains at $\Theta(\sqrt{N})$, well below the trivial $N-1$ maximal degree (Iglesias et al., 2012).
  • Summaries, while compact, may lose critical information if summarization quality degrades.
  • The use of LLMs for voting introduces stochasticity and possible bias in selection, though repeated independent votes mitigate this effect.
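The effect of repeated voting can be quantified in a simplified model. The sketch below assumes a pairwise comparison in which each vote independently picks the better summary with probability `p`, and an odd number of votes so that no ties occur; real LLM votes are unlikely to be fully independent, so the numbers are an optimistic illustration. The function name `majority_correct` is hypothetical.

```python
from math import comb

def majority_correct(p, num_votes):
    """Probability that a majority of independent binary votes is correct.

    p is the per-vote probability of choosing the better candidate;
    num_votes must be odd so that a strict majority always exists.
    """
    if num_votes % 2 == 0:
        raise ValueError("num_votes must be odd")
    need = num_votes // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(num_votes, k) * p**k * (1 - p) ** (num_votes - k)
        for k in range(need, num_votes + 1)
    )

for v in (1, 3, 5):
    print(v, round(majority_correct(0.7, v), 4))
# 1 0.7
# 3 0.784
# 5 0.8369
```

Under these assumptions the benefit of extra votes grows slowly, which is the usual trade-off between selection reliability and the number of LLM calls.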

Open questions include closing the performance gap between the $\Theta(\sqrt{N})$ lower bound and the linear-in-$N$ upper bound for voting trees, extending the arithmetic circuit framework to larger algebraic structures, and determining whether randomized or adaptive (rather than deterministic) voting tree architectures achieve stronger guarantees (Iglesias et al., 2012).

7. Significance and Connections

Recursive Tournament Voting unites combinatorial social choice, computational tournament design, and LLM-based evaluation into a flexible, robust paradigm for selection over large, noisy populations. The approach provides:

  • Provable worst-case guarantees in combinatorial settings.
  • Practical, scalable performance improvements for selection from complex agentic outputs, where direct scoring is infeasible.
  • A foundation for further research on manipulating and extending tournament structures, both for aggregation and for implementation of implicit computation.

The recursive structure and reliance on small-group, context-sensitive evaluation underpin RTV’s robustness to noise, manipulation, and context obfuscation. These properties position RTV as a significant tool both in theoretical social choice and in practical LLM-based agent design (Iglesias et al., 2012; Kim et al., 16 Apr 2026).
