
Multi-round Multi-expert Consensus

Updated 12 December 2025
  • Multi-round Multi-expert Consensus is a framework that uses iterative rounds of expert deliberation and belief updates to aggregate diverse opinions.
  • It integrates algorithmic, statistical, and procedural methods such as weighted voting, Bayesian aggregation, and RL-based negotiation for robust consensus.
  • Applications span AI alignment, blockchain consensus, and human–AI hybrid systems, enhancing transparency, accuracy, and performance.

Multi-round multi-expert consensus (MRMEC) encompasses algorithmic, statistical, and procedural frameworks in which a group of experts—human or artificial—engages in iterative rounds of deliberation, exchange, and belief updating to produce a joint decision or prediction. This paradigm underlies modern ensemble reasoning systems, expert elicitation methods, and multi-agent collaboration architectures, enabling improved accuracy, robustness, and interpretability in domains ranging from code analysis to autonomous systems and collective policy-making (Hahn et al., 10 Oct 2025, Pokharel et al., 2 Apr 2025, Chen et al., 2023, Carvalho et al., 2012, Samanta et al., 18 Sep 2025, Cho et al., 11 Nov 2024, Bolleddu, 20 Nov 2025, Speed et al., 12 Aug 2025).

1. Fundamental Principles and Formal Models

MRMEC protocols assume the presence of $N$ expert agents (human, LLM, or hybrid), a decision or answer space $V$, and a communication or interaction protocol spanning $R$ rounds. Each agent $i$ maintains an opinion or prediction $o_i^r \in V$ (possibly with confidence $C_i^r$) at round $r$. At each round, agents may observe some combination of:

  • their private information (data, expertise, signals)
  • prior outputs from other agents (opinions, explanations, confidence values)
  • shared evidence scaffolds (summaries, citations, rationales)

Agents update their outputs using decision rules that may depend on the structure of $V$ (categorical, probabilistic, ranked), the nature of the interaction (synchronous, asynchronous, public, private), and the consensus mechanism (majority, weighted, Bayesian aggregation, unanimity).

Key protocol elements include the round structure and stopping criterion, the visibility of peer outputs (public vs. private), each agent's update rule, and the final aggregation mechanism.

These frameworks are robust to class/domain specialization, resource heterogeneity, and, under certain conditions, adversarial or Byzantine nodes (Pokharel et al., 2 Apr 2025, Cho et al., 11 Nov 2024).
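The protocol skeleton above can be sketched as a generic round loop. This is a minimal illustration under stated assumptions: the toy agent behavior and the plurality fallback are placeholders, not any specific published protocol.

```python
from collections import Counter

def run_consensus(agents, evidence, rounds=3, threshold=1.0):
    """Generic multi-round loop: each agent is a callable
    agent(evidence, peer_opinions) -> opinion. Stops early once the
    largest opinion share reaches `threshold` (1.0 = unanimity)."""
    opinions = [None] * len(agents)
    for r in range(rounds):
        # Each agent sees private evidence plus last round's peer outputs.
        opinions = [agent(evidence, list(opinions)) for agent in agents]
        top, count = Counter(opinions).most_common(1)[0]
        if count / len(agents) >= threshold:
            return top, r + 1
    return Counter(opinions).most_common(1)[0][0], rounds  # plurality fallback

def make_agent(private_guess):
    """Toy agent: holds a private guess but defers to a visible peer majority."""
    def agent(evidence, peers):
        votes = Counter(p for p in peers if p is not None)
        if votes and votes.most_common(1)[0][1] > len(peers) // 2:
            return votes.most_common(1)[0][0]
        return private_guess
    return agent

agents = [make_agent(g) for g in ["A", "A", "B"]]
answer, used_rounds = run_consensus(agents, evidence=None, rounds=5)
print(answer, used_rounds)  # "A" after 2 rounds: the majority propagates
```

In round one no peer outputs are visible, so each agent reports its private guess; from round two on, the visible majority propagates and the unanimity check halts the loop.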

2. Architectures and Methodological Taxonomy

MRMEC can be instantiated through diverse system architectures, including:

(a) LLM–Debate and Specialization Architectures:

MEC$^3$O assigns each class $c \in \mathcal{C}$ an expert $E_c$ based on class-wise macro-F1 performance over a held-out set. Experts receive class-specialized instructions and participate in structured debate rounds, updating predictions conditioned on peers' rationales. Final decisions are selected via a weighted consensus score:

$$\text{Score}_x(c) = \sum_{i=1}^{|\mathcal{C}|} \mathbb{I}[p'_i = c] \cdot w_{i,c}$$

where $w_{i,c} = w_{E,i} \cdot w_{\text{conf},i}$, and $w_{E,i}$ gives a bonus for domain-aligned predictions (Hahn et al., 10 Oct 2025).
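The weighted consensus score lends itself to a direct implementation. Only the scoring rule itself comes from the formula above; the expert names, weight values, and the "bug/clean" labels below are illustrative assumptions.

```python
def consensus_score(predictions, expert_weights, conf_weights, label):
    """Score_x(c) = sum_i 1[p_i == c] * w_{E,i} * w_{conf,i}."""
    return sum(we * wc
               for p, we, wc in zip(predictions, expert_weights, conf_weights)
               if p == label)

def decide(predictions, expert_weights, conf_weights, labels):
    """Pick the label with the highest weighted consensus score."""
    return max(labels,
               key=lambda c: consensus_score(predictions, expert_weights,
                                             conf_weights, c))

# Hypothetical setting: three experts classify a code snippet; the second
# expert is domain-aligned for "bug" and so carries an expertise bonus.
preds  = ["clean", "bug", "bug"]
w_E    = [1.0, 1.5, 1.0]   # expertise weights (bonus for domain alignment)
w_conf = [0.9, 0.8, 0.6]   # per-expert confidence weights
print(decide(preds, w_E, w_conf, ["clean", "bug"]))  # "bug" (score 1.8 vs 0.9)
```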

(b) General Consensus Deliberation:

Deliberation-based protocols, e.g., in blockchain settings, use graded consensus and repeated rounds of critique and update. Honest-agent agreement is measured by $A(v) = (1/|H|) \sum_{i \in H} \mathbb{I}[o_i^r = v]$; consensus is reached when the maximum pairwise disagreement $\Delta_{\max}^r = 0$ and the minimum confidence $C_{\min}^r \geq \theta$ (Pokharel et al., 2 Apr 2025).
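The agreement measure and stopping condition can be expressed compactly. A minimal sketch, assuming a dict-based representation of opinions and confidences (an illustration, not the paper's implementation):

```python
def honest_agreement(opinions, honest, v):
    """A(v): fraction of honest agents whose current opinion equals v."""
    return sum(1 for i in honest if opinions[i] == v) / len(honest)

def consensus_reached(opinions, confidences, theta):
    """Consensus: zero maximum pairwise disagreement (all opinions identical)
    and minimum confidence at or above the threshold theta."""
    all_agree = len(set(opinions.values())) == 1   # Delta_max^r == 0
    return all_agree and min(confidences.values()) >= theta

opinions    = {1: "v", 2: "v", 3: "v"}
confidences = {1: 0.9, 2: 0.85, 3: 0.95}
print(honest_agreement(opinions, honest=[1, 2, 3], v="v"))  # 1.0
print(consensus_reached(opinions, confidences, theta=0.8))  # True
```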

(c) Linear Opinion Pooling:

Opinion pooling frameworks iteratively aggregate experts' probability vectors $x_i^{(t)}$ using weights inversely related to their pairwise RMS distances:

$$x_i^{(t)} = \sum_{j=1}^n w_{ij}^{(t)} x_j^{(t-1)}, \quad w_{ij}^{(t)} = \frac{\alpha_i^{(t)}}{\epsilon + D(x_i^{(t-1)}, x_j^{(t-1)})}$$

Consensus is guaranteed as $t \to \infty$ (Carvalho et al., 2012).
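One pooling round, and its convergence, can be checked numerically. In this simplified sketch, $\alpha_i^{(t)}$ is taken as the normalizer that makes each agent's weight row sum to one, and $\epsilon = 0.05$ is an arbitrary illustrative choice.

```python
import math

def rms(p, q):
    """Root-mean-square distance D between two probability vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)) / len(p))

def pool_step(X, eps=0.05):
    """One pooling round: each expert averages all vectors with weights
    inversely proportional to (eps + RMS distance), normalized to sum to 1."""
    new_X = []
    for xi in X:
        raw = [1.0 / (eps + rms(xi, xj)) for xj in X]
        alpha = 1.0 / sum(raw)          # normalizer playing the role of alpha_i
        new_X.append([sum(alpha * r * xj[k] for r, xj in zip(raw, X))
                      for k in range(len(xi))])
    return new_X

# Three experts' probability vectors over two outcomes.
X = [[0.8, 0.2], [0.6, 0.4], [0.1, 0.9]]
for _ in range(200):
    X = pool_step(X)
spread = max(rms(X[0], xj) for xj in X)
print(spread < 1e-9)        # True: all vectors collapse to one consensus
print(round(sum(X[0]), 6))  # 1.0: convex pooling preserves normalization
```

Because every weight is strictly positive, each update is a convex combination of the previous vectors, which is what drives the geometric-rate convergence noted above.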

(d) Confidence-weighted and Bayesian Approaches:

Ensembles such as ReConcile aggregate LLM outputs via confidence-weighted votes after multi-round exchange, employing explicit calibration functions $f(p_i)$ to attenuate over/under-confidence (Chen et al., 2023). Bayesian online learning frameworks dynamically update posteriors over consensus targets using partial feedback, balancing querying cost and accuracy (Showalter et al., 2023).
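A confidence-weighted vote with an explicit calibration function $f$ can be sketched as follows; the damping calibrator is a hypothetical stand-in (real calibrators are fit on held-out data):

```python
def calibrated_vote(answers, confidences, calibrate=lambda p: p):
    """Confidence-weighted vote: argmax_a sum_i f(p_i) * 1[a_i == a]."""
    scores = {}
    for a, p in zip(answers, confidences):
        scores[a] = scores.get(a, 0.0) + calibrate(p)
    return max(scores, key=scores.get)

# Hypothetical calibrator that damps reported confidences toward 0.5,
# attenuating over-confidence.
def damp(p):
    return 0.5 + 0.5 * (p - 0.5)

votes, confs = ["yes", "no", "no"], [0.99, 0.45, 0.45]
print(calibrated_vote(votes, confs))                  # "yes": raw confidence wins
print(calibrated_vote(votes, confs, calibrate=damp))  # "no": damping flips it
```

The example shows why calibration matters: a single over-confident agent outvotes two moderate ones until its reported confidence is attenuated.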

(e) RL/MARL-based Dialogue Systems:

Dialogue Diplomats implements a Hierarchical Consensus Network (HCN) on top of a multi-agent RL loop. The system integrates agent-level LSTM encoders, a GNN-based attention mechanism for inter-agent communication, and a progressive negotiation protocol with context-aware reward shaping. Consensus bonuses depend on outcome overlap and fairness metrics such as the Gini coefficient (Bolleddu, 20 Nov 2025).
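The fairness ingredient of such reward shaping can be illustrated with the Gini coefficient. The `shaped_reward` combination below is an assumption for illustration; only the use of a consensus bonus and a Gini-based fairness term is taken from the description above.

```python
def gini(payoffs):
    """Gini coefficient of a payoff allocation: 0 = perfectly equal."""
    n, total = len(payoffs), sum(payoffs)
    if total == 0:
        return 0.0
    diff_sum = sum(abs(a - b) for a in payoffs for b in payoffs)
    return diff_sum / (2 * n * total)

def shaped_reward(base_reward, payoffs, agreed, bonus=1.0, fairness_coef=0.5):
    """Illustrative shaping: a bonus for reaching agreement, discounted by
    inequality of the negotiated payoffs. Names and coefficients are
    assumptions, not values from the cited system."""
    return base_reward + (bonus if agreed else 0.0) - fairness_coef * gini(payoffs)

print(round(gini([1, 1, 1]), 3))  # 0.0: equal split, no fairness penalty
print(round(gini([3, 0, 0]), 3))  # 0.667: one agent captures everything
print(shaped_reward(0.0, [1, 1, 1], agreed=True))  # 1.0: full consensus bonus
```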

(f) Human–AI Hybrid Delphi:

In HAH-Delphi, small expert panels interact with an AI scaffold that synthesizes literature citations and preliminary ratings; experts iteratively justify, revise, and converge on consensus positions with structured facilitation and reproducible saturation checks (Speed et al., 12 Aug 2025).
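A reproducible saturation check of the kind described can be sketched as follows; the set-of-themes representation, the theme codes, and the two-round window are illustrative assumptions:

```python
def saturated(rounds_of_themes, window=2):
    """Declare thematic saturation once the last `window` rounds contribute
    no reasoning themes beyond those already recorded."""
    seen, new_per_round = set(), []
    for themes in rounds_of_themes:
        fresh = set(themes) - seen
        new_per_round.append(len(fresh))
        seen |= fresh
    return (len(new_per_round) >= window
            and all(n == 0 for n in new_per_round[-window:]))

# Hypothetical theme codes extracted from four deliberation rounds.
rounds = [
    {"dosing", "recovery"},     # round 1: two new themes
    {"dosing", "adherence"},    # round 2: one new theme
    {"recovery", "adherence"},  # round 3: nothing new
    {"dosing"},                 # round 4: nothing new -> saturated
]
print(saturated(rounds))       # True
print(saturated(rounds[:2]))   # False: themes were still emerging
```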

3. Consensus Mechanisms and Aggregation Rules

Common final consensus strategies include:

| Mechanism | Aggregation Rule | Notable Use Case |
|---|---|---|
| Majority/Plurality | $\arg\max_a \sum_i \mathbb{I}[a_i = a]$ | Binary/group decision, debate |
| Weighted Vote | $\arg\max_a \sum_i f(p_i)\,\mathbb{I}[a_i = a]$ | LLM ensembles (ReConcile) |
| Expertise-weighted | Bonus for domain assignment, see $w_{E,i}$ | MEC$^3$O for specialization |
| Linear Pool | Inverse-distance weights, convex aggregation | Probability pooling (Carvalho et al., 2012) |
| Bayesian Posterior | Monte Carlo over multivariate hypergeometric/Dirichlet | Cost-accuracy tradeoff, human annotation (Showalter et al., 2023) |
| Graded Consensus | Set-acceptance threshold $A(v) \geq \theta$ | Deliberative blockchains (Pokharel et al., 2 Apr 2025) |
| Unanimity | All $o_i$ agree, $\Delta_{\max}^r = 0$ | Classical Bayesian voting (Mossel et al., 2010) |

Advanced protocols handle ties, confidence scores, ranked/rated/cumulative ballots (Cho et al., 11 Nov 2024), and explicitly track consensus diversity and convergence.
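Tracking consensus diversity and convergence can be as simple as monitoring the entropy of the opinion distribution across rounds; this is a generic illustration rather than any cited protocol's metric:

```python
import math
from collections import Counter

def vote_entropy(opinions):
    """Shannon entropy of the opinion distribution: 0 when all agents agree,
    log2(k) when k options are equally supported."""
    counts = Counter(opinions)
    n = len(opinions)
    return -sum((c / n) * math.log2(c / n)
                for c in counts.values()) + 0.0  # +0.0 turns -0.0 into 0.0

history = [
    ["A", "B", "C", "B"],  # early round: diverse opinions
    ["A", "B", "B", "B"],  # converging
    ["B", "B", "B", "B"],  # full consensus
]
for r, ops in enumerate(history, 1):
    print(r, round(vote_entropy(ops), 3))  # entropy falls to 0.0 at consensus
```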

4. Convergence, Robustness, and Theoretical Guarantees

Several protocols provide formal convergence and correctness guarantees:

  • Unanimity is reached in finite time with probability one under generic conditions in Bayesian voting models (Mossel et al., 2010).
  • Graded consensus protocols with $N > 3t$ agents tolerate up to $t$ Byzantine agents while guaranteeing consistency, agreement, liveness, and determinism (Pokharel et al., 2 Apr 2025).
  • Linear opinion pools guarantee geometric-rate convergence of probability vectors to the same consensus, under strictly positive inverse-distance weights (Carvalho et al., 2012).
  • RL-based negotiation with monotonic concessions and bounded utilities converges in finite time $T$ with high empirical probability, and fairness/efficiency can be tuned by reward shaping (Bolleddu, 20 Nov 2025).
  • In hybrid protocols, measuring thematic saturation across reasoning categories provides an empirical stopping criterion and ensures depth/completeness of consensus (Speed et al., 12 Aug 2025).

5. Practical Implementations and Empirical Performance

Empirical benchmarks across domains demonstrate that MRMEC systems robustly outperform single-expert or single-round majority approaches. For example:

  • MEC$^3$O achieves roughly 10 percentage points higher accuracy and macro-F1 than open-source and multi-agent baselines, driven by expert specialization, multi-round cross-checking, weighted consensus, and degeneration-of-thought (DoT) mitigation (Hahn et al., 10 Oct 2025).
  • Dialogue Diplomats yields consensus rates of 94.2% (vs. 78.2% for RL baselines), higher social welfare, and substantially reduced communication rounds by hierarchical agent organization and context-aware reward shaping (Bolleddu, 20 Nov 2025).
  • ReConcile improves LLM ensemble accuracy by up to 11.4% over debate/judge baselines, with model diversity a critical component (Chen et al., 2023).
  • Bayesian online consensus estimation yields cost-effective human annotation, dynamically raising the query rate under distribution shift and outperforming random and entropy-based baselines (Showalter et al., 2023).
  • HAH-Delphi's compact panels plus AI scaffolding reproduce 95% of published Delphi results and reach thematic saturation with only six experts, enabling dramatic reductions in panel size and round count (Speed et al., 12 Aug 2025).
  • Multi-agent RL training explicitly aligned to self-consistency consensus signals (MACA) leads to 23–43% gains in reasoning consistency and multi-agent ensemble performance (Samanta et al., 18 Sep 2025).

6. Limitations, Open Challenges, and Domain Adaptation

Addressing degeneration of thought, over-convergence to incorrect majority, hallucination, or adversarial sabotage remains critical in highly automated or adversarial settings. Methods for mitigation include domain-specialized instructions, restricted assent rules, confidence-based weighting, stake-slash incentives, peer-critique reflection, and external fact-checking (Hahn et al., 10 Oct 2025, Pokharel et al., 2 Apr 2025).

Human–AI hybridization—e.g., HAH-Delphi—improves convergence and justification coverage but still depends on expert selection, rigorous facilitation, and manual adjudication for ambiguous or richly conditional items (Speed et al., 12 Aug 2025).

Scalability with large agent panels, communication cost ($O(N^2)$ per round for gossip in decentralized settings), and adaptability to asynchronous/dynamic networks present ongoing scaling challenges, for which LoRA fine-tuning, off-chain dialog storage, and protocol-level efficiency improvements have been proposed (Pokharel et al., 2 Apr 2025).

Lastly, generalizing consensus protocols to settings requiring policy rankings, multi-outcome distributions, or negotiation among agents with non-aligned objectives requires further research in multi-objective social choice, incentive-compatible aggregation, and group-fairness-aware reward shaping (Cho et al., 11 Nov 2024, Bolleddu, 20 Nov 2025).

7. Applications and Future Directions

MRMEC methodologies are central to ensemble LLM reasoning, expert elicitation and Delphi processes, blockchain consensus and governance, and human–AI hybrid decision systems spanning code analysis, autonomous systems, and collective policy-making.

Ongoing research focuses on deeper integration of domain-adaptive expertise allocation, flexible consensus semantics (graded, ranked, conditional), robust defense against adversarial sabotage, interpretable confidence measures, and minimal-round convergence guarantees. The increasing adoption of MRMEC in AI alignment, human-in-the-loop ensemble systems, and high-stakes policy frameworks underscores its importance for reliable, transparent, and adaptive decision sciences.
