Decentralised Preference Discovery

Updated 27 December 2025

Decentralised Preference Discovery is a framework for eliciting and combining private agent preferences in systems without a trusted central coordinator.
The Snowveil protocol employs gossip-based aggregation with the Constrained Hybrid Borda rule, achieving consensus in O(n) steps under asynchronous, censorship-resistant communication.
Mood-based decentralized matching uses trial-and-error learning to reach proposer-optimal stable matchings with only local payoff feedback in two-sided markets.

Decentralised Preference Discovery (DPD) concerns the elicitation, aggregation, and collective resolution of agent preferences in settings lacking a trusted coordinator, full information, or synchronous communication. DPD characterizes a class of problems and algorithms where agents must discover and combine subjective preference data—over candidates, alternatives, or potential partners—amid strict decentralization, censorship resistance, and only local or sampled interactions. Recent advances formalize this paradigm for both large-scale social choice and matching markets, emphasizing convergence to stable collective outcomes despite adversarial communication, information sparsity, and unknown preferences (Kotsialou, 20 Dec 2025, Shah et al., 7 Sep 2024).

1. Formal Foundations of Decentralised Preference Discovery

In DPD, a finite set of $n$ agents each privately holds a strict ordering over a candidate set $\mathcal{P} = \{p_1, \dots, p_m\}$, denoted $R_i$ for agent $v_i$; these orderings are not globally accessible. The DPD objective is to design purely peer-to-peer protocols so that, from arbitrary initial states and in the presence of (i) censorship-resistant communication (no global authority can block or alter messages), (ii) partial information (agents access only their own $R_i$ and peer messages), and (iii) asynchronous message delivery (no global clock), the system almost surely converges to a collectively agreed social outcome, such as a full ranking or designated winner (Kotsialou, 20 Dec 2025).

DPD subsumes decentralized stable matching in two-sided markets: proposers and acceptors ($P, A$) must form stable matches while learning their own preference orderings, without central oversight or direct communication. For such markets, DPD extends to learning the preferred stable matching, namely the proposer-optimal one, from local payoff feedback alone (Shah et al., 7 Sep 2024).

2. Protocols and Algorithmic Mechanisms

2.1 Gossip-based Aggregation: The Snowveil Framework

The Snowveil protocol achieves DPD for large-scale electorates by iterated, gossip-style sampling and local candidate locking (Kotsialou, 20 Dec 2025). Each voter $v_i$ maintains a state $s_i(t) \in \mathcal{P} \cup \{\bot\}$, where $\bot$ denotes unlocked. Unlocked voters repeatedly:

Sample $k$ random peers to collect their preferences.
Aggregate sampled rankings via the Constrained Hybrid Borda (CHB) rule.
Lock to a candidate based on local sample aggregation, using multistage thresholds to boost lock reliability.

Global consensus emerges when a candidate accrues enough locks, i.e., when $N_j(t) \geq \lceil Q n \rceil$ for some $p_j$ and a quorum fraction $Q > 1/2$, where $N_j(t)$ is the number of voters locked to $p_j$ at time $t$. The protocol is decentralized, asynchronous, and resists censorship or message distortion.
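The sample, aggregate, and lock loop and the quorum test can be sketched as follows. This is a minimal illustration, not the Snowveil implementation: the function names (`update_voter`, `quorum_winner`, `run`), the single-stage lock (the multistage thresholds are omitted), the choice to sample peers without replacement and excluding oneself, and the `plurality` placeholder aggregator are all assumptions made for the example.

```python
import random
from collections import Counter
from math import ceil

UNLOCKED = None  # stands in for the unlocked state (bottom symbol)

def plurality(sample):
    """Placeholder aggregator (assumption): most frequent first choice, ties by name."""
    counts = Counter(r[0] for r in sample)
    return min(counts, key=lambda c: (-counts[c], c))

def update_voter(i, states, rankings, k, aggregate, rng):
    """One gossip step: an unlocked voter samples k peers, aggregates, and locks."""
    if states[i] is not UNLOCKED:
        return
    peers = rng.sample([j for j in range(len(states)) if j != i], k)
    # Simplification: lock immediately on one sample (no multistage thresholds).
    states[i] = aggregate([rankings[j] for j in peers])

def quorum_winner(states, Q):
    """Return the candidate with at least ceil(Q * n) locks, or None."""
    need = ceil(Q * len(states))
    counts = Counter(s for s in states if s is not UNLOCKED)
    for cand, locks in counts.items():
        if locks >= need:
            return cand
    return None

def run(rankings, k, Q, aggregate=plurality, seed=0, max_steps=10_000):
    """Update one randomly chosen unlocked voter per step until quorum."""
    rng = random.Random(seed)
    states = [UNLOCKED] * len(rankings)
    for _ in range(max_steps):
        winner = quorum_winner(states, Q)
        if winner is not None:
            return winner, states
        unlocked = [i for i, s in enumerate(states) if s is UNLOCKED]
        if not unlocked:
            return None, states  # everyone locked, but no candidate at quorum
        update_voter(rng.choice(unlocked), states, rankings, k, aggregate, rng)
    return None, states
```

Because $Q > 1/2$, at most one candidate can satisfy the quorum test at a time, so returning the first match in `quorum_winner` is unambiguous. Any aggregation function with the same signature as `plurality` can be passed as `aggregate`.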

2.2 Mood-Based Trial-and-Error in Decentralized Matching

In decentralized matching, each proposer $P_i$ iteratively selects an action (proposing to a partner or remaining single) based on a mood state $m_i \in \{C, D, W\}$ representing contentment, discontent, or watchfulness (Shah et al., 7 Sep 2024). The learning rule is entirely uncoupled: proposals depend only on $P_i$'s own action and payoff history, not on partner preferences, communication, or global state. The randomized action-selection policy and its state transitions are designed so that the cohort of agents probabilistically converges to the unique proposer-optimal stable matching (POSM).

3. Aggregation Rules: The Constrained Hybrid Borda (CHB)

CHB is a modular aggregation rule for DPD gossip samples, parameterized by $(\alpha, \beta, \lambda) \in [0,1]^3$. For a $k$-sample of rankings:

Compute Borda scores $B(p_j)$ and first-place counts $t_j$ for each candidate $p_j$.
If the leading Borda scorer $p_B$ has $t_B \geq \lceil \alpha k \rceil$, select $p_B$.
Otherwise, restrict to candidates with strong first-place and Borda support, then elect the candidate maximizing a hybrid of normalized Borda and plurality: $H(p_j) = (1-\lambda) \frac{B(p_j)}{k(m-1)} + \lambda \frac{t_j}{k}$.
If none meet these constraints, select the pure Borda winner.
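The four steps above can be sketched in Python. The text does not spell out what "strong first-place and Borda support" means, so the filter below is an assumption: a candidate qualifies if $t_j \geq \lceil \beta k \rceil$ and $B(p_j) \geq \beta \, B(p_B)$. The function name `chb_winner`, the default parameter values, and tie-breaking by position in `candidates` are likewise illustrative choices.

```python
from math import ceil

def chb_winner(rankings, candidates, alpha=0.5, beta=0.25, lam=0.5):
    """rankings: list of strict orderings (best first) over `candidates`."""
    k, m = len(rankings), len(candidates)
    borda = {c: 0 for c in candidates}
    firsts = {c: 0 for c in candidates}
    for r in rankings:
        firsts[r[0]] += 1
        for pos, c in enumerate(r):
            borda[c] += m - 1 - pos  # top place earns m-1 points, last earns 0

    # Deterministic tie-breaking (assumption): earlier in `candidates` wins ties.
    order = {c: i for i, c in enumerate(candidates)}
    p_b = max(candidates, key=lambda c: (borda[c], -order[c]))

    # Step 2: Borda leader with enough first-place votes wins outright.
    if firsts[p_b] >= ceil(alpha * k):
        return p_b

    # Step 3 (assumed filter): keep candidates with enough first places
    # and a Borda score within a beta fraction of the leader's.
    eligible = [c for c in candidates
                if firsts[c] >= ceil(beta * k) and borda[c] >= beta * borda[p_b]]
    if eligible:
        def hybrid(c):
            return (1 - lam) * borda[c] / (k * (m - 1)) + lam * firsts[c] / k
        return max(eligible, key=lambda c: (hybrid(c), -order[c]))

    # Step 4: nobody passes the filter, fall back to the pure Borda winner.
    return p_b
```

For example, with rankings `a>b>c` twice, `c>b>a` twice, and `b>a>c` once, candidate `b` leads on Borda (6 points) but has only one first place, so the hybrid step decides between `a` and `c` and selects `a` under the default parameters.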

CHB ensures positive responsiveness, determinism, and polynomial-time computability. Its fine-grained responsiveness property implies that even minor shifts in Borda score can decisively influence sample outcomes.

4. Theoretical Guarantees: Convergence and Stability

4.1 Snowveil Convergence Analysis

The Snowveil protocol's global state evolves as a finite Markov chain with a bounded, strictly increasing potential function $\Phi(S_t) = \sum_{j=1}^m N_j(t)^2$ . Analysis via submartingale theory establishes that:

Every LOCK increases $\Phi$ by at least 1.
Non-absorbing states (no candidate at quorum) are transient, so almost-sure convergence to quorum occurs in finite time.
Under reasonable parameter choices, expected convergence is $O(n)$ steps, independent of $m$ and robust to choice of $k$ (sample size).
Disturbances such as delayed or out-of-sync update events, censored communication, or adversarial tie-breaking do not prevent consensus provided the aggregation rule satisfies Snowveil’s axiomatic prerequisites (Kotsialou, 20 Dec 2025).
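The first claim follows directly from the definition of $\Phi$. As a worked check, assume a LOCK moves exactly one previously unlocked voter onto some candidate $p_j$ and leaves every other count unchanged (an assumption about the LOCK event made here for illustration). Then

$$\Phi(S_{t+1}) - \Phi(S_t) = \bigl(N_j(t) + 1\bigr)^2 - N_j(t)^2 = 2 N_j(t) + 1 \geq 1.$$

Since $\sum_{j} N_j(t) \leq n$, the potential is also bounded above by $n^2$, which is consistent with the description of $\Phi$ as bounded.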

4.2 Stable Matching with Unknown Preferences

The mood-based trial-and-error learning rule forms a regular perturbed Markov process over the system's state space. Applying Young’s stochastic potential method identifies the POSM as the unique stochastically stable state. As the exploration parameter $\epsilon \to 0$, the fraction of time the process spends in the POSM approaches one. Unlike the classic Gale–Shapley deferred acceptance algorithm (which requires full knowledge of all preferences), the decentralized rule achieves both stability and proposer-optimality using only local reward feedback, at the cost of slower (polynomial-in-agents) convergence (Shah et al., 7 Sep 2024).
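For reference, the centralized baseline mentioned above can be sketched in a few lines. This is standard proposer-proposing deferred acceptance, not the mood-based rule; it assumes equal numbers of proposers and acceptors with complete preference lists, and the names `deferred_acceptance` and `is_stable` are illustrative. Its output is the POSM that the decentralized rule is shown to reach without access to the preference lists.

```python
def deferred_acceptance(proposer_prefs, acceptor_prefs):
    """Proposer-proposing Gale-Shapley; returns {proposer: acceptor}."""
    # rank[a][p] = position of proposer p in acceptor a's list (lower is better)
    rank = {a: {p: i for i, p in enumerate(prefs)}
            for a, prefs in acceptor_prefs.items()}
    next_idx = {p: 0 for p in proposer_prefs}  # next acceptor each proposer tries
    engaged = {}                               # acceptor -> current proposer
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        a = proposer_prefs[p][next_idx[p]]
        next_idx[p] += 1
        current = engaged.get(a)
        if current is None:
            engaged[a] = p
        elif rank[a][p] < rank[a][current]:
            engaged[a] = p        # a trades up; the old partner is free again
            free.append(current)
        else:
            free.append(p)        # a rejects p
    return {p: a for a, p in engaged.items()}

def is_stable(matching, proposer_prefs, acceptor_prefs):
    """True if no proposer-acceptor pair prefers each other to their partners."""
    partner_of = {a: p for p, a in matching.items()}
    for p, prefs in proposer_prefs.items():
        for a in prefs:
            if a == matching[p]:
                break  # remaining acceptors are worse than p's partner
            a_prefs = acceptor_prefs[a]
            if a_prefs.index(p) < a_prefs.index(partner_of[a]):
                return False  # (p, a) is a blocking pair
    return True
```

A small usage example: with proposers `P1` and `P2` both ranking `A1` first, and `A1` preferring `P2`, the call `deferred_acceptance({"P1": ["A1", "A2"], "P2": ["A1", "A2"]}, {"A1": ["P2", "P1"], "A2": ["P1", "P2"]})` matches `P2` with `A1` and `P1` with `A2`, and `is_stable` confirms there is no blocking pair.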

5. Empirical Validation and Scalability

Extensive discrete-event simulations validate the protocol-level and theoretical claims for both social choice and matching:

In Snowveil, synthetic electorates up to $n=10^4$ demonstrate empirical linear-time convergence across both randomly distributed and polarized preferences. Key metrics include the time (number of UpdateVoter calls) to reach a stable majority “lock” and the proportion of instances selecting the true sample-plurality or CHB winner. Increasing the number of robustness rounds ($\gamma$) improves decision quality and sometimes, paradoxically, accelerates global convergence (the “cautious voter paradox”).
In decentralized matching, in settings with $n=3$, $m=3$, each run 1,000 times for $T=10^5$ steps, the frequency of stable matchings quickly plateaus, after which the system increasingly shifts to the POSM until nearly all epochs are spent at the global optimum (Kotsialou, 20 Dec 2025, Shah et al., 7 Sep 2024).

6. Applicability, Extensions, and Limitations

DPD protocols are applicable to any context where central aggregation is infeasible or undesirable—large-scale online elections, crowdsourcing, DAO governance, resilient market platforms, and multi-agent consensus under adversarial communication. Their modular design allows substitution of other aggregation rules provided they fulfill determinism and positive responsiveness; settings with richer constraints (e.g., many-to-one matching, tie-rich environments) are accessible via modifications in state transitions and sampling logic.

Limitations and open directions include:

Convergence time is driven by protocol-parameter tuning, notably $\epsilon$ in trial-and-error and sample/robustness parameters in Snowveil. Adaptive parameter scheduling and bandit strategies may improve practical performance.
Robustness to partial payoff observability, noisy utility signals, or collusive manipulation necessitates further study.
Strategic misreporting in social choice or adversarial payoff manipulation in matching settings challenges incentive compatibility.

A plausible implication is that as distributed systems scale in critical domains, decentralized preference discovery frameworks provide an extensible theoretical and empirical foundation for efficient, robust, and trustless global consensus.

7. Comparative Summary

Protocol/Class	Domain	Information Model	Key Guarantee
Snowveil (Kotsialou, 20 Dec 2025)	Social choice (voting)	Private rankings, sample-based gossip	$O(n)$ almost-sure consensus, modular aggregation
Mood-based DPD (Shah et al., 7 Sep 2024)	Two-sided matching markets	Unknown preferences, reward feedback	Probabilistic POSM convergence, no communication

DPD unifies a broad class of decentralized collective decision frameworks, bridging computational social choice, matching theory, and distributed systems with provable guarantees. Recent work demonstrates robust protocols—gossip-driven aggregation and uncoupled adaptive search—that attain strong notions of consensus or optimal stability in fully decentralized, information-constrained environments.

References

Kotsialou (2025). Snowveil: A Framework for Decentralised Preference Discovery.

Shah et al. (2024). Learning Optimal Stable Matches in Decentralized Markets with Unknown Preferences.
