AgentRank-UC: Dynamic Agent Ranking

Updated 20 November 2025

AgentRank-UC is a dynamic algorithm that fuses recency-weighted usage and competence metrics to rank autonomous agents in challenging environments.
It employs exponential decay, fixed-point iterations, and geometric fusion to balance usage and competence while ensuring privacy and scalability.
Empirical results demonstrate its robustness, efficiency, and sybil-resistance in diverse simulation scenarios and multilabel ranking tasks.

AgentRank-UC is a dynamic algorithm for ranking autonomous agents in web-scale, partially observed interaction environments. Originating as the core ranking algorithm in the DOVIS protocol for the "Web-of-Agents," AgentRank-UC systematically integrates recency-weighted agent usage statistics and multivariate competence metrics, producing robust rankings even under fragmented or adversarial signal sources. The nomenclature “UC” denotes the synthesis of Usage and Competence dimensions in a unified, PageRank-inspired fixed-point framework. Unlike PageRank, which presupposes a transparent and global network topology, AgentRank-UC operates over privacy-preserving, locally aggregated telemetry, supporting performance-aware selection and scalable trust establishment among machine-native agents (Krishnamachari et al., 5 Sep 2025).

1. Construction of Usage and Competence Metrics

AgentRank-UC is built upon two orthogonal edge-weighted interaction graphs over the directed agent interaction space: usage and competence.

Usage Kernel

For each triple (caller $i$ , callee $j$ , task $k$ ), usage is summarized as a recency-decayed count: $N_{ij}^{(k)} = \sum_{t \in \text{calls}} \omega(t), \quad \omega(t) = e^{-\lambda (T - t)}$

$T$ denotes the current time, $\lambda>0$ is the decay rate (half-life $H = \ln2/\lambda$ ). Aggregation yields: $U_{ij} = \sum_k N_{ij}^{(k)}$

The row-stochastic usage kernel $P$ normalizes each caller row: $P_{ij} = \begin{cases} U_{ij}/\sum_{j'} U_{ij'} & \text{if } \sum_{j'} U_{ij'} > 0 \ v_j & \text{otherwise} \end{cases}$ where $v \in \Delta^{n-1}$ is a strictly positive prior.

Competence Kernel

Competence aggregates signed, decayed outcome signals (successes, quality $q$ , latency $\ell$ , cost $c$ , risk $r$ ): $S_{ij}^{(k)} = \sum \omega(t) \cdot z, \quad z \in \{0,1\}$ With these, compute the Beta-Bernoulli mean success: $\widehat{p}_{ij}^{(k)} = \frac{\alpha_0 + S_{ij}^{(k)}}{\alpha_0 + \beta_0 + N_{ij}^{(k)}}$ and task utility: $u_{ij}^{(k)} = \theta_1 \logit(\widehat{p}_{ij}^{(k)}) - \theta_2 \log(1+\overline{\ell}_{ij}^{(k)}) - \theta_3 \log(1+\overline{c}_{ij}^{(k)}) - \theta_4 \overline{r}_{ij}^{(k)} + \theta_5 \overline{q}_{ij}^{(k)}$ Aggregate transformed utilities: $C_{ij} = \sum_k N_{ij}^{(k)} \cdot \phi(u_{ij}^{(k)}), \quad \phi(x)=\log(1+e^{x})$ Normalize as: $Q_{ij} = \begin{cases} C_{ij}/\sum_{j'}C_{ij'} & \text{if } \sum_{j'}C_{ij'} > 0 \ w_j & \text{otherwise} \end{cases}$ with $w \in \Delta^{n-1}$ a strictly positive competence prior.

2. Fixed-point Formulation and Fusion

Let $x \in \Delta^{n-1}$ be the usage rank vector and $y \in \Delta^{n-1}$ the competence rank vector. Damping parameters $\alpha, \beta \in (0,1)$ ensure contractive dynamics: $x = \alpha P^\top x + (1-\alpha) v, \qquad y = \beta Q^\top y + (1-\beta) w$ After convergence, the outputs are fused geometrically, tuning the usage-competence trade-off via $p \in [0,1]$ : $z_j = (x_j)^p \cdot (y_j)^{1-p}, \quad r = z / \|z\|_1$ The final ranking $r$ is a normalized vector in $\Delta^{n-1}$ . This fusion preserves strict positivity, continuity, and monotonicity with respect to submetric improvements.

3. Protocol Infrastructure: DOVIS

The AgentRank-UC algorithm is operationalized atop DOVIS, a five-layer protocol stack:

Discovery: Indexers aggregate OAT-Lite telemetry and periodically compute/publish $P,Q,x,y,r$ .
Orchestration: Callers emit idempotent, per-epoch usage and performance records for each edge $(i,j,k)$ , processed with exponential decay.
Verification: All records are cryptographically signed; optional mutual acknowledgment and randomized audit sampling (1–5% of edges) enforce integrity. Identity strength (e.g., staked/TEE-attested) confers prior weight in $v$ or $w$ .
Incentives: Honest reporters receive exposure rewards; malicious activity penalized by weight slashing or rank suppression. Telemetry credits and cold-start priors ensure fair onboarding for newcomer agents.
Semantics: Schema normalization (latency in ms, cost in credits, qualities/risks in $[0,1]$ ); versioned task taxonomies and support for backward compatibility.

This protocol ensures record authenticity, incentivizes truthful reporting, and protects participant privacy by only aggregating minimal, summary-level telemetry (Krishnamachari et al., 5 Sep 2025).

4. Theoretical Guarantees

AgentRank-UC possesses several formally stated guarantees:

Guarantee	Statement	Implication
Convergence	Power iterations contract in $\ell_1$ , unique fixed points $x^,y^$ exist	Fast stable computation
Fusion	$r = \text{normalize}\left((x^)^p \odot (y^)^{1-p}\right)$ is well-posed and stable	Continuous, robust to input variations
Monotonicity	Improving any submetric on edge $(i,j,k)$ weakly increases $r_j$	Local utility improvements propagate to agent rank
Cold-start	$v,w>0$ via teleport guarantee $x^_j,y^_j,r_j>0$ for all $j$	Newcomers have nonzero baseline visibility
Stability	$\\|P-P'\\|_1 \leq \epsilon \implies \\|x^-x'^\\|_1 \leq \alpha/(1-\alpha)\epsilon$	Small changes yield bounded output shifts
Sybil-resist.	Mass of collusive set $S$ bounded: $(1-\alpha)v_S \leq x_S \leq \alpha+(1-\alpha)v_S$ ; $r_S < 1$	Sybil/pumping attacks are rate-limited; usage-only is not exploitable for full rank share

All bounds and properties hold for positive priors, stochastic kernels, and strict damping ( $\alpha,\beta<1$ ) (Krishnamachari et al., 5 Sep 2025).

5. Algorithmic Implementation

The canonical AgentRank-UC computation consists of:

Telemetry aggregation: For each $(i, j, k)$ , compute decayed counts $N$ , successes $S$ , means $\bar{q},\bar{\ell},\bar{c},\bar{r}$ .
Success posterior: Compute $\widehat{p}_{ij}^{(k)}$ for each edge and task.
Edge weights: Derive usage ( $U_{ij}$ ) and competence ( $C_{ij}$ ) weights by aggregating across tasks and applying the softplus activation to utilities.
Row normalization: Produce $P$ and $Q$ , inserting priors for zero rows.
Fixed-point iterations: Iterate $x \gets \alpha P^\top x + (1-\alpha)v$ and $y \gets \beta Q^\top y + (1-\beta)w$ to convergence.
Fusion: Geometrically combine $x$ and $y$ with parameter $p$ ; normalize to obtain $r$ .

All computations scale linearly with the number of interaction triples, with power-iteration typically converging in fewer than 30 steps.

6. Empirical Evaluation and Scalability

AgentRank-UC has been validated in simulated agent ecosystems consisting of $n=100$ agents executing $d=3$ archetypal tasks, covering cases such as Popular-but-Mediocre (PbM), Niche-but-Excellent (NbE), Balanced-Strong (BS), Cheap-but-Risky (CbR), Sybil-Clique (Syb), and Newcomer-Good (NcG). Key experimental findings include:

Performance: AgentRank-UC nearly matches an oracle success-rate baseline and outperforms usage-only variants in NDCG@10, Quality@10, Spearman's $\rho$ , and Regret@10.
Trade-off tuning: Adjusting the fusion parameter $p$ smoothly interpolates final ranking behaviors between usage-driven and competence-driven extremes.
Adaptivity: Rankings adjust at rates determined by the decay half-life $H$ . Shorter $H$ accelerates demotion/promotion post-performance shocks.
Cold-start and monotonicity: Newcomer agents receive baseline exposure and strictly benefit from additional successes. Informative priors enable rapid onboarding.
Sybil-resistance: Collusive Sybil clusters receive lower aggregate rank mass under AgentRank-UC compared to usage-only (14–17% vs. 11–12%); Sybil mass declines after burn-in phases.
Scalability: Step time is $O(|E|)$ in the number of populated interaction triples; telemetry and index relays operate with low communication overhead.

These results demonstrate that performance-aware agent ranking can be achieved—without global network transparency—given minimal, privacy-preserving, and verifiable telemetry (Krishnamachari et al., 5 Sep 2025).

7. Relation to Partial Feedback Multilabel Ranking

The term "AgentRank-UC" also denotes a multilabel classification and ranking algorithm in partial feedback regimes, where online optimization is guided by second-order upper-confidence bound (UCB) methods (Gentile et al., 2012). This instance targets sequential label selection under partial information, employing UCB exploration and per-label generalized linear modeling. The algorithm admits an $O(\sqrt{T}\log T)$ cumulative regret bound under adversarial covariates. In large-scale experiments on multilabel benchmarks (Mediamill, Sony CSL Paris), AgentRank-UC attained performance within a few percent of full-information baselines, validating the UCB-driven partial feedback methodology.

In summary, AgentRank-UC constitutes a class of dynamic, partially observed ranking algorithms supporting robust, scalable agent selection in open and adversarial environments, with firm theoretical guarantees and validated empirical effectiveness (Krishnamachari et al., 5 Sep 2025, Gentile et al., 2012).