Performance of Voting-as-Evaluation under missing or uneven data

Determine how Voting-as-Evaluation (VasE) and related social-choice-based evaluation methods perform when agent preference data is missing or unevenly distributed across alternatives and tasks, characterizing their robustness and ranking quality in such sparse or imbalanced evaluation settings.

Background

The paper frames agent evaluation using computational social choice, specifically Voting-as-Evaluation (VasE), which aggregates preferences or outcomes into rankings. While these methods have attractive theoretical properties, agent evaluation datasets often exhibit substantial sparsity and imbalance due to incomplete pairwise comparisons and uneven representation across tasks or agents.
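To make the sparsity concern concrete, here is a minimal sketch of how a voting rule can react to missing comparisons. It assumes each task acts as a voter that ranks only the agents it was run on, and it uses Copeland-style scoring purely as an illustrative voting rule; this is not the paper's procedure, and the function names and toy data are hypothetical.

```python
from collections import defaultdict


def pairwise_wins(votes):
    """Count, for each ordered pair (a, b), how many votes rank a above b.

    Each vote is a list of agents, best first. A vote may omit agents,
    which models a task on which those agents were never evaluated.
    """
    wins = defaultdict(int)
    for ranking in votes:
        for i, a in enumerate(ranking):
            for b in ranking[i + 1:]:
                wins[(a, b)] += 1
    return wins


def copeland_ranking(votes):
    """Rank agents by Copeland score: +1 per pairwise majority win, -1 per loss.

    Pairs that are tied or never compared contribute 0.
    Ties in score are broken alphabetically (an arbitrary choice).
    """
    wins = pairwise_wins(votes)
    agents = sorted({a for ranking in votes for a in ranking})
    score = {a: 0 for a in agents}
    for a in agents:
        for b in agents:
            if a == b:
                continue
            if wins[(a, b)] > wins[(b, a)]:
                score[a] += 1
            elif wins[(a, b)] < wins[(b, a)]:
                score[a] -= 1
    order = sorted(agents, key=lambda a: (-score[a], a))
    return order, score


# Toy data (hypothetical): three tasks, three agents, every agent evaluated everywhere.
full = [["A", "B", "C"], ["A", "B", "C"], ["B", "A", "C"]]

# Same outcomes, but agent A is missing from the first two tasks.
sparse = [["B", "C"], ["B", "C"], ["B", "A", "C"]]

print(copeland_ranking(full))
print(copeland_ranking(sparse))
```

In this toy data the order of A and B flips once A's results are missing from two tasks, even though no observed outcome changed. How often and how severely this happens for a given voting rule is what the open question asks.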

The authors explicitly note uncertainty about the performance of such voting-based aggregation methods under missing or uneven data, motivating their development of Soft Condorcet Optimization (SCO) and empirical comparisons. This open point concerns the reliability and effectiveness of VasE-style approaches when faced with real-world data sparsity typical in multi-agent and multi-task evaluations.

References

However, it is unclear how well these methods perform when the data is missing or unevenly distributed, which often occurs in the agent evaluation setting.

Soft Condorcet Optimization for Ranking of General Agents (2411.00119 - Lanctot et al., 31 Oct 2024) in Section 2.2 Voting as Evaluation of General Agents