Optimal Pairwise Comparison Procedures for Subjective Evaluation (2508.17840v1)
Abstract: Audio signal processing algorithms are frequently assessed through subjective listening tests in which participants directly score degraded signals on a unidimensional numerical scale. However, this approach is susceptible to inconsistencies in scale calibration between assessors. Pairwise comparisons between degraded signals offer a more intuitive alternative, eliciting the relative scores of candidate signals with lower measurement error and reduced participant fatigue. Yet, due to the quadratic growth of the number of necessary comparisons, a complete set of pairwise comparisons becomes unfeasible for large datasets. This paper compares pairwise comparison procedures to identify the most efficient methods for approximating true quality scores with minimal comparisons. A novel sampling procedure is proposed and benchmarked against state-of-the-art methods on simulated datasets. Bayesian sampling produces the most robust score estimates among previously established methods, while the proposed procedure consistently converges fastest on the underlying ranking with comparable score accuracy.
Collections
Sign up for free to add this paper to one or more collections.
Paper Prompts
Sign up for free to create and run paper prompts using GPT-5.