Inference of rankings planted in random tournaments (2407.16597v1)

Published 23 Jul 2024 in math.ST, cs.CC, cs.DS, math.CO, math.PR, and stat.TH

Abstract: We consider the problem of inferring an unknown ranking of $n$ items from a random tournament on $n$ vertices whose edge directions are correlated with the ranking. We establish, in terms of the strength of these correlations, the computational and statistical thresholds for detection (deciding whether an observed tournament is purely random or drawn correlated with a hidden ranking) and recovery (estimating the hidden ranking with small error in Spearman's footrule or Kendall's tau metric on permutations). Notably, we find that this problem provides a new instance of a detection-recovery gap: solving the detection problem requires much weaker correlations than solving the recovery problem. In establishing these thresholds, we also identify simple algorithms for detection (thresholding a degree 2 polynomial) and recovery (outputting a ranking by the number of "wins" of a tournament vertex, i.e., the out-degree) that achieve optimal performance up to constants in the correlation strength. For detection, we find that the above low-degree polynomial algorithm is superior to a natural spectral algorithm. We also find that, whenever it is possible to achieve strong recovery (i.e., to estimate with vanishing error in the above metrics) of the hidden ranking, then the above "Ranking By Wins" algorithm not only does so, but also outputs a close approximation of the maximum likelihood estimator, a task that is NP-hard in the worst case.
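The "Ranking By Wins" recovery algorithm named in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. The abstract does not spell out the planted model, so the sampler below assumes that the better-ranked item wins each match independently with probability 1/2 + gamma. That model, the parameter name `gamma`, and all function names are hypothetical choices made for this example.

```python
import random
from itertools import combinations


def sample_planted_tournament(n, gamma, seed=0):
    """Sample a hidden ranking and a tournament correlated with it.

    ASSUMPTION (not taken from the abstract): the better-ranked item wins
    each match independently with probability 1/2 + gamma.
    """
    rng = random.Random(seed)
    order = list(range(n))
    rng.shuffle(order)  # order[0] is the top-ranked item
    position = {item: p for p, item in enumerate(order)}
    beats = [[False] * n for _ in range(n)]  # beats[i][j]: i wins against j
    for i, j in combinations(range(n), 2):
        better, worse = (i, j) if position[i] < position[j] else (j, i)
        if rng.random() < 0.5 + gamma:
            beats[better][worse] = True
        else:
            beats[worse][better] = True
    return order, beats


def rank_by_wins(beats):
    """Order vertices by out-degree (number of wins), most wins first.

    Ties are broken by vertex index, an arbitrary choice for this sketch.
    """
    wins = [sum(row) for row in beats]
    return sorted(range(len(beats)), key=lambda v: (-wins[v], v))


def kendall_tau_distance(order_a, order_b):
    """Count the pairs of items that the two orderings rank in opposite order."""
    pos_a = {item: p for p, item in enumerate(order_a)}
    pos_b = {item: p for p, item in enumerate(order_b)}
    return sum(
        1
        for i, j in combinations(order_a, 2)
        if (pos_a[i] - pos_a[j]) * (pos_b[i] - pos_b[j]) < 0
    )


if __name__ == "__main__":
    hidden, tournament = sample_planted_tournament(n=200, gamma=0.2, seed=1)
    estimate = rank_by_wins(tournament)
    print("Kendall tau distance:", kendall_tau_distance(hidden, estimate))
```

Under this assumed model, setting `gamma = 0.5` makes the tournament fully consistent with the hidden ranking, so sorting by wins returns it exactly. Smaller values of `gamma` give noisier tournaments, and the Kendall tau distance printed above gives a rough sense of the recovery error.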
