
Performance rating in chess, tennis, and other contexts

Published 20 Dec 2023 in econ.TH and cs.MA | (2312.12700v1)

Abstract: In this note, I introduce Estimated Performance Rating (PR$^e$), a novel system for evaluating player performance in sports and games. PR$^e$ addresses a key limitation of the Tournament Performance Rating (TPR) system, which is undefined for zero or perfect scores in a series of games. PR$^e$ is defined as the rating that solves an optimization problem related to scoring probability, making it applicable for any performance level. The main theorem establishes that the PR$^e$ of a player is equivalent to the TPR whenever the latter is defined. I then apply this system to historically significant win-streaks in association football, tennis, and chess. Beyond sports, PR$^e$ has broad applicability in domains where Elo ratings are used, from college rankings to the evaluation of LLMs.


Summary

  • The paper introduces the Estimated Performance Rating (PR), which generalizes the traditional Tournament Performance Rating (TPR) by remaining defined for zero and perfect scores.
  • It formulates PR as an optimization problem based on scoring probabilities and shows that PR equals TPR whenever TPR is defined.
  • Historical examples from chess, tennis, and football illustrate how PR evaluates performance during win or loss streaks.

Abstract Overview

In competitive sports and games, player performance is commonly assessed with rating systems such as Elo, which is used in chess, tennis, and association football, and in fields outside sports such as college rankings and AI evaluation. This paper presents the Estimated Performance Rating (PR), which addresses a limitation of the existing Tournament Performance Rating (TPR) by being defined for every possible score, including zero and perfect scores.

Development of the PR System

The PR is defined by an optimization problem involving scoring probabilities and yields a performance rating for any level of play. The main theorem shows that the PR equals the TPR whenever the latter is well-defined, that is, in every case other than a zero or perfect score. The paper applies PR to historically notable win streaks in tennis, association football, and chess. The findings suggest that PR provides an interpretation distinct from TPR and FIDE's Performance Rating (FPR) by emphasizing the maximum probability that a player scores a given number of points in a series of games.
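To make the well-defined case concrete, here is a minimal sketch of how a TPR could be computed numerically. It assumes the standard logistic Elo expected-score formula with a 400-point scale and takes TPR to be the rating at which the summed expected score equals the actual score; neither detail is spelled out in this summary. The function names, search bounds, and tolerance are hypothetical choices for illustration and are not taken from the paper's own code. The sketch does not attempt PR itself, because the exact optimization problem is not stated here.

```python
from typing import Optional, Sequence


def expected_score(rating: float, opponent: float) -> float:
    """Standard logistic Elo expected score (assumed, 400-point scale)."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def tournament_performance_rating(
    opponents: Sequence[float],
    score: float,
    lo: float = -4000.0,   # hypothetical search bounds, wide enough for
    hi: float = 8000.0,    # any realistic rating pool
    tol: float = 1e-6,
) -> Optional[float]:
    """Rating R at which the summed expected score equals `score`.

    Returns None for an empty series and for zero or perfect scores,
    where no finite R satisfies the equation.
    """
    n = len(opponents)
    if n == 0 or score <= 0 or score >= n:
        return None

    def surplus(r: float) -> float:
        # Increasing in r: negative below the solution, positive above it.
        return sum(expected_score(r, o) for o in opponents) - score

    # Plain bisection on the monotone surplus function.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if surplus(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    field = [2500.0, 2500.0, 2500.0, 2500.0]
    print(tournament_performance_rating(field, 2.0))  # even score
    print(tournament_performance_rating(field, 3.0))  # plus score
    print(tournament_performance_rating(field, 4.0))  # perfect score -> None
```

Under these assumptions the function returns `None` in exactly the cases the text calls not well-defined, which is the gap PR is meant to fill.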

Illustrations and Applications

The paper includes examples and tables detailing how PR performs in different contexts compared to TPR and FPR. To illustrate, scenarios are examined where a chess player's TPR is undefined due to a perfect score series, and the equivalent PR is calculated instead. The study also applies this rating system to real-world scenarios, including tennis Grand Slam events and FIFA World Cup teams with perfect scores.
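A short derivation shows why a perfect (or zero) score leaves TPR undefined. It assumes the standard logistic Elo expected score and takes TPR to be the rating that equates expected and actual total score. The symbols $R$ (candidate rating), $R_i$ (rating of opponent $i$), $n$ (number of games), and $s$ (points scored) are notation introduced here for illustration, not taken from the paper.

```latex
\[
  E(R, R_i) = \frac{1}{1 + 10^{(R_i - R)/400}},
  \qquad 0 < E(R, R_i) < 1 \quad \text{for every finite } R .
\]
\[
  \text{TPR solves} \quad \sum_{i=1}^{n} E(R, R_i) = s .
\]
\[
  0 < \sum_{i=1}^{n} E(R, R_i) < n
  \quad \Longrightarrow \quad
  \text{no finite } R \text{ exists when } s = 0 \text{ or } s = n .
\]
```

For any score strictly between $0$ and $n$, the left-hand side is continuous and strictly increasing in $R$, so the solution exists and is unique.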

Implementation and Considerations

The paper provides implementation details and code for PR calculations, which supports transparency and reproducibility. It concludes by reiterating the limitation of TPR and arguing for a rating such as PR that remains defined where TPR is not. Because PR takes the length of a win or loss streak into account, it may be applicable well beyond competitive sports.
