Lasso Multinomial Performance Indicators for in-play Basketball Data (2406.09895v2)
Abstract: A typical approach to quantifying each player's contribution in basketball is the plus-minus method. Ratings of this kind are estimated using simple regression models and their regularized variants, with the response variable being either the points scored or the point differences. To capture each player's effect more precisely, detailed possession-based play-by-play data may be used. This is the direction we take in this article: we investigate the performance of regularized adjusted plus-minus (RAPM) indicators estimated by different regularized models whose response is the number of points scored in each possession. To this end, we use possession play-by-play data from all NBA games of the 2021-22 season (322,852 possessions). We first present simple regression model-based indices, starting with ridge regression, which is the standard technique in the relevant literature. We proceed with the lasso approach, which has specific advantages and performs better than ridge regression on selected objective validation criteria. Then, we implement regularized binary and multinomial logistic regression models to obtain more accurate performance indicators, since the response is a discrete variable taking values mainly from zero to three. Our final proposal is an improved RAPM measure based on the expected points of a multinomial logistic regression model, where each player's contribution is weighted by his participation in the team's possessions. The proposed indicator, called weighted expected points (wEPTS), outperforms all other RAPM measures we investigate in this study.
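To make the idea of a weighted expected-points indicator concrete, the following is a minimal sketch and not the authors' implementation. It assumes a multinomial logistic model over the outcomes 0, 1, 2 and 3 points has already been fitted, so each player has one coefficient per outcome. The function names, the coefficient layout, and the specific weighting (player possessions divided by team possessions, multiplied by expected points) are all illustrative assumptions; the abstract states only that contributions are weighted by participation in the team's possessions.

```python
import numpy as np

# Possible points per possession (the abstract says "mainly from zero to three").
POINTS = np.array([0.0, 1.0, 2.0, 3.0])


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=-1, keepdims=True)


def expected_points(intercepts, player_coefs):
    """Expected points per possession for each player.

    intercepts   : shape (4,), hypothetical baseline logit per outcome.
    player_coefs : shape (n_players, 4), hypothetical fitted coefficients.
    """
    probs = softmax(intercepts + player_coefs)  # shape (n_players, 4)
    return probs @ POINTS                       # sum_k k * P(points = k)


def weighted_expected_points(epts, player_poss, team_poss):
    """Scale expected points by participation share (assumed weighting)."""
    share = np.asarray(player_poss, dtype=float) / np.asarray(team_poss, dtype=float)
    return share * epts


# Illustrative inputs only: these are not estimates from the paper.
intercepts = np.zeros(4)
coefs = np.array([
    [0.0, 0.0, 0.0, 0.0],    # player with no estimated effect
    [-0.2, 0.0, 0.1, 0.3],   # player shifting mass toward 2 and 3 points
])
epts = expected_points(intercepts, coefs)
wepts = weighted_expected_points(epts, player_poss=[50, 80], team_poss=[100, 100])
print(epts, wepts)
```

With a lasso penalty, many player coefficients would be exactly zero, so those players would share the same baseline expected points before the participation weighting is applied.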