Explainable artificial intelligence model for identifying Market Value in Professional Soccer Players
Abstract: This study introduces a machine learning method for predicting soccer players' market values that combines ensemble models with SHapley Additive exPlanations (SHAP) for interpretability. Using data on about 12,000 players from Sofifa, the study applies the Boruta algorithm to streamline feature selection. The Gradient Boosting Decision Tree (GBDT) model achieved the best predictive accuracy, with an R-squared of 0.901 and a Root Mean Squared Error (RMSE) of 3,221,632.175. Player attributes in skills, fitness, and cognitive areas significantly influenced market value. These insights can help sports industry stakeholders with player valuation. However, the study has limitations, such as underestimating superstar players' values and needing larger datasets. Future research directions include broadening the model's applicability and exploring value prediction in other contexts.
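To make the interpretability step in the abstract more concrete, the sketch below computes exact Shapley values for a single player under a toy valuation function. This is an illustration only, not the study's method: `toy_value`, its coefficients, the three feature names, and the baseline player are all hypothetical, and the paper's GBDT model and its SHAP tooling are not reproduced here. The enumeration over feature subsets is feasible only because the toy model has three inputs.

```python
from itertools import combinations
from math import factorial

FEATURES = ("skill", "fitness", "cognition")


def toy_value(skill, fitness, cognition):
    """Hypothetical valuation function (assumption, not the paper's model).

    Includes a skill-cognition interaction so the attribution is non-trivial.
    """
    return 2.0 * skill + 1.0 * fitness + 0.5 * skill * cognition


def shapley_values(model, instance, baseline):
    """Exact Shapley values by enumerating every subset of the other features.

    A feature that is "absent" from a coalition takes its baseline value.
    """
    n = len(FEATURES)

    def coalition_value(present):
        # Evaluate the model with present features from the instance
        # and all remaining features from the baseline.
        args = {
            name: instance[name] if name in present else baseline[name]
            for name in FEATURES
        }
        return model(**args)

    phi = {}
    for name in FEATURES:
        others = [f for f in FEATURES if f != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_feature = coalition_value(set(subset) | {name})
                without_feature = coalition_value(set(subset))
                total += weight * (with_feature - without_feature)
        phi[name] = total
    return phi


# Hypothetical player and reference ("average") player.
instance = {"skill": 4.0, "fitness": 3.0, "cognition": 2.0}
baseline = {"skill": 1.0, "fitness": 1.0, "cognition": 1.0}

phi = shapley_values(toy_value, instance, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.3f}")
```

The attributions sum to the gap between the player's predicted value and the baseline prediction, which is the additivity property that lets per-feature contributions be read as shares of a single valuation. For real tree ensembles, exhaustive enumeration is impractical, and tree-specific approximations or exact tree algorithms are normally used instead.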