Data Science Approach to predict the winning Fantasy Cricket Team Dream 11 Fantasy Sports (2209.06999v1)

Published 15 Sep 2022 in cs.LG

Abstract: The evolution of digital technology and the increasing popularity of sports inspired the innovators to take the experience of users with a proclivity towards sports to a whole new different level, by introducing Fantasy Sports Platforms FSPs. The application of Data Science and Analytics is Ubiquitous in the Modern World. Data Science and Analytics open doors to gain a deeper understanding and help in the decision making process. We firmly believed that we could adopt Data Science to predict the winning fantasy cricket team on the FSP, Dream 11. We built a predictive model that predicts the performance of players in a prospective game. We used a combination of Greedy and Knapsack Algorithms to prescribe the combination of 11 players to create a fantasy cricket team that has the most significant statistical odds of finishing as the strongest team thereby giving us a higher chance of winning the pot of bets on the Dream 11 FSP. We used PyCaret Python Library to help us understand and adopt the best Regressor Algorithm for our problem statement to make precise predictions. Further, we used Plotly Python Library to give us visual insights into the team, and players performances by accounting for the statistical, and subjective factors of a prospective game. The interactive plots help us to bolster the recommendations of our predictive model. You either win big, win small, or lose your bet based on the performance of the players selected for your fantasy team in the prospective game, and our model increases the probability of you winning big.

Citations (1)

Summary

  • The paper explores using data science and regression-based machine learning to predict player performance and select optimal fantasy cricket teams for platforms like Dream11.
  • It identifies the Extra Trees Regressor model as highly effective for predicting player Dream11 scores, achieving R² scores of 0.99 for batsmen and 0.97 for bowlers.
  • The research proposes a team selection strategy combining Knapsack and Greedy Algorithms to maximize predicted scores while adhering to fantasy platform constraints like credit limits.
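For readers unfamiliar with the R² figures quoted above, the following is a minimal sketch of how the coefficient of determination is computed: one minus the ratio of the residual sum of squares to the total sum of squares. The function name `r2_score` and the sample values are illustrative assumptions, not data from the paper.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical fantasy scores, invented purely for illustration.
actual = [40.0, 55.0, 12.0, 78.0, 33.0]
predicted = [42.0, 53.0, 15.0, 75.0, 35.0]
score = r2_score(actual, predicted)
```

A value of 1.0 means the predictions match the observed scores exactly, while 0.0 means the model does no better than always predicting the mean.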

The paper explores the use of data science and predictive analytics for selecting optimal fantasy cricket teams on the Dream-11 platform. It employs machine learning (ML), specifically regression, to predict player performance across One-Day International (ODI), Indian Premier League (IPL), and T20 matches. The resulting analytical model draws on historical player performance, game conditions, and other relevant metrics to maximize the statistical probability of winning in fantasy sports.
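As one illustration of how historical performance can be turned into model inputs, the sketch below derives a cumulative strike rate (runs per 100 balls faced) and a moving average of runs from a batter's innings history. The function name, the window size, and the choice to include the current innings in each feature are assumptions made for this example; the summary does not state how the authors computed these features.

```python
from collections import deque


def rolling_batting_features(innings, window=3):
    """Compute per-innings features from (runs, balls_faced) tuples.

    Assumption for this sketch: each feature includes the current innings.
    """
    total_runs = 0
    total_balls = 0
    recent = deque(maxlen=window)  # keeps only the last `window` run totals
    rows = []
    for runs, balls in innings:
        total_runs += runs
        total_balls += balls
        recent.append(runs)
        # Strike rate is runs scored per 100 balls faced.
        strike_rate = 100.0 * total_runs / total_balls if total_balls else 0.0
        rows.append({
            "runs": runs,
            "cumulative_strike_rate": strike_rate,
            "moving_avg_runs": sum(recent) / len(recent),
        })
    return rows


# Hypothetical innings in chronological order: (runs, balls faced).
innings = [(30, 20), (10, 15), (50, 25), (0, 0)]
features = rolling_batting_features(innings, window=2)
```

When such features are used to predict a future match, they would normally be computed from earlier innings only, so that the target innings does not leak into its own inputs.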

Key Aspects of the Methodology:

  1. Data Collection and Preparation:
    • The authors utilize cricket match data from Cricsheet.org, consisting of 3100 YAML files across three formats (ODI, IPL, T20). They convert these files to CSV format for analysis.
    • Features such as runs, strike rates, wickets, and economy rates are extracted and transformed into a format suitable for ML models.
    • Feature engineering is performed to generate additional insights like cumulative strike rate, moving averages, and Dream-11 scores for players.
  2. Machine Learning Models:
    • The paper compares the classification models used in prior work with regression models for predicting player performance.
    • The PyCaret library, an automated ML tool, is used to find the best regressor. It selects the Extra Trees Regressor (ETR), which predicts Dream-11 scores with an R² of 0.99 for batsmen and 0.97 for bowlers.
    • The authors checked the selected model for overfitting and report that it remains robust despite the high R² values.
  3. Predictive Analytics:
    • Inputs for the predictive model include aspects such as player name, match format, team details, and venue.
    • For a given set of user inputs, the system fetches matching historical data and transforms it into matrices, which are fed into the ETR model to generate performance predictions.
  4. Team Selection Strategy:
    • To select a fantasy team, the authors deploy a combination of the Knapsack and Greedy Algorithms to comply with Dream-11 constraints (e.g., credit limits, selection limits per team).
    • The goal is to maximize expected Dream-11 scores while adhering to the credit budget cap.
  5. Visualization and Data Insights:
    • The paper uses Plotly to generate interactive plots that give visual insight into player and team performance and help identify strengths and weaknesses.
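The team selection step can be sketched as a 0/1 knapsack dynamic program with an added team-size constraint: choose exactly `team_size` players whose total cost stays within the credit budget while maximizing the summed predicted score. This is an illustrative sketch only. The function name, the integer credit units, and the player data are hypothetical, and the summary does not detail how the authors combine the greedy step with the knapsack step or how they enforce per-team and per-role limits.

```python
def select_team(players, team_size, budget):
    """Pick exactly `team_size` players within `budget`, maximizing score.

    players: list of (name, cost, predicted_score); costs are non-negative
    integers (e.g. credits expressed in whole units for this sketch).
    Returns (total_score, names), or None if no valid team exists.
    """
    # best[k][c] holds the best (score, names) for k players costing exactly c.
    best = [[None] * (budget + 1) for _ in range(team_size + 1)]
    best[0][0] = (0.0, [])
    for name, cost, score in players:
        # Iterate k downward so each player is added at most once.
        for k in range(team_size - 1, -1, -1):
            for c in range(budget - cost, -1, -1):
                state = best[k][c]
                if state is None:
                    continue
                candidate = (state[0] + score, state[1] + [name])
                target = best[k + 1][c + cost]
                if target is None or candidate[0] > target[0]:
                    best[k + 1][c + cost] = candidate
    feasible = [s for s in best[team_size] if s is not None]
    return max(feasible, key=lambda s: s[0]) if feasible else None


# Hypothetical players: (name, credit cost, predicted fantasy score).
players = [
    ("A", 10, 50.0),
    ("B", 9, 45.0),
    ("C", 8, 30.0),
    ("D", 7, 28.0),
    ("E", 6, 20.0),
]
result = select_team(players, team_size=3, budget=24)
```

With these made-up numbers, picking the highest-scoring players first would take A and B and then find no affordable third player, whereas the dynamic program finds a full team within budget. Constraints such as a maximum number of players per real-world team or per role would need extra state dimensions or a post-filtering step.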

Conclusions and Implications:

  • The paper underscores the superiority of regression models over classification models in predicting player performance for fantasy cricket applications.
  • By employing data engineering, feature transformation, and ML pipelines, the research demonstrates an effective methodology for analyzing cricket data and optimizing fantasy cricket team selections.
  • Future work could involve automating data update pipelines from Cricsheet.org to maintain real-time relevance of data, potentially enhancing the model’s predictive capabilities.

This research offers a tailored approach to fantasy sports analytics, with a strong foundation in ML algorithms and data-driven decision-making processes that can inform enthusiasts and analysts in building competitive fantasy teams.