- The paper presents PlayerTV, an automated system that integrates multi-object tracking, OCR, and color clustering to generate player-specific soccer highlights.
- Its modular pipeline, featuring advanced Deep-EIoU tracking and refined color analysis in RGB and CIELAB spaces, achieves 91.5% team mapping accuracy.
- The system includes an interactive GUI for parameter tuning and shows potential for sports analytics, although its current throughput of about one frame per second leaves real-time use as future work.
"PlayerTV: Advanced Player Tracking and Identification for Automatic Soccer Highlight Clips"
Introduction
The paper introduces PlayerTV, a framework that automatically generates player-specific highlight clips in soccer. It combines object detection and tracking, Optical Character Recognition (OCR), and color analysis to find a chosen player and compile targeted video clips. This automates much of the traditionally labor-intensive work of manually annotating and processing soccer videos, with applications in sports analytics and fan engagement.
Framework Overview
PlayerTV functions as a modular pipeline consisting of five primary components: player tracking, RGB analysis, OCR, team mapping, and player mapping, all wrapped in a user-friendly Graphical User Interface (GUI).
Figure 1: PlayerTV framework overview.
Player Tracking Module
The player tracking module is based on the Deep-EIoU tracker, which is optimized for Multi-Object Tracking (MOT) in sports scenarios and outperforms earlier trackers such as DeepSORT. For each tracklet, the module also extracts Intersection-over-Union (IoU) and BRISQUE image-quality scores, which later stages use to map player identities reliably.
Figure 2: Sample tracklets. Both noise from other players and occlusion make it difficult to correctly detect kit numbers.
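The IoU score mentioned above can be illustrated with a short sketch. The IoU formula itself is standard; how PlayerTV applies it is not spelled out in this summary, so the `max_overlap` helper (the largest overlap between one player's box and every other box in the same frame, as a proxy for occlusion) is an assumption, and all names here are hypothetical.

```python
from typing import Iterable, Tuple

# Axis-aligned box as (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-Union of two axis-aligned bounding boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the intersection are clamped at zero for disjoint boxes.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def max_overlap(target: Box, others: Iterable[Box]) -> float:
    """Hypothetical occlusion proxy: largest IoU between target and any other box."""
    return max((iou(target, o) for o in others), default=0.0)
```

A low `max_overlap` value would indicate a frame in which the player stands clear of others, which is the kind of crop where a kit number is more likely to be readable.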
RGB and Clustering
The RGB module extracts RGB values from specific regions of interest within the tracklets and clusters them in the RGB and CIELAB color spaces to differentiate team kits. This region-based color sampling is intended to keep the analysis accurate under diverse lighting conditions.
Figure 3: Sample regions of interest within tracklets.
Figure 4: Sample clustering. Illustrated in RGB space.
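The clustering idea can be sketched as follows. This is a minimal illustration, not the paper's implementation: it converts sRGB samples to CIELAB with the standard D65 formulas and groups them with a plain k-means. The function names, the choice of `k=2`, and the sample colors are all assumptions made for the example.

```python
import numpy as np


def srgb_to_lab(rgb) -> np.ndarray:
    """Convert 8-bit sRGB values (shape (..., 3)) to CIELAB under a D65 white point."""
    c = np.asarray(rgb, dtype=np.float64) / 255.0
    # Undo the sRGB gamma curve to obtain linear RGB.
    linear = np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)
    m = np.array([[0.4124564, 0.3575761, 0.1804375],
                  [0.2126729, 0.7151522, 0.0721750],
                  [0.0193339, 0.1191920, 0.9503041]])
    # Linear RGB -> XYZ, then normalise by the D65 reference white.
    xyz = (linear @ m.T) / np.array([0.95047, 1.0, 1.08883])
    delta = 6.0 / 29.0
    f = np.where(xyz > delta ** 3, np.cbrt(xyz), xyz / (3 * delta ** 2) + 4.0 / 29.0)
    L = 116.0 * f[..., 1] - 16.0
    a = 500.0 * (f[..., 0] - f[..., 1])
    b = 200.0 * (f[..., 1] - f[..., 2])
    return np.stack([L, a, b], axis=-1)


def kmeans(points, k: int = 2, iters: int = 20):
    """Plain k-means with deterministic farthest-point initialisation."""
    points = np.asarray(points, dtype=np.float64)
    centers = [points[0]]
    for _ in range(1, k):
        # Pick the point farthest from all centers chosen so far.
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Hypothetical mean kit colors, one row per tracklet (three reddish, three bluish).
kit_rgb = np.array([[200, 30, 40], [210, 25, 35], [190, 40, 50],
                    [20, 40, 180], [30, 50, 200], [25, 35, 190]])
labels, _ = kmeans(srgb_to_lab(kit_rgb), k=2)
```

Clustering in CIELAB rather than raw RGB separates lightness (`L`) from chromaticity (`a`, `b`), which is one common reason the space behaves better when lighting varies across the pitch.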
OCR Module
The OCR module uses PaddleOCR or EasyOCR to identify kit numbers from the cropped player images. A scoring function combining IoU and BRISQUE scores ranks the image crops so that clearer views are passed to OCR first. Suggested future work includes developing task-specific models to improve precision further.
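The crop-ranking step can be sketched as below. The exact scoring function is not given in this summary, so the weighted sum, the `w_overlap` parameter, and the `Crop` structure are hypothetical. The sketch assumes only that a lower overlap with other players and a lower BRISQUE score (BRISQUE is a no-reference quality metric where lower means better, typically in the range 0 to 100) both indicate a crop that is more suitable for OCR.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Crop:
    frame: int          # frame index the crop was taken from
    overlap_iou: float  # IoU with the nearest other player's box, in [0, 1]
    brisque: float      # BRISQUE score; lower is better, roughly in [0, 100]


def crop_score(crop: Crop, w_overlap: float = 0.5) -> float:
    """Hypothetical score in [0, 1]; higher means a better OCR candidate."""
    # Clamp both inputs so out-of-range values cannot dominate the ranking.
    isolation = 1.0 - min(max(crop.overlap_iou, 0.0), 1.0)
    quality = 1.0 - min(max(crop.brisque, 0.0), 100.0) / 100.0
    return w_overlap * isolation + (1.0 - w_overlap) * quality


def top_crops(crops: List[Crop], n: int = 3) -> List[Crop]:
    """Return the n highest-scoring crops, best first."""
    return sorted(crops, key=crop_score, reverse=True)[:n]
```

Only the top-ranked crops of a tracklet would then be sent to the OCR engine, which avoids spending OCR time on occluded or blurry frames.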
Evaluation
The efficacy of PlayerTV is tested on datasets from the Norwegian Eliteserien league. The evaluations focus on three main performance metrics: team mapping, OCR, and combined player mapping accuracy, with CIELAB color space demonstrating superior results compared to weighted RGB under complex conditions.
Figure 5: Sample frame from Video 9 - Challenging light conditions make weighted RGB perform worse than CIELAB.
Numerical Results
The overall team mapping accuracy reached 91.5% with CIELAB, while OCR accuracy for identifying correct kit numbers peaked at 30.6%. The combined player mapping, which depends on both components, achieved a slightly lower accuracy and is limited mainly by OCR, suggesting that improvements to OCR would raise overall performance.
Interactive GUI
The GUI sits on top of the core pipeline and lets users set parameters and produce custom clips. It currently handles live video input primarily as HLS playlists, with support for local MP4 files planned as a future expansion.
PlayerTV processes approximately 1 frame per second on a high-performance computing cluster, which leaves room for optimization through hardware acceleration and more efficient algorithms.
Conclusion
PlayerTV automates much of the video annotation process in sports analytics and shows promise for player performance analysis and fan engagement. Future development will likely focus on refining OCR accuracy and improving real-time processing capabilities to broaden the framework's applicability across different sports and scenarios. Because the project is open source, the community can build on it for further research into player tracking and automated highlight clip generation.