VideoBadminton: A Video Dataset for Badminton Action Recognition (2403.12385v1)

Published 19 Mar 2024 in cs.CV

Abstract: In the dynamic and evolving field of computer vision, action recognition has become a key focus, especially with the advent of sophisticated methodologies such as Convolutional Neural Networks (CNNs), 3D convolutional networks, Transformers, and spatio-temporal feature fusion. These technologies have shown promising results on well-established benchmarks but face unique challenges in real-world applications, particularly in sports analysis, where the precise decomposition of activities and the distinction of subtly different actions are crucial. Existing datasets like UCF101, HMDB51, and Kinetics offer a diverse range of video data for various scenarios. However, there is an increasing need for fine-grained video datasets that capture detailed categorizations and nuances within broader action categories. In this paper, we introduce the VideoBadminton dataset, derived from high-quality badminton footage. Through an exhaustive evaluation of leading methodologies on this dataset, this study aims to advance the field of action recognition, particularly in badminton. VideoBadminton serves not only as a benchmark for badminton action recognition but also as a dataset for recognizing fine-grained actions more broadly. The insights gained from these evaluations are expected to catalyze further research in action comprehension, especially within sports contexts.
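
The evaluation outlined in the abstract follows the standard clip-level action recognition protocol: each trimmed clip carries a single stroke label, and a model is scored by top-1 accuracy on a held-out split. The sketch below illustrates that protocol in PyTorch. The directory layout (VideoBadminton/test/<action_class>/<clip>.mp4), the frame-sampling parameters, and the tiny stand-in 3D CNN are assumptions made for illustration only; they are not the authors' released code or the pretrained video backbones actually benchmarked in the paper.

```python
# Minimal clip-level evaluation sketch (illustrative only).
# Assumes clips are organized as VideoBadminton/test/<action_class>/<clip>.mp4;
# TinyC3D is a placeholder for the pretrained video backbones a real study would use.
import glob
import os

import cv2
import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader


def sample_frames(path, num_frames=16, size=112):
    """Uniformly sample `num_frames` RGB frames from a video and resize them to `size` x `size`."""
    cap = cv2.VideoCapture(path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT)) or num_frames
    keep = set(np.linspace(0, max(total - 1, 0), num_frames).astype(int).tolist())
    frames, idx = [], 0
    while len(frames) < num_frames:
        ok, frame = cap.read()
        if not ok:
            break
        if idx in keep:
            frame = cv2.cvtColor(cv2.resize(frame, (size, size)), cv2.COLOR_BGR2RGB)
            frames.append(frame)
        idx += 1
    cap.release()
    while len(frames) < num_frames:  # pad short or unreadable clips
        frames.append(frames[-1] if frames else np.zeros((size, size, 3), np.uint8))
    clip = np.stack(frames).astype(np.float32) / 255.0  # (T, H, W, C)
    return torch.from_numpy(clip).permute(3, 0, 1, 2)   # (C, T, H, W)


class ClipFolderDataset(Dataset):
    """One label per clip, inferred from the parent folder name."""
    def __init__(self, root):
        self.classes = sorted(os.listdir(root))
        self.items = [(p, i) for i, c in enumerate(self.classes)
                      for p in glob.glob(os.path.join(root, c, "*.mp4"))]

    def __len__(self):
        return len(self.items)

    def __getitem__(self, i):
        path, label = self.items[i]
        return sample_frames(path), label


class TinyC3D(nn.Module):
    """Stand-in 3D CNN; real experiments would fine-tune a pretrained video backbone instead."""
    def __init__(self, num_classes):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool3d(1))
        self.head = nn.Linear(32, num_classes)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))


if __name__ == "__main__":
    ds = ClipFolderDataset("VideoBadminton/test")  # assumed path
    model = TinyC3D(num_classes=len(ds.classes)).eval()
    correct = 0
    with torch.no_grad():
        for clips, labels in DataLoader(ds, batch_size=4):
            correct += (model(clips).argmax(1) == labels).sum().item()
    print(f"top-1 accuracy: {correct / len(ds):.3f}")
```

In practice one would replace TinyC3D with a pretrained spatio-temporal backbone fine-tuned on the training split, and report both top-1 and top-5 accuracy; the loop above only shows the shape of the clip-level evaluation.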
