
SwimXYZ: A large-scale dataset of synthetic swimming motions and videos (2310.04360v1)

Published 6 Oct 2023 in cs.CV and cs.GR

Abstract: Technology plays an increasingly important role in sports and has become a real competitive advantage for the athletes who benefit from it. In particular, motion capture is increasingly used across sports to optimize athletic movements. Unfortunately, traditional motion capture systems are expensive and constraining. Recently developed computer vision-based approaches also struggle in certain sports, such as swimming, because of the aquatic environment. One reason for this performance gap is the lack of labeled datasets of swimming videos. To address this issue, we introduce SwimXYZ, a synthetic dataset of swimming motions and videos. SwimXYZ contains 3.4 million frames annotated with ground-truth 2D and 3D joints, as well as 240 sequences of swimming motions in the SMPL parameter format. In addition to making this dataset publicly available, we present use cases for SwimXYZ in swimming stroke clustering and 2D pose estimation.
