
DivaTrack: Diverse Bodies and Motions from Acceleration-Enhanced Three-Point Trackers (2402.09211v1)

Published 14 Feb 2024 in cs.CV and cs.AI

Abstract: Full-body avatar presence is crucial for immersive social and environmental interactions in digital reality. However, current devices provide only three six-degree-of-freedom (6-DOF) poses, from the headset and two controllers (i.e., three-point trackers). Because the problem is highly under-constrained, inferring full-body pose from these inputs is challenging, especially when supporting the full range of body proportions and use cases represented by the general population. In this paper, we propose a deep learning framework, DivaTrack, which outperforms existing methods when applied to diverse body sizes and activities. We augment the sparse three-point inputs with linear accelerations from Inertial Measurement Units (IMUs) to improve foot contact prediction. We then condition the otherwise ambiguous lower-body pose on the predictions of foot contact and upper-body pose in a two-stage model. We further stabilize the inferred full-body pose in a wide range of configurations by learning to blend predictions that are computed in two reference frames, each of which is designed for different types of motions. We demonstrate the effectiveness of our design on a large dataset that captures 22 subjects performing locomotion that is challenging for three-point tracking, including lunges, hula-hooping, and sitting. As shown in a live demo using the Meta VR headset and Xsens IMUs, our method runs in real time while accurately tracking a user's motion when they perform a diverse set of movements.
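
The idea of blending predictions computed in two reference frames can be illustrated with a minimal sketch. This is not DivaTrack's implementation. The function name `blend_joint_positions`, the choice of joint positions as the blended quantity, and the single scalar weight are all assumptions made for illustration; the abstract says the blend is learned, whereas here the weight is simply passed in. Both predictions are assumed to have already been transformed into one shared frame.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def blend_joint_positions(
    pred_a: List[Vec3], pred_b: List[Vec3], weight: float
) -> List[Vec3]:
    """Convex blend of two joint-position predictions (hypothetical helper).

    pred_a and pred_b are assumed to come from two models that reason in
    different reference frames, with outputs already mapped into a shared
    frame. weight is the share given to pred_a; in a learned system it would
    be predicted per frame rather than supplied by the caller.
    """
    if len(pred_a) != len(pred_b):
        raise ValueError("predictions must contain the same number of joints")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [
        tuple(weight * a + (1.0 - weight) * b for a, b in zip(joint_a, joint_b))
        for joint_a, joint_b in zip(pred_a, pred_b)
    ]


# Usage: one joint, with 25% of the weight on the first prediction.
blended = blend_joint_positions([(0.0, 0.0, 0.0)], [(2.0, 4.0, 6.0)], 0.25)
```

A convex combination keeps each blended joint on the segment between the two predictions, which is one simple way to avoid abrupt jumps when the preferred frame changes. Blending joint rotations would need a rotation-aware interpolation instead.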

Authors (6)
  1. Dongseok Yang (7 papers)
  2. Jiho Kang (3 papers)
  3. Lingni Ma (19 papers)
  4. Joseph Greer (1 paper)
  5. Yuting Ye (38 papers)
  6. Sung-Hee Lee (15 papers)
