
HMP: Hand Motion Priors for Pose and Shape Estimation from Video (2312.16737v1)

Published 27 Dec 2023 in cs.CV

Abstract: Understanding how humans interact with the world necessitates accurate 3D hand pose estimation, a task complicated by the hand's high degree of articulation, frequent occlusions, self-occlusions, and rapid motions. While most existing methods rely on single-image inputs, videos provide useful cues for addressing these issues. However, existing video-based 3D hand datasets are insufficient for training feedforward models that generalize to in-the-wild scenarios. On the other hand, large human motion capture datasets that also include hand motions, e.g., AMASS, are available. Therefore, we develop a generative motion prior specific to hands, trained on the AMASS dataset, which features diverse and high-quality hand motions. This motion prior is then employed for video-based 3D hand motion estimation following a latent optimization approach. Our integration of a robust motion prior significantly enhances performance, especially in occluded scenarios. It produces stable, temporally consistent results that surpass conventional single-frame methods. We demonstrate our method's efficacy via qualitative and quantitative evaluations on the HO3D and DexYCB datasets, with special emphasis on an occlusion-focused subset of HO3D. Code is available at https://hmp.is.tue.mpg.de
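
The latent optimization idea mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: a fixed linear cosine basis stands in for the learned generative motion prior, a single scalar per frame stands in for the hand pose, and every name here (`fit_latent`, `decoder`, `prior_weight`, and so on) is hypothetical. The sketch only shows the general pattern, in which a latent code is optimized so that the decoded motion matches the visible frames, and the decoder then fills in the occluded ones.

```python
import numpy as np

def fit_latent(decoder, observed, visible, prior_weight=1e-3, steps=500):
    """Gradient descent on z for ||visible * (decoder @ z - observed)||^2 + prior_weight * ||z||^2.

    `decoder` is an ASSUMED linear stand-in for a learned motion prior;
    `visible` is a 0/1 mask marking frames with usable observations.
    """
    z = np.zeros(decoder.shape[1])
    # Step size 1/L, where L bounds the Lipschitz constant of the gradient.
    lr = 1.0 / (2.0 * (np.linalg.norm(decoder, 2) ** 2 + prior_weight))
    losses = []
    for _ in range(steps):
        residual = visible * (decoder @ z - observed)
        losses.append(float(residual @ residual + prior_weight * (z @ z)))
        grad = 2.0 * decoder.T @ residual + 2.0 * prior_weight * z
        z = z - lr * grad
    return z, losses

# Toy "motion prior": trajectories are combinations of 4 smooth cosine modes.
num_frames, latent_dim = 30, 4
t = np.linspace(0.0, 1.0, num_frames)
decoder = np.stack([np.cos(np.pi * k * t) for k in range(latent_dim)], axis=1)

# Synthetic ground truth and noisy per-frame observations (illustrative values).
z_true = np.array([0.5, -0.3, 0.2, 0.1])
traj_true = decoder @ z_true
rng = np.random.default_rng(0)
observed = traj_true + 0.01 * rng.standard_normal(num_frames)

# Frames 10..19 are treated as occluded and contribute nothing to the data term.
visible = np.ones(num_frames)
visible[10:20] = 0.0

z_hat, losses = fit_latent(decoder, observed, visible)
traj_hat = decoder @ z_hat
occluded_error = float(np.max(np.abs(traj_hat - traj_true)[visible == 0]))
print(f"final loss: {losses[-1]:.6f}, max error on occluded frames: {occluded_error:.4f}")
```

Because the occluded frames are masked out of the data term, their values are determined entirely by what the decoder can represent, which is the role a motion prior plays in this kind of fitting. A real system would replace the linear basis with a trained generative model and the scalar trajectory with full hand pose and shape parameters.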

