Personalized 3D Human Pose and Shape Refinement (2403.11634v1)
Abstract: Recently, regression-based methods have dominated the field of 3D human pose and shape estimation. Despite their promising results, a common issue is the misalignment between predictions and image observations, often caused by minor joint rotation errors that accumulate along the kinematic chain. To address this issue, we propose to construct dense correspondences between initial human model estimates and the corresponding images that can be used to refine the initial predictions. To this end, we utilize renderings of the 3D models to predict per-pixel 2D displacements between the synthetic renderings and the RGB images. This allows us to effectively integrate and exploit appearance information of the persons. Our per-pixel displacements can be efficiently transformed to per-visible-vertex displacements and then used for 3D model refinement by minimizing a reprojection loss. To demonstrate the effectiveness of our approach, we refine the initial 3D human mesh predictions of multiple models using different refinement procedures on 3DPW and RICH. We show that our approach not only consistently leads to better image-model alignment, but also to improved 3D accuracy.
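The refinement step described in the abstract (per-pixel displacements turned into per-visible-vertex 2D targets, followed by minimizing a reprojection loss) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the pinhole camera, the nearest-neighbour lookup of the displacement field, and the Gauss-Newton solver are all assumptions made for illustration, and only a global 3D translation is optimized here as a stand-in for the full body model parameters. The displacement field and the visibility mask are taken as given, whereas in the paper they come from a network and from the rendering, respectively.

```python
import numpy as np

def project(vertices, focal, center):
    """Pinhole projection of (N, 3) camera-space points to (N, 2) pixel coordinates."""
    return focal * vertices[:, :2] / vertices[:, 2:3] + center

def sample_displacements(flow, pixels):
    """Nearest-neighbour lookup of an (H, W, 2) displacement field at (N, 2) pixels."""
    h, w, _ = flow.shape
    cols = np.clip(np.rint(pixels[:, 0]).astype(int), 0, w - 1)
    rows = np.clip(np.rint(pixels[:, 1]).astype(int), 0, h - 1)
    return flow[rows, cols]

def refine_translation(vertices, visible, flow, focal, center, steps=5):
    """Hypothetical refinement: fit a global translation t (an assumed stand-in
    for body model parameters) that minimizes the squared reprojection error
    between the moved visible vertices and their displaced 2D targets."""
    vis = vertices[visible]
    initial_pixels = project(vis, focal, center)
    # Per-visible-vertex targets: initial projection plus sampled displacement.
    targets = initial_pixels + sample_displacements(flow, initial_pixels)
    t = np.zeros(3)
    for _ in range(steps):
        moved = vis + t
        residual = (project(moved, focal, center) - targets).reshape(-1)
        z = moved[:, 2]
        # Jacobian of the projection with respect to the translation.
        jac = np.zeros((len(moved), 2, 3))
        jac[:, 0, 0] = focal / z
        jac[:, 1, 1] = focal / z
        jac[:, 0, 2] = -focal * moved[:, 0] / z**2
        jac[:, 1, 2] = -focal * moved[:, 1] / z**2
        # Gauss-Newton update: least-squares solution of jac @ step = -residual.
        step, *_ = np.linalg.lstsq(jac.reshape(-1, 3), -residual, rcond=None)
        t += step
    return t, targets

# Toy data: a planar grid of "vertices" 2 units in front of the camera and a
# constant 10-pixel rightward displacement field.
xs, ys = np.meshgrid(np.linspace(-0.3, 0.3, 5), np.linspace(-0.3, 0.3, 5))
vertices = np.stack([xs.ravel(), ys.ravel(), np.full(xs.size, 2.0)], axis=1)
visible = np.ones(len(vertices), dtype=bool)
visible[::4] = False  # pretend some vertices are occluded
flow = np.zeros((256, 256, 2))
flow[..., 0] = 10.0
focal, center = 500.0, np.array([128.0, 128.0])

t, targets = refine_translation(vertices, visible, flow, focal, center)
reprojection_error = np.abs(project(vertices[visible] + t, focal, center) - targets).max()
```

In this toy setup a uniform 10-pixel shift at depth 2 with a focal length of 500 pixels corresponds to a sideways translation of 10 · 2 / 500 = 0.04 units, which is what the fit should recover. A real refinement would instead optimize pose, shape, and camera parameters of the body model, typically with a gradient-based optimizer and additional priors.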