ShapeBoost: Boosting Human Shape Estimation with Part-Based Parameterization and Clothing-Preserving Augmentation (2403.01345v1)
Abstract: Accurate human shape recovery from a monocular RGB image is a challenging task because humans come in different shapes and sizes and wear different clothes. In this paper, we propose ShapeBoost, a new human shape recovery framework that achieves pixel-level alignment even for rare body shapes and high accuracy for people wearing different types of clothes. Unlike previous approaches that rely on the use of PCA-based shape coefficients, we adopt a new human shape parameterization that decomposes the human shape into bone lengths and the mean width of each part slice. This part-based parameterization technique achieves a balance between flexibility and validity using a semi-analytical shape reconstruction algorithm. Based on this new parameterization, a clothing-preserving data augmentation module is proposed to generate realistic images with diverse body shapes and accurate annotations. Experimental results show that our method outperforms other state-of-the-art methods in diverse body shape situations as well as in varied clothing situations.
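To make the part-based parameterization more concrete, here is a minimal sketch of the two quantities the abstract names: bone lengths and a per-slice mean width for one body part. This is not the authors' implementation. The function names (`bone_lengths`, `slice_widths`), the equal-length slicing along the bone, and the use of mean radial distance from the bone axis as a stand-in for "width" are all assumptions made for illustration, since the abstract does not give the exact definitions.

```python
import math

# Small 3D vector helpers (tuples of floats).
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def norm(a):
    return math.sqrt(dot(a, a))

def bone_lengths(joints, parents):
    """Distance from each joint to its parent; joints with parent -1 (roots) are skipped.

    Hypothetical helper: `joints` is a list of 3D points, `parents[j]` is the
    index of joint j's parent in a kinematic tree.
    """
    return {j: norm(sub(joints[j], joints[p]))
            for j, p in enumerate(parents) if p >= 0}

def slice_widths(vertices, start, end, num_slices):
    """Mean radial distance from the bone axis for each slice along the bone.

    Assumption: the bone from `start` to `end` is cut into `num_slices`
    equal-length slices, and "width" is approximated by the mean distance of
    the slice's vertices from the bone axis. Empty slices yield None.
    """
    axis = sub(end, start)
    length = norm(axis)
    unit = (axis[0] / length, axis[1] / length, axis[2] / length)
    sums = [0.0] * num_slices
    counts = [0] * num_slices
    for v in vertices:
        rel = sub(v, start)
        along = dot(rel, unit)          # signed distance along the bone
        t = along / length              # normalized position in [0, 1] if on the bone
        if not 0.0 <= t <= 1.0:
            continue                    # vertex lies beyond either end of the bone
        k = min(int(t * num_slices), num_slices - 1)
        foot = (unit[0] * along, unit[1] * along, unit[2] * along)
        sums[k] += norm(sub(rel, foot))  # perpendicular distance to the axis
        counts[k] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]

# Toy usage: a three-joint chain and a few vertices around one bone.
joints = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 3.0, 0.0)]
parents = [-1, 0, 1]
lengths = bone_lengths(joints, parents)

verts = [(1.0, 0.5, 0.0), (0.0, 0.5, -1.0), (0.5, 1.5, 0.0), (0.0, 5.0, 0.0)]
widths = slice_widths(verts, (0.0, 0.0, 0.0), (0.0, 2.0, 0.0), num_slices=2)
```

In this toy setup each part is described by one length and a short list of widths, which is the kind of local, per-part description the abstract contrasts with global PCA-based shape coefficients.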
Authors:
- Siyuan Bian
- Jiefeng Li
- Jiasheng Tang
- Cewu Lu