On Self-Contact and Human Pose (2104.03176v2)

Published 7 Apr 2021 in cs.CV

Abstract: People touch their face 23 times an hour, they cross their arms and legs, put their hands on their hips, etc. While many images of people contain some form of self-contact, current 3D human pose and shape (HPS) regression methods typically fail to estimate this contact. To address this, we develop new datasets and methods that significantly improve human pose estimation with self-contact. First, we create a dataset of 3D Contact Poses (3DCP) containing SMPL-X bodies fit to 3D scans as well as poses from AMASS, which we refine to ensure good contact. Second, we leverage this to create the Mimic-The-Pose (MTP) dataset of images, collected via Amazon Mechanical Turk, containing people mimicking the 3DCP poses with self-contact. Third, we develop a novel HPS optimization method, SMPLify-XMC, that includes contact constraints and uses the known 3DCP body pose during fitting to create near ground-truth poses for MTP images. Fourth, for more image variety, we label a dataset of in-the-wild images with Discrete Self-Contact (DSC) information and use another new optimization method, SMPLify-DC, that exploits discrete contacts during pose optimization. Finally, we use our datasets during SPIN training to learn a new 3D human pose regressor, called TUCH (Towards Understanding Contact in Humans). We show that the new self-contact training data significantly improves 3D human pose estimates on withheld test data and existing datasets like 3DPW. Not only does our method improve results for self-contact poses, but it also improves accuracy for non-contact poses. The code and data are available for research purposes at https://tuch.is.tue.mpg.de.


Summary

The paper introduces three datasets (3DCP, MTP, DSC) that capture and annotate self-contact in human poses.
It presents two SMPLify-based optimization methods (SMPLify-XMC and SMPLify-DC) and the TUCH regressor, which together improve pose estimation accuracy on both self-contact and non-contact poses.
The methodologies combine pose mimicking and discrete contact annotations to enhance applications in animation, VR, and human-computer interaction.

Paper Overview: On Self-Contact and Human Pose

The paper "On Self-Contact and Human Pose" by Müller et al. advances the study of 3D human pose estimation, focusing on scenarios where self-contact is present. Existing methods for human pose and shape regression often fail when self-contact occurs, such as when a person touches their face or crosses their arms. This research addresses the gap by proposing new datasets and methods that improve the accuracy of 3D human pose estimation in self-contact scenarios.

Key Contributions

New Datasets for Self-Contact: The authors introduce three datasets. The 3D Contact Poses (3DCP) dataset consists of SMPL-X bodies fitted to 3D scans, together with poses from AMASS that are refined to ensure good contact. The Mimic-The-Pose (MTP) dataset contains images collected through Amazon Mechanical Turk in which participants mimic the 3DCP poses with self-contact. The Discrete Self-Contact (DSC) dataset comprises in-the-wild images annotated with discrete self-contact information.
New Optimization Methods: Two optimization-based methods, SMPLify-XMC and SMPLify-DC, were developed. SMPLify-XMC leverages known contact information to optimize human poses for MTP images, producing poses close to the ground truth. SMPLify-DC applies discrete contact labels to improve pose estimation in real-world images.
TUCH: A New Regressor for Self-Contact: A new regression framework, TUCH (Towards Understanding Contact in Humans), was trained using the aforementioned datasets to accurately estimate human poses with self-contact. TUCH showed improved performance in estimating 3D human poses over existing methods, including SPIN, particularly in scenarios with self-contact.
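To make the idea of exploiting discrete contact labels more concrete, here is a minimal sketch of a penalty term that an optimizer could minimize. It is not the SMPLify-DC objective from the paper. The region names, the `regions` mapping, the toy vertex coordinates, and the choice of a squared minimum distance are all assumptions made for illustration.

```python
import math

def region_distance(vertices, region_a, region_b):
    """Smallest Euclidean distance between any vertex of region_a and region_b."""
    return min(
        math.dist(vertices[i], vertices[j])
        for i in region_a
        for j in region_b
    )

def discrete_contact_penalty(vertices, regions, contact_labels):
    """Sum of squared region-to-region gaps over all annotated contact pairs.

    The penalty is zero when every labeled pair of regions touches, and grows
    as the regions move apart. This is a hypothetical stand-in for a contact
    term, not the loss used in the paper.
    """
    return sum(
        region_distance(vertices, regions[a], regions[b]) ** 2
        for a, b in contact_labels
    )

# Toy data (hypothetical): two "hand" vertices and two "face" vertices.
REGIONS = {"left_hand": [0, 1], "face": [2, 3]}
LABELS = [("left_hand", "face")]  # annotation: the left hand touches the face

APART = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
TOUCHING = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 0.0, 0.0), (1.0, 2.0, 0.0)]

penalty_apart = discrete_contact_penalty(APART, REGIONS, LABELS)
penalty_touching = discrete_contact_penalty(TOUCHING, REGIONS, LABELS)
```

In this toy setup the penalty drops from 4.0 to 0.0 once a hand vertex coincides with a face vertex, which is the behavior a pose optimizer would be rewarded for.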

Methodology Insights

Self-Contact Definition and Detection: The paper defines self-contact as close Euclidean proximity between mesh vertices that are geodesically far apart on the body surface. The geodesic condition keeps the definition meaningful: it excludes trivial proximity between neighboring surface regions, such as the armpits or crotch.
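The definition can be sketched in a few lines. The example below is illustrative only: it approximates geodesic distance by shortest paths along mesh edges (Dijkstra), and the thresholds `t_eucl` and `t_geo`, the toy U-shaped vertex chain, and all function names are assumptions rather than values or code from the paper.

```python
import heapq
import math

def geodesic_distances(vertices, edges, source):
    """Approximate geodesic distances from `source` as shortest edge paths."""
    adjacency = {i: [] for i in range(len(vertices))}
    for a, b in edges:
        length = math.dist(vertices[a], vertices[b])
        adjacency[a].append((b, length))
        adjacency[b].append((a, length))
    dist = [math.inf] * len(vertices)
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, length in adjacency[u]:
            candidate = d + length
            if candidate < dist[v]:
                dist[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return dist

def self_contact_pairs(vertices, edges, t_eucl, t_geo):
    """Vertex pairs that are close in space but far apart along the surface."""
    pairs = []
    for i in range(len(vertices)):
        geo = geodesic_distances(vertices, edges, i)
        for j in range(i + 1, len(vertices)):
            close_in_space = math.dist(vertices[i], vertices[j]) < t_eucl
            far_on_surface = geo[j] > t_geo
            if close_in_space and far_on_surface:
                pairs.append((i, j))
    return pairs

# Toy "mesh": a chain of six vertices bent into a U, so the two ends nearly meet.
VERTS = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0),
         (2.0, 0.5, 0.0), (1.0, 0.5, 0.0), (0.0, 0.5, 0.0)]
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]

contacts = self_contact_pairs(VERTS, EDGES, t_eucl=0.6, t_geo=2.0)
```

Vertices 2 and 3 are also only 0.5 apart in space, but they are direct neighbors along the chain, so the geodesic condition rejects them. This mirrors how the definition filters out regions that are always close by construction.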

Pose Mimicking Strategy: To overcome the scarcity of images annotated with 3D self-contact, the authors devised a pose-mimicking strategy in which Amazon Mechanical Turk participants imitate 3DCP poses in front of a camera. Because the reference 3D pose behind each image is known, SMPLify-XMC can use it during fitting to produce near ground-truth poses for the MTP images.

SMPLify Extensions: The advancements in SMPLify, particularly incorporating knowledge of self-contact, significantly mitigate errors in existing methods. These modifications are crucial for producing accurate self-contact pose estimations from images.

Training and Annotations: By combining both direct supervision from MTP pseudo ground-truth and optimization from DSC annotations, TUCH achieves a robust understanding of self-contact dynamics. Notably, self-contact data improves general pose estimation, demonstrating the utility of contact information in reducing pose ambiguity.

Implications and Future Directions

This work has significant applied and theoretical implications. Practically, it enhances applications in animation, virtual reality, and human-computer interaction, where accurate human pose estimation is vital. Theoretically, it encourages the design of algorithms that can interpret subtle interactions and contact dynamics.

For future research, there is potential to explore:

Extending the datasets with diverse demographics and environments to cover a broader range of self-contact scenarios.
Integrating motion dynamics alongside static pose estimation to capture contact over time, enhancing applications in dynamic environments.
Exploring cross-modal datasets, blending visual cues with textual or sensor data, to strengthen the understanding of human self-contact behaviors in AI systems.

In conclusion, the advancements presented in this paper address a critical limitation in 3D human pose estimation by incorporating self-contact awareness, leading to more accurate and realistic human modeling. The methodologies and insights presented can serve as a foundation for further exploration and development in the field of human pose estimation.
