Probabilistic Modeling for Human Mesh Recovery (2108.11944v1)

Published 26 Aug 2021 in cs.CV

Abstract: This paper focuses on the problem of 3D human reconstruction from 2D evidence. Although this is an inherently ambiguous problem, the majority of recent works avoid the uncertainty modeling and typically regress a single estimate for a given input. In contrast to that, in this work, we propose to embrace the reconstruction ambiguity and we recast the problem as learning a mapping from the input to a distribution of plausible 3D poses. Our approach is based on the normalizing flows model and offers a series of advantages. For conventional applications, where a single 3D estimate is required, our formulation allows for efficient mode computation. Using the mode leads to performance that is comparable with the state of the art among deterministic unimodal regression models. Simultaneously, since we have access to the likelihood of each sample, we demonstrate that our model is useful in a series of downstream tasks, where we leverage the probabilistic nature of the prediction as a tool for more accurate estimation. These tasks include reconstruction from multiple uncalibrated views, as well as human model fitting, where our model acts as a powerful image-based prior for mesh recovery. Our results validate the importance of probabilistic modeling, and indicate state-of-the-art performance across a variety of settings. Code and models are available at: https://www.seas.upenn.edu/~nkolot/projects/prohmr.

Citations (160)

Summary

The paper proposes a novel probabilistic approach using Conditional Normalizing Flows to address the inherent ambiguity in recovering 3D human meshes from 2D images.
The mode of the learned distribution achieves performance comparable to state-of-the-art deterministic models, and using the distribution as a prior alongside additional cues such as multi-view images significantly improves accuracy.
The probabilistic model is flexible, applicable at test-time without task-specific training, and adaptable to related problems such as lifting 2D poses to 3D skeletons.

Probabilistic Modeling for Human Mesh Recovery

In the paper "Probabilistic Modeling for Human Mesh Recovery," the authors address the challenge of reconstructing 3D human poses from 2D images, a task that is inherently ambiguous because many 3D poses can explain the same 2D evidence. Most existing methods regress a single deterministic estimate for a given input, perhaps because this is easier to evaluate on standard benchmarks and to apply in practice. This paper instead proposes a probabilistic approach that embraces the reconstruction ambiguity and learns a mapping from 2D inputs to a distribution of plausible 3D poses.

The primary methodological contribution is the use of Conditional Normalizing Flows to model the distribution of 3D poses given the input. Compared with regressing a single estimate, this offers several advantages, including efficient computation of the likelihood of any sample and efficient estimation of the mode of the distribution. The paper emphasizes that the mode can be computed in closed form and that, in conventional scenarios requiring a single 3D estimate, using the mode achieves performance on par with state-of-the-art unimodal regression models.
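To make the two properties above concrete (exact sample likelihoods and a cheap mode), here is a minimal sketch of a conditional flow with a single affine layer. It is an illustration under stated assumptions, not the paper's architecture: the `conditioner` function, its coefficients, the 2D output, and the scalar context `c` are all hypothetical stand-ins for an image encoder and a pose vector. In this affine toy case the image of z = 0 is exactly the mode; whether the same holds for a deeper flow depends on its Jacobian.

```python
import math
import random

def conditioner(c):
    """Hypothetical conditioner: maps a scalar context c to a per-dimension
    shift and log-scale. A real model would use image features instead."""
    mu = [0.5 * c, -1.0 * c]
    log_sigma = [0.1 * c, -0.2]
    return mu, log_sigma

def forward(z, c):
    """Latent z -> output x through one conditional affine layer."""
    mu, log_sigma = conditioner(c)
    return [m + math.exp(s) * zi for m, s, zi in zip(mu, log_sigma, z)]

def inverse(x, c):
    """Output x -> latent z (exact inverse of forward)."""
    mu, log_sigma = conditioner(c)
    return [(xi - m) * math.exp(-s) for xi, m, s in zip(x, mu, log_sigma)]

def log_prob(x, c):
    """Exact log-likelihood via the change-of-variables formula:
    log p(x|c) = log N(z; 0, I) - sum(log_sigma), with z = inverse(x, c)."""
    _, log_sigma = conditioner(c)
    z = inverse(x, c)
    log_base = sum(-0.5 * zi * zi - 0.5 * math.log(2.0 * math.pi) for zi in z)
    return log_base - sum(log_sigma)

def mode(c):
    """For this affine layer, the mode is the image of z = 0."""
    return forward([0.0, 0.0], c)

def sample(c, rng):
    """Draw one sample by pushing standard-normal noise through the flow."""
    z = [rng.gauss(0.0, 1.0) for _ in range(2)]
    return forward(z, c)

if __name__ == "__main__":
    rng = random.Random(0)
    c = 0.7
    x_mode = mode(c)
    x_rand = sample(c, rng)
    print("mode:", x_mode, "log p:", log_prob(x_mode, c))
    print("sample:", x_rand, "log p:", log_prob(x_rand, c))
```

A single forward pass gives the point estimate for conventional use, while `log_prob` scores any candidate output, which is what the downstream tasks rely on.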

The approach is most useful when additional input cues are available, such as multiple uncalibrated views or 2D keypoints. In these settings the learned distribution acts as an image-based prior for mesh recovery, and combining it with the other sources of evidence improves accuracy. Because the same model is applied at test time without task-specific training, it can be reused across these tasks as is.
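The idea of combining evidence through likelihoods can be sketched as follows. This is a toy under loud assumptions: each view's conditional distribution is replaced by a diagonal Gaussian (a stand-in for the learned flow), the pose is a 2D vector, and the per-view means and standard deviations in `views` are made up for illustration. The fused estimate maximizes the sum of per-view log-likelihoods, here by plain gradient ascent; for Gaussians the answer is also available in closed form as a precision-weighted mean, which serves as a check.

```python
import math

def view_log_prob(x, mu, sigma):
    """Log-density of a diagonal Gaussian, standing in for one view's
    conditional distribution over the shared pose x."""
    return sum(
        -0.5 * ((xi - m) / s) ** 2 - math.log(s) - 0.5 * math.log(2.0 * math.pi)
        for xi, m, s in zip(x, mu, sigma)
    )

def joint_log_prob(x, views):
    """Sum of per-view log-likelihoods (views treated as independent evidence)."""
    return sum(view_log_prob(x, mu, sigma) for mu, sigma in views)

def fuse(views, steps=500, lr=0.05):
    """Gradient ascent on the joint log-likelihood, starting from the
    unweighted average of the per-view means."""
    dim = len(views[0][0])
    x = [sum(mu[d] for mu, _ in views) / len(views) for d in range(dim)]
    for _ in range(steps):
        grad = [
            sum((mu[d] - x[d]) / sigma[d] ** 2 for mu, sigma in views)
            for d in range(dim)
        ]
        x = [xi + lr * g for xi, g in zip(x, grad)]
    return x

def fuse_closed_form(views):
    """Precision-weighted mean: the exact maximizer in the Gaussian case."""
    dim = len(views[0][0])
    fused = []
    for d in range(dim):
        weights = [1.0 / sigma[d] ** 2 for _, sigma in views]
        total = sum(w * mu[d] for w, (mu, _) in zip(weights, views))
        fused.append(total / sum(weights))
    return fused

# Hypothetical per-view predictions: (mean, standard deviation) per dimension.
views = [([0.0, 1.0], [0.5, 1.0]), ([1.0, 0.0], [1.0, 0.25])]

if __name__ == "__main__":
    print("gradient ascent:", fuse(views))
    print("closed form:   ", fuse_closed_form(views))
```

Each dimension of the fused estimate is pulled toward whichever view is more confident about it, which is the behavior one would want from likelihood-based fusion. With a real flow there is no closed form, so some iterative optimization of the joint objective would be needed.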

Quantitative evaluations on datasets such as 3DPW, Human3.6M, and MPI-INF-3DHP demonstrate the effectiveness of this approach, with the probabilistic model matching and, in some cases, exceeding the performance of existing deterministic methods. Furthermore, the paper reports substantial improvements in 3D pose accuracy when leveraging the learned distribution in downstream tasks such as model fitting and multi-view fusion.

The authors also explore the potential of their conditional modeling framework in alternative scenarios, such as lifting 2D poses to 3D skeleton representations, showing that this methodology is not limited to human mesh recovery but adaptable across diverse inputs and outputs.

Probabilistic modeling of this kind gives 3D human pose estimation a more principled way to handle the inherent ambiguities of the problem. It is particularly relevant in settings where integrating multiple sources of evidence can substantially improve pose accuracy. Future studies might extend the approach to other object classes or address additional ambiguities, such as the depth-size trade-off, to further improve the recovery of 3D information from 2D observations.
