ECON: Explicit Clothed humans Optimized via Normal integration (2212.07422v2)

Published 14 Dec 2022 in cs.CV, cs.AI, and cs.GR

Abstract: The combination of deep learning, artist-curated scans, and Implicit Functions (IF), is enabling the creation of detailed, clothed, 3D humans from images. However, existing methods are far from perfect. IF-based methods recover free-form geometry, but produce disembodied limbs or degenerate shapes for novel poses or clothes. To increase robustness for these cases, existing work uses an explicit parametric body model to constrain surface reconstruction, but this limits the recovery of free-form surfaces such as loose clothing that deviates from the body. What we want is a method that combines the best properties of implicit representation and explicit body regularization. To this end, we make two key observations: (1) current networks are better at inferring detailed 2D maps than full-3D surfaces, and (2) a parametric model can be seen as a "canvas" for stitching together detailed surface patches. Based on these, our method, ECON, has three main steps: (1) It infers detailed 2D normal maps for the front and back side of a clothed person. (2) From these, it recovers 2.5D front and back surfaces, called d-BiNI, that are equally detailed, yet incomplete, and registers these w.r.t. each other with the help of a SMPL-X body mesh recovered from the image. (3) It "inpaints" the missing geometry between d-BiNI surfaces. If the face and hands are noisy, they can optionally be replaced with the ones of SMPL-X. As a result, ECON infers high-fidelity 3D humans even in loose clothes and challenging poses. This goes beyond previous methods, according to the quantitative evaluation on the CAPE and Renderpeople datasets. Perceptual studies also show that ECON's perceived realism is better by a large margin. Code and models are available for research purposes at econ.is.tue.mpg.de

Summary

The paper introduces ECON, a novel method for reconstructing detailed 3D clothed human models from single color images.
The paper employs a three-stage process: inference of detailed front and back normal maps, 2.5D surface reconstruction via a depth-aware variant of Bilateral Normal Integration (d-BiNI), and shape completion with IF-Nets+.
The paper demonstrates ECON's superiority over existing methods with significant reductions in Chamfer and P2S distances on CAPE and Renderpeople datasets.

An Insightful Overview of ECON: Explicit Clothed Humans Optimized via Normal Integration

This paper introduces ECON, a method for reconstructing 3D clothed human models from single color images. ECON combines the free-form flexibility of implicit representations with the regularization of an explicit parametric body model, aiming for reconstructions that are both detailed and robust to novel poses and clothing.

Methodology

ECON operates through three distinct stages:

(1) Normal Map Inference: The process begins by predicting detailed 2D normal maps for the front and back sides of a clothed person. This step exploits the observation that current networks are better at inferring detailed 2D maps than full 3D surfaces, and it provides the foundation for the subsequent 3D reconstruction.
(2) Surface Reconstruction: The normal maps are converted into 2.5D front and back partial surfaces. The authors adapt Bilateral Normal Integration (BiNI) into a depth-aware variant, d-BiNI, that incorporates a coarse body prior, and they register the two detailed surfaces with respect to each other using a SMPL-X body mesh recovered from the image.
(3) 3D Shape Completion: In the final step, the missing geometry between the front and back surfaces is "inpainted." An enhanced version of IF-Nets, termed IF-Nets+, is conditioned on the SMPL-X body to provide robustness to pose variations and occlusions. If the face and hands are noisy, they can optionally be replaced with those of SMPL-X.
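To make the surface reconstruction stage more concrete, here is a minimal sketch of how a normal map can be integrated into a depth map while being anchored to a coarse depth prior. It is not the authors' d-BiNI: it is a plain least-squares normal integration that assumes an orthographic camera, normals proportional to (-dz/dx, -dz/dy, 1), and forward differences. The function name `integrate_normals`, the weight `lam`, and the dense solver are all illustrative assumptions, and the dense system only suits small maps.

```python
import numpy as np


def integrate_normals(normals, mask, depth_prior, lam=1e-3):
    """Least-squares normal integration with a soft depth prior (illustrative only).

    normals:     (H, W, 3) unit normals, assumed proportional to (-dz/dx, -dz/dy, 1)
    mask:        (H, W) boolean foreground mask
    depth_prior: (H, W) coarse depth, e.g. rendered from a body mesh (assumption)
    lam:         weight of the depth-prior term (hypothetical parameter)
    Returns an (H, W) depth map with NaN outside the mask.
    """
    H, W = mask.shape
    n = int(mask.sum())
    idx = np.full((H, W), -1, dtype=int)
    idx[mask] = np.arange(n)  # unknown index for every foreground pixel

    nz = normals[..., 2]
    nz = np.where(np.abs(nz) < 1e-6, 1e-6, nz)  # avoid division by zero
    p = -normals[..., 0] / nz  # target dz/dx (x runs along columns)
    q = -normals[..., 1] / nz  # target dz/dy (y runs along rows)

    rows, rhs = [], []
    w = np.sqrt(lam)
    for i in range(H):
        for j in range(W):
            if not mask[i, j]:
                continue
            k = idx[i, j]
            if j + 1 < W and mask[i, j + 1]:  # z[i, j+1] - z[i, j] = p[i, j]
                r = np.zeros(n)
                r[idx[i, j + 1]], r[k] = 1.0, -1.0
                rows.append(r)
                rhs.append(p[i, j])
            if i + 1 < H and mask[i + 1, j]:  # z[i+1, j] - z[i, j] = q[i, j]
                r = np.zeros(n)
                r[idx[i + 1, j]], r[k] = 1.0, -1.0
                rows.append(r)
                rhs.append(q[i, j])
            r = np.zeros(n)  # soft constraint: z[i, j] close to the prior
            r[k] = w
            rows.append(r)
            rhs.append(w * depth_prior[i, j])

    z, *_ = np.linalg.lstsq(np.stack(rows), np.array(rhs), rcond=None)
    depth = np.full((H, W), np.nan)
    depth[mask] = z
    return depth


# Toy usage: a tilted plane whose normals are known and whose prior is flat.
ys, xs = np.mgrid[0:8, 0:8].astype(float)
true_depth = 0.5 * xs - 0.25 * ys
toy_normals = np.dstack([-0.5 * np.ones_like(xs), 0.25 * np.ones_like(xs), np.ones_like(xs)])
toy_normals /= np.linalg.norm(toy_normals, axis=-1, keepdims=True)
toy_prior = np.full((8, 8), true_depth.mean())
toy_depth = integrate_normals(toy_normals, np.ones((8, 8), dtype=bool), toy_prior, lam=1e-6)
```

The design point this illustrates is the division of labor: the normals supply the fine surface detail (the gradients), while the coarse prior only fixes the overall depth placement, which normals alone cannot determine. A practical implementation would use a sparse solver and, as in BiNI, handle depth discontinuities rather than assuming a smooth surface everywhere.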

Numerical Results

On the CAPE and Renderpeople datasets, ECON outperforms existing methods both quantitatively and qualitatively. It reports lower Chamfer and point-to-surface (P2S) distances, indicating more accurate geometry, while balancing robustness to challenging poses with the flexibility to handle varied clothing topologies. Perceptual studies also rate ECON's results as more realistic by a large margin.
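For readers unfamiliar with the two metrics, the following sketch shows one common way to compute them on sampled point sets. It is an approximation under stated assumptions, not the paper's evaluation code: both surfaces are represented by sampled points, distances are unsquared Euclidean distances, Chamfer is taken as the average of the two directional means, and P2S is approximated by the one-directional mean from scan points to samples of the reconstruction. The function names are hypothetical, and conventions (squared vs. unsquared, sum vs. average) vary between papers.

```python
import numpy as np


def nearest_distances(src, dst):
    """For each point in src (N, 3), the Euclidean distance to its nearest point in dst (M, 3)."""
    diff = src[:, None, :] - dst[None, :, :]  # (N, M, 3) pairwise differences
    return np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)


def chamfer_distance(a, b):
    """Symmetric Chamfer distance: average of the two directional mean distances."""
    return 0.5 * (nearest_distances(a, b).mean() + nearest_distances(b, a).mean())


def p2s_distance_approx(scan_points, recon_samples):
    """One-directional distance from scan points to samples of the reconstructed surface.

    A true point-to-surface distance measures against the mesh triangles;
    nearest sampled point is used here as a simple stand-in.
    """
    return nearest_distances(scan_points, recon_samples).mean()


# Toy usage with two tiny point sets.
scan = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
recon = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 1.0]])
cd = chamfer_distance(scan, recon)
p2s = p2s_distance_approx(scan, recon)
```

The brute-force pairwise computation is quadratic in the number of points; for dense scans a spatial index such as a k-d tree would be the usual choice.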

Implications and Future Developments

ECON's ability to effectively combine implicit and explicit techniques opens avenues for applications in virtual reality, gaming, and avatar generation for the metaverse. Future research could focus on extending ECON to handle dynamic scenes with multi-person interactions or investigating more efficient implementations that reduce computational demands. Enhancing ECON with texture and kinematic data could further enrich its applicability for realistic avatar creation.

While ECON marks progress, it relies on accurate SMPL-X estimates, and recovering a precise body from a single image remains only partially solved. Errors in the estimated body can therefore propagate into the final reconstruction. As synthetic training data becomes more sophisticated, the gap between performance on synthetic data and on real-world images may narrow further.

The release of ECON’s code and models for research underscores its potential as a foundational tool in 3D human reconstruction, encouraging exploration and innovation within the domain.

In conclusion, ECON advances the reconstruction of detailed 3D clothed humans from single images. By combining normal map inference, 2.5D surface reconstruction, and shape completion guided by a parametric body, it handles loose clothing and challenging poses more robustly than prior methods.

GitHub

GitHub - YuliangXiu/ECON: [CVPR'23, Highlight] ECON: Explicit Clothed humans Optimized via Normal integration (1,164 stars)
