Surface-Aligned Neural Radiance Fields for Controllable 3D Human Synthesis (2201.01683v2)

Published 5 Jan 2022 in cs.CV

Abstract: We propose a new method for reconstructing controllable implicit 3D human models from sparse multi-view RGB videos. Our method defines the neural scene representation on the mesh surface points and signed distances from the surface of a human body mesh. We identify an indistinguishability issue that arises when a point in 3D space is mapped to its nearest surface point on a mesh for learning surface-aligned neural scene representation. To address this issue, we propose projecting a point onto a mesh surface using a barycentric interpolation with modified vertex normals. Experiments with the ZJU-MoCap and Human3.6M datasets show that our approach achieves a higher quality in a novel-view and novel-pose synthesis than existing methods. We also demonstrate that our method easily supports the control of body shape and clothes. Project page: https://pfnet-research.github.io/surface-aligned-nerf/.

Summary

The paper introduces a novel method combining Neural Radiance Fields (NeRF) with parametric body models (like SMPL) using surface alignment for controllable 3D human synthesis from sparse multi-view videos.
Key to the approach is a dispersed projection that maps each spatial point to a distinct location on the mesh surface plus a signed height, which improves the rendering of fine detail.
Experiments on ZJU-MoCap and Human3.6M show better novel-view and novel-pose synthesis than existing methods, measured with metrics such as PSNR and SSIM, with potential applications such as avatars and virtual reality.

Surface-Aligned Neural Radiance Fields for Controllable 3D Human Synthesis

The paper "Surface-Aligned Neural Radiance Fields for Controllable 3D Human Synthesis" introduces an approach to reconstructing 3D human models with controllable pose, shape, and clothing from sparse multi-view RGB videos. The method combines neural radiance fields (NeRF) with a parametric body model such as SMPL to support free-viewpoint rendering of the reconstructed human.

Key Contributions and Methodology

The authors address a problem specific to surface-aligned representations: when a point in 3D space is mapped to its nearest point on the body mesh, different spatial points can receive the same surface point and the same distance, so the network cannot distinguish them. This hinders accurate rendering of detailed human shape and appearance. To overcome it, the paper introduces a dispersed projection that projects a point onto the mesh using barycentric interpolation with modified vertex normals. The result is a surface-aligned representation consisting of a projected surface point and a signed height relative to the mesh surface, intended to be unique for each spatial point.
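The indistinguishability issue can be made concrete with a single triangle. The following Python sketch is an illustration, not the authors' code: `closest_point_on_triangle` is a standard region-based closest-point routine, while the triangle, the query points, and the `nearest_point_coords` helper are assumptions chosen for the example. Two distinct points that lie beyond the corner at vertex `a` map to the same nearest surface point and the same signed distance, so a network that sees only that pair cannot tell them apart.

```python
import numpy as np

def closest_point_on_triangle(p, a, b, c):
    """Closest point to p on triangle (a, b, c), by Voronoi-region tests."""
    ab, ac, ap = b - a, c - a, p - a
    d1, d2 = ab @ ap, ac @ ap
    if d1 <= 0 and d2 <= 0:
        return a                                  # vertex region of a
    bp = p - b
    d3, d4 = ab @ bp, ac @ bp
    if d3 >= 0 and d4 <= d3:
        return b                                  # vertex region of b
    vc = d1 * d4 - d3 * d2
    if vc <= 0 and d1 >= 0 and d3 <= 0:
        return a + (d1 / (d1 - d3)) * ab          # edge ab
    cp = p - c
    d5, d6 = ab @ cp, ac @ cp
    if d6 >= 0 and d5 <= d6:
        return c                                  # vertex region of c
    vb = d5 * d2 - d1 * d6
    if vb <= 0 and d2 >= 0 and d6 <= 0:
        return a + (d2 / (d2 - d6)) * ac          # edge ac
    va = d3 * d6 - d5 * d4
    if va <= 0 and (d4 - d3) >= 0 and (d5 - d6) >= 0:
        w = (d4 - d3) / ((d4 - d3) + (d5 - d6))
        return b + w * (c - b)                    # edge bc
    denom = 1.0 / (va + vb + vc)
    return a + (vb * denom) * ab + (vc * denom) * ac  # face interior

def nearest_point_coords(p, a, b, c):
    """Hypothetical helper: (nearest surface point, signed distance)."""
    q = closest_point_on_triangle(p, a, b, c)
    n = np.cross(b - a, c - a)                    # face normal (unnormalized)
    sign = 1.0 if (p - q) @ n >= 0 else -1.0
    return q, sign * np.linalg.norm(p - q)

# Assumed example geometry: a right triangle in the z = 0 plane.
a = np.array([0.0, 0.0, 0.0])
b = np.array([1.0, 0.0, 0.0])
c = np.array([0.0, 1.0, 0.0])

# Two different query points outside the corner at vertex a.
p1 = np.array([-1.0, 0.0, 1.0])
p2 = np.array([0.0, -1.0, 1.0])

q1, h1 = nearest_point_coords(p1, a, b, c)
q2, h2 = nearest_point_coords(p2, a, b, c)
print(q1, h1)
print(q2, h2)
```

Both calls return vertex `a` and the same distance, even though `p1` and `p2` are different points in space.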

This surface-aligned representation is the cornerstone of the approach: because it is defined on the mesh of a parametric body model, the reconstructed human moves with the model. Feeding surface point positions and signed heights to the neural network improves its ability to generalize to novel poses and views, as the experiments demonstrate.
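A surface-aligned coordinate of this kind can be sketched in Python as follows. This is one plausible reading of "barycentric interpolation with vertex normals", not the authors' implementation: the function names, the example triangle, the per-vertex normals, and the Newton solver with a finite-difference Jacobian are all assumptions, and the way the paper modifies the vertex normals is not reproduced here. The forward map places a point at height `h` along the interpolated normal; the inverse map recovers `(u, v, h)` from a 3D point.

```python
import numpy as np

def unit(x):
    """Return x scaled to unit length."""
    return x / np.linalg.norm(x)

def surface_to_world(uvh, verts, normals):
    """Map barycentric (u, v) and height h to a 3D point.

    verts, normals: (3, 3) arrays, one row per triangle vertex.
    """
    u, v, h = uvh
    w = np.array([1.0 - u - v, u, v])   # barycentric weights
    s = w @ verts                        # point on the triangle plane
    n = unit(w @ normals)                # interpolated unit normal
    return s + h * n

def world_to_surface(p, verts, normals, iters=20, eps=1e-6):
    """Invert surface_to_world with Newton's method (an assumed solver)."""
    x = np.array([1.0 / 3.0, 1.0 / 3.0, 0.0])   # start at the centroid, h = 0
    for _ in range(iters):
        r = surface_to_world(x, verts, normals) - p
        # Finite-difference Jacobian, one column per unknown (u, v, h).
        J = np.stack(
            [(surface_to_world(x + eps * e, verts, normals) - p - r) / eps
             for e in np.eye(3)],
            axis=1,
        )
        x = x - np.linalg.solve(J, r)
    return x

# Assumed example: a flat triangle with vertex normals that fan outward.
verts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
normals = np.stack([
    unit(np.array([-0.3, -0.3, 1.0])),
    unit(np.array([0.3, -0.1, 1.0])),
    unit(np.array([-0.1, 0.3, 1.0])),
])

uvh_true = np.array([0.2, 0.5, 0.3])
p = surface_to_world(uvh_true, verts, normals)
uvh_rec = world_to_surface(p, verts, normals)
print(uvh_rec)
```

Because the interpolated normal varies continuously across the face, nearby points in this example receive distinct `(u, v, h)` triples, which is the property the nearest-point mapping lacks. The same `(u, v, h)` can also be re-evaluated on a deformed mesh by passing different `verts` and `normals` to `surface_to_world`.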

Experimental Insights

In experiments on the ZJU-MoCap and Human3.6M datasets, the authors report improved results over existing methods in novel-view and novel-pose synthesis. PSNR and SSIM scores indicate higher rendering quality, especially for unseen poses. Qualitative results show sharper facial details and better-preserved clothing textures across a range of motions, supporting the surface-aligned approach.
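For reference, PSNR can be computed with a few lines of Python. This is a minimal sketch of the standard definition, assuming images stored as floating-point arrays in the range [0, 1]; the function name, the `max_val` parameter, and the toy images are illustrative, and the paper's exact evaluation protocol (for example any masking or cropping) is not reproduced here.

```python
import numpy as np

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mse = np.mean((pred - target) ** 2)
    if mse == 0:
        return float("inf")              # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# Toy example: a uniform error of 0.1 gives an MSE of 0.01.
target = np.full((4, 4, 3), 0.5)
pred = target + 0.1
value = psnr(pred, target)
print(value)  # approximately 20 dB
```

Higher values mean the rendered image is closer to the ground truth; SSIM complements PSNR by comparing local structure rather than per-pixel error.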

Theoretical and Practical Implications

The surface-aligned NeRF is a technical advance in controllable human modeling. Because the representation is aligned to the mesh surface, the method allows straightforward changes to body shape (through SMPL parameters) and to clothing, which reduces the training complexity associated with deformation fields. This supports applications ranging from virtual avatar creation and cinematic effects to clothing design and human-computer interaction.

Theoretically, the paper opens avenues for further exploration of mesh-based transformation techniques in NeRF, suggesting broader implications for reconstructing various object types with mesh structures. Potential expansions could involve integrating detailed parametric models such as SMPL-X or deploying this methodology for interactive visual applications beyond human modeling.

Future Developments

Future work could apply the surface-aligned representation to more complex and detailed mesh models. Combining it with interactive 3D mesh deformation methods may also enable real-time manipulation in virtual reality environments, changing how synthetic human models and other 3D objects are generated and used.

In conclusion, the paper advances controllable 3D human synthesis and gives researchers and practitioners a method to build on for creating detailed, controllable 3D models.