Pose-NDF: Modeling Human Pose Manifolds with Neural Distance Fields (2207.13807v1)

Published 27 Jul 2022 in cs.CV

Abstract: We present Pose-NDF, a continuous model for plausible human poses based on neural distance fields (NDFs). Pose or motion priors are important for generating realistic new poses and for reconstructing accurate poses from noisy or partial observations. Pose-NDF learns a manifold of plausible poses as the zero level set of a neural implicit function, extending the idea of modeling implicit surfaces in 3D to the high-dimensional domain SO(3)^K, where a human pose is defined by a single data point, represented by K quaternions. The resulting high-dimensional implicit function can be differentiated with respect to the input poses and thus can be used to project arbitrary poses onto the manifold by using gradient descent on the set of 3-dimensional hyperspheres. In contrast to previous VAE-based human pose priors, which transform the pose space into a Gaussian distribution, we model the actual pose manifold, preserving the distances between poses. We demonstrate that PoseNDF outperforms existing state-of-the-art methods as a prior in various downstream tasks, ranging from denoising real-world human mocap data, pose recovery from occluded data to 3D pose reconstruction from images. Furthermore, we show that it can be used to generate more diverse poses by random sampling and projection than VAE-based methods.

Summary

The paper introduces Pose-NDF, a neural distance field approach that directly models human pose manifolds in SO(3)^K using quaternions for smooth and realistic pose generation.
It uses a hierarchical network that encodes the pose along the skeleton's kinematic structure, and it outperforms VPoser and HuMoR in motion denoising and GAN-S and VPoser in image-based 3D pose estimation.
The approach enables diverse interpolation and pose generation, offering significant benefits for applications in animation, AR/VR, and interactive body modeling.

Pose-NDF: Modeling Human Pose Manifolds with Neural Distance Fields

The paper presents Pose-NDF, a method for modeling human pose manifolds using Neural Distance Fields (NDFs). The result is a continuous representation of plausible human poses that can serve as a prior for generating realistic poses and for reconstructing accurate poses from noisy or incomplete data.

Core Concepts and Methodology

Pose-NDF is a human pose prior that uses a neural field to represent the manifold of plausible poses. Unlike previous VAE-based models, which map pose space to a Gaussian distribution, Pose-NDF operates directly in the high-dimensional domain $SO(3)^K$, where $K$ denotes the number of joints in the human body. Each joint rotation is represented as a quaternion, which provides a continuous parameter space, efficient distance computation, and gradient descent optimization within $SO(3)$.
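To make the idea of a distance between two quaternion-encoded poses concrete, here is a minimal sketch. The summary does not state the paper's exact metric, so the choice below (per-joint rotation angle, combined as a weighted root-sum-square) is an assumption for illustration, and the names `normalize_quats`, `joint_geodesic`, and `pose_distance` are hypothetical.

```python
import numpy as np

def normalize_quats(q):
    """Scale each quaternion (last axis of length 4) to unit norm."""
    q = np.asarray(q, dtype=float)
    return q / np.linalg.norm(q, axis=-1, keepdims=True)

def joint_geodesic(q1, q2):
    """Rotation angle in radians between matching unit quaternions.

    The absolute value handles the double cover of SO(3):
    q and -q encode the same rotation.
    """
    dot = np.abs(np.sum(q1 * q2, axis=-1))
    return 2.0 * np.arccos(np.clip(dot, 0.0, 1.0))

def pose_distance(pose_a, pose_b, weights=None):
    """Weighted root-sum-square of per-joint angles for (K, 4) poses.

    Assumed metric for illustration, not necessarily the paper's.
    """
    angles = joint_geodesic(normalize_quats(pose_a), normalize_quats(pose_b))
    w = np.ones_like(angles) if weights is None else np.asarray(weights, dtype=float)
    return float(np.sqrt(np.sum(w * angles ** 2)))

# Hypothetical K = 3 pose: every joint at the identity rotation (w, x, y, z).
rest_pose = np.tile([1.0, 0.0, 0.0, 0.0], (3, 1))

# Same pose, but joint 0 rotated by 90 degrees about the z axis.
bent_pose = rest_pose.copy()
bent_pose[0] = [np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4)]

distance = pose_distance(rest_pose, bent_pose)
```

In this example a single joint differs by a 90 degree rotation, so `distance` equals pi/2, and negating every quaternion in a pose leaves the distance unchanged.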

The model is structured hierarchically, encoding the human pose according to the skeleton's kinematic structure. The network predicts the distance from an arbitrary pose to the manifold of plausible poses, and because this function is differentiable with respect to the input pose, the pose can be projected onto the manifold by gradient descent on the predicted distance. Because the representation preserves distances between poses, it supports smoother transitions and more diverse results in interpolation and generation.
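The projection step can be sketched as follows. This is not the paper's implementation: `toy_distance` is a hand-written stand-in for the trained network, the gradient is taken by finite differences instead of automatic differentiation, and the update rule (step along the negative gradient scaled by the current distance, then renormalize each quaternion) is one common choice for distance fields that this summary does not confirm. All names and parameter values are hypothetical.

```python
import numpy as np

def toy_distance(pose):
    """Stand-in for the learned distance field (NOT the trained network).

    Toy manifold: every joint rotates about the z axis only, so the
    x and y quaternion components (columns 1 and 2) must vanish.
    """
    return float(np.sqrt(np.sum(pose[:, 1:3] ** 2)))

def numeric_grad(fn, pose, eps=1e-6):
    """Central finite differences; a real model would use autograd."""
    grad = np.zeros_like(pose)
    for idx in np.ndindex(*pose.shape):
        bump = np.zeros_like(pose)
        bump[idx] = eps
        grad[idx] = (fn(pose + bump) - fn(pose - bump)) / (2.0 * eps)
    return grad

def project_to_manifold(pose, dist_fn, step=0.5, iters=100, tol=1e-5):
    """Gradient descent on the distance, renormalizing after each step so
    every joint stays a unit quaternion (a point on the 3-sphere)."""
    q = pose / np.linalg.norm(pose, axis=-1, keepdims=True)
    for _ in range(iters):
        d = dist_fn(q)
        if d < tol:
            break
        # Move against the gradient, scaled by the current distance.
        q = q - step * d * numeric_grad(dist_fn, q)
        q = q / np.linalg.norm(q, axis=-1, keepdims=True)
    return q

rng = np.random.default_rng(0)
noisy_pose = rng.normal(size=(3, 4))  # K = 3 joints, arbitrary values
projected = project_to_manifold(noisy_pose, toy_distance)
```

The renormalization after every update is what keeps the iterate on the product of unit spheres; without it, plain gradient descent would drift off the set of valid rotations.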

Numerical Results and Experimental Observations

Pose-NDF outperforms state-of-the-art alternatives on several downstream tasks:

Motion Denoising & Recovery: Pose-NDF improves the cleanup of motion capture data affected by artifacts and occlusions, achieving the lowest error (an improvement of several centimeters) on data with both real and synthetic noise compared with competing methods such as VPoser and HuMoR.
3D Pose Estimation from Images: Used as a pose prior in optimization-based 3D pose estimation from images, Pose-NDF outperforms the previous leading methods GAN-S and VPoser by using the distance to the manifold to guide the optimization.
Pose Generation and Interpolation: The continuous manifold approach enables diverse pose generation and smooth interpolation between poses. Quantitative metrics such as Average Pairwise Distance (APD) indicate greater pose diversity without compromising realism, in contrast to GMM and VPoser, which can generate poses biased towards the mean configuration.
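As an illustration of the diversity metric mentioned above, here is a minimal sketch of Average Pairwise Distance. The summary does not say which distance or which pose representation the paper uses for APD, so plain Euclidean distance over flattened per-sample vectors is an assumption, and the function name is hypothetical.

```python
import numpy as np

def average_pairwise_distance(samples):
    """Mean Euclidean distance over all unordered pairs of samples.

    samples: (N, D) array with one flattened pose (or joint set) per row.
    The Euclidean metric is an assumption made for this sketch.
    """
    x = np.asarray(samples, dtype=float)
    n = x.shape[0]
    if n < 2:
        raise ValueError("need at least two samples")
    # (N, N) matrix of distances between every pair of rows.
    diff = x[:, None, :] - x[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Keep each unordered pair once (strict upper triangle).
    upper = np.triu_indices(n, k=1)
    return float(dist[upper].mean())

# Samples that all collapse to one configuration versus samples that spread out.
collapsed = np.zeros((4, 6))
spread = np.array([[0.0, 0.0], [3.0, 4.0], [0.0, 0.0]])

apd_collapsed = average_pairwise_distance(collapsed)
apd_spread = average_pairwise_distance(spread)
```

A generator whose samples all sit at the mean configuration scores an APD of zero, which is why a higher APD at comparable realism is read as greater diversity.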

Practical and Theoretical Implications

The paper contributes both to the theoretical modeling of human poses in high-dimensional space and to practical pose representation. Modeling the pose manifold directly in $SO(3)^K$ mitigates limitations of VAE models that rely on Gaussian distributions, such as a bias towards the mean pose and the inability to represent discontinuous regions in latent space.

In practice, this translates into improvements in motion capture cleaning, image-based pose estimation, and diverse pose generation, all of which matter for applications in animation, AR/VR, and interactive body modeling.

Future Research Directions

The scalability of Pose-NDF as an underlying framework for pose representation in broader contexts, such as dynamic scene understanding or interactive human-object dynamics, presents intriguing possibilities. Future research could explore expansions incorporating temporal dynamics or adaptive manifold representations to improve accuracy in real-time applications and interactions involving complex environments. Additionally, exploring integration with other networks and augmentation with additional data modalities represents a promising avenue to enhance robustness and diversity in human-centric visual computing.

In conclusion, Pose-NDF represents a significant advancement in the modeling of human poses, offering both theoretical insights and practical benefits. Its performance across various experimental contexts underscores its potential to serve as a cornerstone for future endeavors in human pose modeling and application-driven development within computer vision and graphics domains.
