NRDF: Neural Riemannian Distance Fields for Learning Articulated Pose Priors (2403.03122v2)

Published 5 Mar 2024 in cs.CV

Abstract: Faithfully modeling the space of articulations is a crucial task that allows recovery and generation of realistic poses, and remains a notorious challenge. To this end, we introduce Neural Riemannian Distance Fields (NRDFs), data-driven priors modeling the space of plausible articulations, represented as the zero-level-set of a neural field in a high-dimensional product-quaternion space. To train NRDFs only on positive examples, we introduce a new sampling algorithm, ensuring that the geodesic distances follow a desired distribution, yielding a principled distance field learning paradigm. We then devise a projection algorithm to map any random pose onto the level-set by an adaptive-step Riemannian optimizer, adhering to the product manifold of joint rotations at all times. NRDFs can compute the Riemannian gradient via backpropagation and by mathematical analogy, are related to Riemannian flow matching, a recent generative model. We conduct a comprehensive evaluation of NRDF against other pose priors in various downstream tasks, i.e., pose generation, image-based pose estimation, and solving inverse kinematics, highlighting NRDF's superior performance. Besides humans, NRDF's versatility extends to hand and animal poses, as it can effectively represent any articulation.

Summary

The paper presents a neural framework leveraging Riemannian distance fields to model realistic human pose manifolds.
It introduces an adaptive-step Riemannian optimizer that efficiently projects arbitrary articulations onto the learned pose manifold.
It details a novel sampling method that controls the distribution of geodesic distances in the training data, so the distance field can be learned from positive examples only.

Neural Riemannian Distance Fields for Learning Articulated Pose Priors

Overview

Articulated pose estimation and generation remain challenging problems in computer vision and graphics, largely because the space of realistic human poses is high-dimensional and hard to model. This paper introduces Neural Riemannian Distance Fields (NRDFs), which model the manifold of plausible human articulations as the zero-level-set of a neural field defined in a high-dimensional product-quaternion space. NRDF is a data-driven prior for articulated shapes that supports a range of applications, including pose generation, inverse kinematics, and human pose estimation from images.

Methodology

The paper makes three main contributions: a principled framework for learning Neural Distance Fields (NDFs) on Riemannian manifolds, an adaptive-step Riemannian gradient descent algorithm for projecting onto the pose manifold, and a novel sampling method for generating training data. Together, these components form the foundation of NRDF and allow it to model the space of realistic human poses.

Neural Riemannian Distance Fields

NRDFs are learned by training a hierarchical network to predict the geodesic distance to the nearest realistic pose within a given dataset. This training process is underpinned by a new sampling method on Riemannian manifolds, which gives explicit control over the distance distribution of the training examples. Crucially, the network predicts distances in a high-dimensional product-quaternion space, which respects the geometric structure of joint rotations.
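To make the regression target concrete, the following is a minimal sketch of a geodesic distance in a product-quaternion space, assuming a pose is stored as a `(K, 4)` array of unit quaternions and that per-joint distances are combined with an unweighted product metric. The function names and the absence of per-joint weights are assumptions for illustration, not the paper's exact definition.

```python
import numpy as np

def quat_geodesic(q1, q2):
    """Per-joint geodesic distance between unit quaternions of shape (..., 4).

    The absolute value of the dot product identifies q and -q,
    which encode the same rotation.
    """
    dot = np.abs(np.sum(q1 * q2, axis=-1))
    return np.arccos(np.clip(dot, 0.0, 1.0))

def pose_distance(pose_a, pose_b):
    """Distance between two (K, 4) poses under an unweighted product metric (assumed)."""
    return np.sqrt(np.sum(quat_geodesic(pose_a, pose_b) ** 2))

def distance_to_dataset(pose, dataset):
    """Distance from a (K, 4) pose to its nearest neighbour in an (N, K, 4) dataset."""
    per_joint = quat_geodesic(dataset, pose[None])      # shape (N, K)
    return np.sqrt(np.sum(per_joint ** 2, axis=-1)).min()

# Two joints at the identity; the query rotates joint 0 by 90 degrees about z.
identity_pose = np.tile(np.array([1.0, 0.0, 0.0, 0.0]), (2, 1))
query = identity_pose.copy()
query[0] = [np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4)]
print(pose_distance(query, identity_pose))
```

In this sketch a brute-force nearest-neighbour search stands in for whatever lookup the authors use; a network trained on such targets would regress the value returned by `distance_to_dataset`.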

Adaptive-Step Riemannian Optimizer

To project arbitrary articulations onto the learned pose manifold, the paper introduces an adaptive-step Riemannian optimizer, RDFGrad. The algorithm uses the Riemannian structure of the pose space, so every iterate stays on the product manifold of joint rotations. According to the authors, this accelerates convergence and improves the fidelity of projected poses relative to previous methods.
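The projection idea can be sketched as Riemannian gradient descent on a product of unit-quaternion spheres. The sketch below is not the authors' implementation: it replaces the learned network with a toy distance field (the distance to a single `target` pose), computes the gradient analytically instead of by backpropagation, and assumes the step size is `alpha` times the current predicted distance. All function names are hypothetical.

```python
import numpy as np

def toy_distance(pose, target):
    """Toy stand-in for the learned field: product-metric distance to one target pose."""
    c = np.sum(pose * target, axis=-1)                   # per-joint dot products
    a = np.arccos(np.clip(np.abs(c), 0.0, 1.0))          # per-joint geodesic distances
    return np.sqrt(np.sum(a ** 2)), a, c

def riemannian_grad(pose, target, eps=1e-12):
    """Return the toy distance and its Riemannian gradient at `pose`."""
    f, a, c = toy_distance(pose, target)
    denom = np.sqrt(np.maximum(1.0 - c ** 2, eps))
    # Euclidean gradient of f with respect to each joint quaternion.
    g = ((a / max(f, eps)) * (-np.sign(c) / denom))[:, None] * target
    # Project onto the tangent space of each unit sphere at `pose`.
    g_tan = g - np.sum(g * pose, axis=-1, keepdims=True) * pose
    return f, g_tan

def exp_map(pose, v, eps=1e-12):
    """Exponential map on each unit sphere, followed by a safety renormalisation."""
    n = np.linalg.norm(v, axis=-1, keepdims=True)
    out = np.cos(n) * pose + np.sin(n) * v / np.maximum(n, eps)
    return out / np.linalg.norm(out, axis=-1, keepdims=True)

def project(pose, target, alpha=0.5, tol=1e-6, max_iters=100):
    """Move `pose` toward the zero-level-set with a distance-scaled (adaptive) step."""
    for _ in range(max_iters):
        f, g = riemannian_grad(pose, target)
        if f < tol:
            break
        pose = exp_map(pose, -alpha * f * g)             # step length is alpha * f
    return pose

rng = np.random.default_rng(0)
def random_pose(k):
    q = rng.normal(size=(k, 4))
    return q / np.linalg.norm(q, axis=-1, keepdims=True)

target = random_pose(3)
start = random_pose(3)
result = project(start, target)
print(toy_distance(start, target)[0], toy_distance(result, target)[0])
```

Because the update goes through the exponential map, each joint quaternion keeps unit norm at every iteration, and the step shrinks automatically as the predicted distance approaches zero.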

Sampling for Training Data

The paper also details a novel method for generating training data, which is crucial for learning the pose manifold. By controlling the distribution of distances in the generated training examples, the authors ensure that the learned NRDF captures detailed and well-defined articulations. This contrasts with previous heuristic sampling schemes, which often lead to poorly defined manifolds.
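One way to obtain a training example at a prescribed geodesic distance is to draw a random tangent direction at a clean pose, scale it to the desired length, and apply the exponential map. The sketch below assumes `(K, 4)` unit-quaternion poses, an unweighted product metric, and a half-normal distance distribution capped at `pi / 2`; these choices and the function names are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def sample_distance(rng, sigma=0.3):
    """Draw a target distance from a half-normal distribution (illustrative choice)."""
    return min(abs(rng.normal(0.0, sigma)), np.pi / 2)

def sample_noisy_pose(clean, d, rng, eps=1e-12):
    """Perturb a (K, 4) pose so that its geodesic distance to `clean` equals d.

    Requires 0 <= d <= pi / 2 so that no joint crosses the q ~ -q boundary.
    """
    v = rng.normal(size=clean.shape)
    # Remove the component along each quaternion to get a tangent vector.
    v -= np.sum(v * clean, axis=-1, keepdims=True) * clean
    # Scale so that the norm over all joints equals the requested distance.
    v *= d / np.linalg.norm(v)
    n = np.linalg.norm(v, axis=-1, keepdims=True)
    return np.cos(n) * clean + np.sin(n) * v / np.maximum(n, eps)

def pose_distance(pose_a, pose_b):
    """Unweighted product-metric geodesic distance between two (K, 4) poses."""
    dot = np.abs(np.sum(pose_a * pose_b, axis=-1))
    return np.sqrt(np.sum(np.arccos(np.clip(dot, 0.0, 1.0)) ** 2))

rng = np.random.default_rng(0)
clean = rng.normal(size=(5, 4))
clean /= np.linalg.norm(clean, axis=-1, keepdims=True)

d = sample_distance(rng)
noisy = sample_noisy_pose(clean, d, rng)
print(d, pose_distance(noisy, clean))
```

The requested distance is measured to the clean pose it was sampled from. With a full dataset, another pose may lie closer, so the training label would need to be recomputed against the nearest neighbour.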

Implications and Future Developments

The introduction of NRDFs has broad implications for the field of AI and computer vision. By providing a robust method for learning detailed models of human pose manifolds, this research opens new avenues for realistic pose generation, accurate inverse kinematics solutions, and improved human pose estimation from images. Moreover, the versatility of NRDFs extends beyond humans to modeling articulations of hands and animals, demonstrating their wide applicability.

Looking forward, the authors suggest several promising directions for further research. These include exploring noise injection during projection to enhance pose diversity, modeling manifold uncertainty, and extending the methodology to other complex articulated shapes. NRDFs thus offer a practical tool for further work on modeling and reproducing human motion.

Tweets

https://twitter.com/tolga_birdal/status/1766812143431692627

https://twitter.com/YannanHe/status/1766855255076078045

https://twitter.com/Mahdi_Zarei/status/1777064483804140018