Neural Riemannian Motion Fields (NRMF)

Updated 18 September 2025

Neural Riemannian Motion Fields are a framework that models human motion as trajectories on Riemannian manifolds, integrating joint rotations, velocities, and accelerations.
They employ neural distance fields and an adaptive-step hybrid projection algorithm to ensure temporal coherence and physical plausibility in motion recovery and synthesis.
NRMF outperforms prior motion priors on denoising, in-betweening, and partial fitting tasks on benchmarks such as AMASS and 3DPW, making it relevant to graphics and robotics applications.

Neural Riemannian Motion Fields (NRMF) are a class of generative models and priors for human motion that combine neural networks with the mathematical structure of Riemannian geometry on articulated joint spaces. Unlike conventional approaches that treat motion as sequences of poses or leverage variational autoencoders (VAEs) or diffusion models, NRMF explicitly models motion as a trajectory on the product manifold of joint rotations, angular velocities, and accelerations. This approach rigorously enforces physical plausibility, temporal coherence, and respect for the manifold structure of articulated bodies via neural distance fields (NDFs) and geometry-aware optimization and integration techniques (Yu et al., 11 Sep 2025). NRMF introduces an adaptive-step hybrid projection algorithm and a geometric integrator to project or "roll out" plausible motion trajectories, yielding significant advancements in motion recovery, denoising, in-betweening, and fitting tasks.

1. Mathematical Foundations and Motion Representation

NRMF models human motion on the product space of joint rotations, velocities, and accelerations,

$X = (t, R, \omega, \alpha)$

where $t \in \mathbb{R}^3$ is global translation, $R \in SO(3)^{N_J}$ encodes joint orientations, $\omega \in so(3)^{N_J}$ are angular velocities, and $\alpha \in \mathbb{R}^{3 \times N_J}$ are angular accelerations. Motion is represented as a curve or trajectory on the manifold $\mathcal{M}$ formed by the product of these spaces.
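As a rough illustration of this state, the sketch below stores $X = (t, R, \omega, \alpha)$ in a small container and checks that every joint rotation is a proper rotation matrix. The class name `MotionState`, the `(N_J, 3)` array layout, and the joint count of 22 are assumptions made for this example, not details taken from the paper.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class MotionState:
    """Hypothetical container for one motion state X = (t, R, omega, alpha)."""
    t: np.ndarray      # (3,) global translation
    R: np.ndarray      # (N_J, 3, 3) joint rotations, each in SO(3)
    omega: np.ndarray  # (N_J, 3) angular velocities, so(3) stored as 3-vectors
    alpha: np.ndarray  # (N_J, 3) angular accelerations

    def is_valid(self, tol=1e-6):
        """Check array shapes and that each R_j is orthogonal with determinant +1."""
        if self.R.ndim != 3 or self.R.shape[1:] != (3, 3):
            return False
        n = self.R.shape[0]
        if self.t.shape != (3,) or self.omega.shape != (n, 3) or self.alpha.shape != (n, 3):
            return False
        gram = np.einsum("nij,nik->njk", self.R, self.R)  # R_j^T R_j for every joint
        orthogonal = np.allclose(gram, np.eye(3), atol=tol)
        proper = np.allclose(np.linalg.det(self.R), 1.0, atol=tol)
        return bool(orthogonal and proper)


n_joints = 22  # arbitrary joint count for the demo
rest = MotionState(
    t=np.zeros(3),
    R=np.tile(np.eye(3), (n_joints, 1, 1)),
    omega=np.zeros((n_joints, 3)),
    alpha=np.zeros((n_joints, 3)),
)
print(rest.is_valid())
```

Storing velocities and accelerations as 3-vectors per joint is only a layout choice; the skew-symmetric form $[\omega]_\times$ can be rebuilt whenever an exponential map is needed.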

The core technical device is a collection of neural distance fields,

$f_\Gamma(X) = \left[f_\theta(X), f_{\dot{\theta}}(X), f_{\ddot{\theta}}(X)\right],$

which estimate the unsigned geodesic distances from any state $X$ to the set of plausible motions represented in the training corpus. The zero level set

$\mathcal{S} = \{ X \in \mathcal{M} \mid f_\Gamma(X) = 0 \}$

acts as an implicit manifold of physically plausible motions.

Distances in $SO(3)$ are computed as

$d_{SO}(R, R') = \cos^{-1}\left(\frac{\mathrm{Tr}(R^\top R') - 1}{2}\right),$

and for an articulated body,

$d_{SO^{N_J}}(R, R') = \| (d_{SO}(R_1, R'_1), d_{SO}(R_2, R'_2), \ldots, d_{SO}(R_{N_J}, R'_{N_J})) \|_p,$

with $p=1$ typically selected.
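The two distance formulas above can be sketched in a few lines of NumPy. The helper names `rot_z`, `so3_distance`, and `body_distance` are hypothetical, and the clipping step is a numerical safeguard added here rather than part of the definition.

```python
import numpy as np


def rot_z(angle):
    """Rotation by `angle` radians about the z-axis (demo helper)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def so3_distance(R1, R2):
    """Geodesic distance on SO(3): the rotation angle of R1^T R2."""
    cos_angle = (np.trace(R1.T @ R2) - 1.0) / 2.0
    # Clipping guards against round-off pushing the value outside [-1, 1].
    return float(np.arccos(np.clip(cos_angle, -1.0, 1.0)))


def body_distance(pose_a, pose_b, p=1):
    """p-norm of the per-joint geodesic distances for an articulated body."""
    per_joint = np.array([so3_distance(a, b) for a, b in zip(pose_a, pose_b)])
    return float(np.linalg.norm(per_joint, ord=p))


pose_a = [rot_z(0.0), rot_z(0.1)]
pose_b = [rot_z(0.3), rot_z(0.5)]
d = body_distance(pose_a, pose_b)  # per-joint angles 0.3 and 0.4, summed for p = 1
print(d)
```

With $p=1$ the body distance is simply the sum of per-joint rotation angles, so each joint contributes independently.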

This structure lets motion be defined and manipulated in geometric terms: the rotation component captures poses, the velocity component captures transitions between them, and the acceleration component captures smoothness.

2. Neural Distance Field (NDF) Construction and Training

NRMF employs three distinct neural distance fields:

Pose Field (0th order): Encodes plausibility of static body postures.
Transition Field (1st order): Measures plausibility of pose transitions (angular velocities).
Acceleration Field (2nd order): Enforces physically plausible accelerations and smoothness.

Each field is trained to estimate the geodesic distance to the closest point in the training set, minimizing the mismatch via a nearest-neighbor matching loss computed on the product Riemannian manifold. The NDFs are implemented as neural networks and are constructed on the product space of joint rotations and their derivatives.
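To make the nearest-neighbor idea concrete, the sketch below computes the kind of regression target such a loss needs for the pose field: the distance from a query pose to its closest corpus pose. It uses a brute-force search over a two-pose toy corpus; the function names and the search strategy are assumptions for illustration, and the paper's actual sampling and matching procedure may differ.

```python
import numpy as np


def rot_z(angle):
    """Rotation by `angle` radians about the z-axis (demo helper)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def pose_distance(pose_a, pose_b, p=1):
    """p-norm of per-joint SO(3) geodesic distances; poses have shape (N_J, 3, 3)."""
    traces = np.einsum("jik,jik->j", pose_a, pose_b)  # Tr(A_j^T B_j) per joint
    angles = np.arccos(np.clip((traces - 1.0) / 2.0, -1.0, 1.0))
    return float(np.linalg.norm(angles, ord=p))


def nearest_neighbor_target(query, corpus):
    """Return (distance, index) of the corpus pose closest to `query`."""
    dists = np.array([pose_distance(query, pose) for pose in corpus])
    idx = int(np.argmin(dists))
    return float(dists[idx]), idx


# Toy corpus of two 2-joint poses, plus a query that lies near the second one.
corpus = np.stack([np.stack([rot_z(a), rot_z(b)]) for a, b in [(0.0, 0.0), (1.0, 1.0)]])
query = np.stack([rot_z(0.9), rot_z(1.2)])
target, idx = nearest_neighbor_target(query, corpus)
print(target, idx)
```

A network trained against targets of this kind approximates the distance to the data manifold; for large corpora an approximate nearest-neighbor index would replace the brute-force loop.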

The architecture respects the geometry of rotations, with gradients computed using the Riemannian structure and projected onto the tangent space. For rotations,

$\mathrm{grad}\, f(R) = R\,\mathrm{skew}(R^\top\nabla f(R)) = \frac{1}{2} R\left(R^\top\nabla f(R) - \nabla f(R)^\top R\right),$

where $\nabla f(R)$ is the ambient (Euclidean) gradient. The unsigned geodesic distances guide projection during optimization.
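The tangent-space projection for rotations can be checked numerically. In the sketch below the ambient gradient `G` is an arbitrary matrix standing in for what automatic differentiation would return, and the function names are hypothetical.

```python
import numpy as np


def skew_part(A):
    """Skew-symmetric part of a square matrix."""
    return 0.5 * (A - A.T)


def riemannian_grad(R, euclid_grad):
    """Project an ambient 3x3 gradient onto the tangent space of SO(3) at R."""
    return R @ skew_part(R.T @ euclid_grad)


# A 90-degree rotation about z and an arbitrary ambient gradient (illustration only).
R = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
G = np.arange(9.0).reshape(3, 3)

xi = riemannian_grad(R, G)
Omega = R.T @ xi  # tangent vectors at R have the form R @ Omega with Omega skew-symmetric
print(np.allclose(Omega, -Omega.T))
```

Applying the projection a second time leaves `xi` unchanged, which is a quick sanity check that the result already lies in the tangent space.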

3. Adaptive-Step Hybrid Projection Algorithm

To recover plausible motion from potentially noisy or partially observed input, NRMF introduces a three-stage projection algorithm, sequentially minimizing the corresponding NDF losses using Riemannian gradient descent:

Stage I: Optimizes the pose NDF.
Stage II: Refines the transition (velocity) NDF.
Stage III: Corrects using the acceleration NDF.

Update steps involve geometry-aware operations: the exponential map on the product manifold is used for rotations, angular velocities are updated via Euler integration, and adaptive step sizes are tuned for stability,

$R^{(t+1)} \leftarrow \mathrm{Exp}_{R^{(t)}}\left(-\alpha_\theta \frac{\mathrm{grad}\,f_\theta(R^{(t)})}{\|\mathrm{grad}\,f_\theta(R^{(t)})\|}\right)$

and

$\omega_{t} \leftarrow \omega_{t-1} + \lambda_t\, \alpha_{t-1}$

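where $\alpha_\theta$ is the pose-stage step size (distinct from the angular acceleration $\alpha$ of Section 1), $\alpha_{t-1}$ is the angular acceleration, and $\lambda_t$ is the scalar step applied to it. The notation $[\omega]_\times$, used in the next section, denotes the skew-symmetric matrix associated with an angular velocity $\omega$.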

This hybrid approach enables effective projection not only in pose space but also for dynamic aspects and smooth trajectory evolution.
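A minimal sketch of the pose-stage update for a single joint follows. It makes several assumptions: the learned pose field is replaced by a toy field (geodesic distance to one target rotation), the descent direction is written in closed form rather than obtained by backpropagation, and the step rule `min(max_step, dist)` is just one simple way to adapt the step. All names are hypothetical.

```python
import numpy as np


def hat(w):
    """Skew-symmetric matrix [w]_x of a 3-vector w."""
    x, y, z = w
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])


def so3_exp(w):
    """Rodrigues' formula: matrix exponential of [w]_x."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3) + hat(w)
    K = hat(w / theta)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)


def so3_log(R):
    """Rotation vector of R (valid for rotation angles below pi)."""
    theta = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    if theta < 1e-12:
        return np.zeros(3)
    W = (R - R.T) * (theta / (2.0 * np.sin(theta)))
    return np.array([W[2, 1], W[0, 2], W[1, 0]])


def toy_field(R, R_target):
    """Stand-in for a learned pose field: geodesic distance to one target pose."""
    return float(np.linalg.norm(so3_log(R.T @ R_target)))


def project(R, R_target, max_step=0.2, tol=1e-6, max_iters=100):
    """Normalized Riemannian gradient descent on the toy field."""
    for _ in range(max_iters):
        dist = toy_field(R, R_target)
        if dist < tol:
            break
        # Unit descent direction in the body frame, i.e. the negative
        # normalized Riemannian gradient of the toy field.
        direction = so3_log(R.T @ R_target) / dist
        step = min(max_step, dist)       # adaptive: never overshoot the zero level set
        R = R @ so3_exp(step * direction)  # Exp_R of the tangent step
    return R


R_target = so3_exp(np.array([0.3, -0.2, 0.9]))
R_proj = project(np.eye(3), R_target)
print(toy_field(R_proj, R_target))
```

Because every update goes through the exponential map, the iterate stays on $SO(3)$ without any re-orthogonalization.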

4. Geometric Integrator for Trajectory Generation

NRMF deploys a geometric integrator designed to maintain temporal coherence and correct drift during motion synthesis. The integrator advances motion states sequentially:

Updates velocities and accelerations using projected Euler integration.
Uses the exponential map $\mathrm{Exp}_{R_t}\left(\alpha_t [\omega_t]_\times\right)$ to update joint rotations over time, with $\alpha_t$ acting here as a scalar step size, not the angular acceleration.
Repeatedly applies the projection algorithm to keep the motion sequence within the plausible manifold defined by the NDF zero level sets.

Via these geometric steps, the integrator enforces smoothness, physical plausibility, and consistency of the generated motion across frames.
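The integration steps listed above can be sketched for a single joint as follows. This is a simplified rollout under stated assumptions: the step size `dt` is fixed, angular velocity is treated as a body-frame quantity, and the repeated NDF projection is omitted because it requires a trained field. The function names are hypothetical.

```python
import numpy as np


def hat(w):
    """Skew-symmetric matrix [w]_x of a 3-vector w."""
    x, y, z = w
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])


def so3_exp(w):
    """Rodrigues' formula: matrix exponential of [w]_x."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3) + hat(w)
    K = hat(w / theta)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)


def rollout(R0, omega0, accels, dt):
    """Euler step on angular velocity, exponential-map step on rotation."""
    R = np.asarray(R0, dtype=float)
    omega = np.asarray(omega0, dtype=float)
    trajectory = [R]
    for alpha in accels:
        omega = omega + dt * np.asarray(alpha, dtype=float)  # velocity update
        R = R @ so3_exp(dt * omega)                          # rotation update via Exp
        # A full integrator would project (R, omega, alpha) back onto the
        # NDF zero level sets here; that step is omitted in this sketch.
        trajectory.append(R)
    return trajectory, omega


# Constant angular acceleration about z, starting from rest.
accels = [np.array([0.0, 0.0, 1.0])] * 10
trajectory, omega_final = rollout(np.eye(3), np.zeros(3), accels, dt=0.1)
final_angle = np.arccos(np.clip((np.trace(trajectory[-1]) - 1.0) / 2.0, -1.0, 1.0))
print(final_angle, omega_final)
```

Updating rotations through the exponential map keeps each frame an exact rotation, so drift shows up only as deviation from the plausible set, which is what the projection step is meant to correct.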

5. Experimental Validation and Performance

NRMF reports strong results across several benchmarks and tasks:

AMASS dataset training: Yields reduced mean per-joint position error (MPJPE), decreased acceleration error, and improved smoothness metrics relative to HuMoR, NRDF, and PhaseMP baselines.
Motion denoising: Effectively restores plausible trajectories from noisy input.
Motion in-betweening: Generates realistic transitions when given sparse keyframes.
Partial 2D/3D fitting: Recovers plausible motion sequences from incomplete observations, including RGB and RGB-D input.
Generalization: NRMF generalizes robustly to in-the-wild datasets (e.g., 3DPW, i3DB, PROX, EgoBody); recovered motions show higher temporal consistency, plausibility, and resilience to occlusion and observation noise.

Quantitative metrics consistently favor NRMF, including lower FID for pose/motion, better contact consistency, and reduced motion propagation error. Spherical KDE analysis of generated joint distributions reveals close adherence to the AMASS training set.
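For reference, two of the metrics named above can be sketched as follows, using their common definitions in the human motion literature. The array layout `(frames, joints, 3)` and the unscaled finite-difference form of the acceleration error are assumptions; the exact evaluation protocol (alignment, units, frame rate) in the paper may differ.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error for arrays of shape (frames, joints, 3)."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


def accel_error(pred, gt):
    """Mean norm of the difference between second finite differences over time."""
    acc_pred = pred[2:] - 2.0 * pred[1:-1] + pred[:-2]
    acc_gt = gt[2:] - 2.0 * gt[1:-1] + gt[:-2]
    return float(np.linalg.norm(acc_pred - acc_gt, axis=-1).mean())


# Toy data: the prediction is the ground truth shifted by a constant offset.
gt = np.zeros((4, 2, 3))
pred = gt + np.array([3.0, 4.0, 0.0])
print(mpjpe(pred, gt), accel_error(pred, gt))
```

A constant offset raises MPJPE but leaves the acceleration error at zero, which is why the two metrics are reported together: one measures positional accuracy, the other temporal smoothness.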

6. Applications and Broader Implications

NRMF enables robust human motion recovery in real-world conditions and has applications in:

Motion capture and reconstruction systems in graphics, film, avatar animation, and gaming.
Pose refinement for mesh recovery pipelines (enhancing SMPLer-X, VPoser, etc.).
Generative motion modeling for interpolation, synthesis, and sports analysis.
Robotics and clinical motion diagnostics, where enforcing physical constraints and plausible dynamics is vital.

The explicit use of Riemannian product manifolds for joint space models opens future directions in robot imitation learning, clinical diagnostics, and other fields requiring fidelity to articulated dynamics.

7. Relationship to Broader Riemannian and Neural Motion Field Research

NRMF's design is directly related to foundational work on Riemannian Motion Policies (RMPs) (Ratliff et al., 2018), which introduced modular second-order policies coupled with Riemannian metrics for controller fusion. However, NRMF advances these concepts by learning the implicit manifold of plausible motions with neural networks and by handling temporal dynamics at multiple orders. The hybrid projection and integration mechanisms represent a geometric extension of previous motion field architectures, rigorously enforcing manifold structure and dynamic consistency.

In broader context, NRMF can be viewed as a geometric, learning-based framework that bridges neural motion priors with Riemannian geometry and articulated kinematic constraints, offering a modular and theoretically principled toolkit for high-fidelity motion inference and synthesis.


References

Geometric Neural Distance Fields for Learning Human Motion Priors (2025)

Riemannian Motion Policies (2018)