- The paper presents a novel R2ET framework that uses dual neural modules to retarget motion with preserved semantic integrity and enhanced geometric adaptation.
- It employs distance-based losses built on a normalized Distance Matrix and two voxelized Distance Fields to reduce joint MSE significantly and prevent interpenetration.
- Experimental results on the Mixamo dataset demonstrate state-of-the-art performance and promising applications in animation, VR, and metaverse technologies.
An Analytical Review of "Skinned Motion Retargeting with Residual Perception of Motion Semantics and Geometry"
The paper presents a novel approach to motion retargeting: a Residual RETargeting network (R2ET) that uses neural networks to adapt a source character's motion to a target character's skeleton and shape. The innovation lies in a two-module design that addresses skeleton and geometry differences simultaneously, tackling the intrinsic challenge of preserving motion semantics while avoiding interpenetration and missed contacts.
Methodological Overview
R2ET is structured around two neural modification modules: a skeleton-aware module and a shape-aware module. These modules recognize and adjust for differences in skeleton configuration and character shape, respectively. The skeleton-aware module preserves the semantic integrity of the source motion, ensuring that nuanced actions, such as arm movements, translate accurately from one character to another. The shape-aware module, in turn, prevents physical artifacts like interpenetration by adapting the motion to the target's body proportions. A balancing gate linearly interpolates between the two module outputs, so the retargeted motion balances semantic consistency against geometric plausibility.
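The balancing gate described above can be sketched as a per-joint linear interpolation between the two module outputs. The sketch below is illustrative only: the array layout, function names, and the idea of a precomputed gate tensor are assumptions, not the paper's actual implementation (which operates on learned rotation representations and predicts the gate with a network).

```python
import numpy as np

def balance_gate(skeleton_out: np.ndarray,
                 shape_out: np.ndarray,
                 gate: np.ndarray) -> np.ndarray:
    """Blend skeleton-aware and shape-aware outputs per joint.

    skeleton_out, shape_out: (J, D) per-joint pose parameters
        (hypothetical layout, e.g. D = 4 for quaternions).
    gate: (J, 1) weights in [0, 1]; in R2ET these would be
        predicted by a small network, here they are given.
    """
    # gate = 0 keeps the semantics-preserving skeleton-aware output;
    # gate = 1 fully applies the geometry-correcting shape-aware output.
    return (1.0 - gate) * skeleton_out + gate * shape_out
```

With `gate` close to zero the result stays faithful to the source semantics; pushing it toward one trades semantic fidelity for geometric plausibility, which is exactly the trade-off the gate is meant to arbitrate.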
The system's distance-based loss functions are crucial to its efficacy, providing a structured way to model motion semantics and geometry. The methodology uses a normalized Distance Matrix (DM) to preserve joint-level semantics, and two voxelized Distance Fields, Repulsive and Attractive, to handle interpenetration and contact fidelity, respectively.
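To make the Distance Matrix idea concrete, here is a minimal sketch of a semantic loss built from normalized pairwise joint distances. The function names and the choice of max-normalization and MSE are assumptions for illustration; the paper's exact normalization and loss weighting may differ.

```python
import numpy as np

def distance_matrix(joints: np.ndarray) -> np.ndarray:
    """Normalized pairwise joint distances: (J, 3) positions -> (J, J) matrix."""
    diff = joints[:, None, :] - joints[None, :, :]
    dm = np.linalg.norm(diff, axis=-1)
    # Normalizing makes the matrix comparable across skeletons of
    # different overall size (a hypothetical choice of max-normalization).
    return dm / (dm.max() + 1e-8)

def semantic_loss(src_joints: np.ndarray, tgt_joints: np.ndarray) -> float:
    """MSE between the normalized DMs of the source and retargeted poses."""
    return float(np.mean(
        (distance_matrix(src_joints) - distance_matrix(tgt_joints)) ** 2
    ))
```

Because both matrices are normalized, a uniformly scaled copy of a pose incurs (near-)zero loss: the loss penalizes changes in the pose's internal spatial relationships, not differences in character size, which is the intuition behind using a DM for semantics preservation.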
Experimental Validation
Experimental results on the Mixamo dataset show that R2ET achieves state-of-the-art performance, surpassing existing frameworks. The numbers underscore its effectiveness: R2ET attains a significant reduction in the mean square error (MSE) of joint positions relative to other methods, notably outperforming them in preserving semantics and minimizing interpenetration. The modular design is validated through ablation studies, which demonstrate the contribution of each component and the robustness of the overall system.
Implications and Future Directions
The R2ET model marks a meaningful advance in motion retargeting, offering a blend of semantic preservation and geometric adaptation that existing models tend to overlook. Its implications for the animation and digital avatar industries are substantial, enabling more realistic, higher-fidelity character animation without the heavy computational load of post-processing.
Looking forward, the integration of these techniques with broader applications in virtual reality and metaverse technologies can potentially enhance user experience through more lifelike motion simulations. Additionally, exploring the extension of this framework to support more diversified character models, including non-humanoid entities, could further broaden the applicability of the research.
Conclusion
R2ET makes a significant methodological contribution to motion retargeting by effectively balancing the dual objectives of semantic preservation and geometric plausibility. By overcoming the traditional pitfalls of motion distortion and interpenetration, the approach sets a new benchmark for both the theory and the practical application of motion retargeting in AI-driven animation systems. Future work may build on this foundation, enhancing adaptability and extending functionality across diverse applications and media.