Skeleton-Aware Networks for Deep Motion Retargeting (2005.05732v1)

Published 12 May 2020 in cs.CV, cs.GR, and cs.LG

Abstract: We introduce a novel deep learning framework for data-driven motion retargeting between skeletons, which may have different structure, yet corresponding to homeomorphic graphs. Importantly, our approach learns how to retarget without requiring any explicit pairing between the motions in the training set. We leverage the fact that different homeomorphic skeletons may be reduced to a common primal skeleton by a sequence of edge merging operations, which we refer to as skeletal pooling. Thus, our main technical contribution is the introduction of novel differentiable convolution, pooling, and unpooling operators. These operators are skeleton-aware, meaning that they explicitly account for the skeleton's hierarchical structure and joint adjacency, and together they serve to transform the original motion into a collection of deep temporal features associated with the joints of the primal skeleton. In other words, our operators form the building blocks of a new deep motion processing framework that embeds the motion into a common latent space, shared by a collection of homeomorphic skeletons. Thus, retargeting can be achieved simply by encoding to, and decoding from this latent space. Our experiments show the effectiveness of our framework for motion retargeting, as well as motion processing in general, compared to existing approaches. Our approach is also quantitatively evaluated on a synthetic dataset that contains pairs of motions applied to different skeletons. To the best of our knowledge, our method is the first to perform retargeting between skeletons with differently sampled kinematic chains, without any paired examples.

Summary

The paper introduces a framework that reduces diverse skeletons into a common primal skeleton through skeletal pooling.
It employs novel skeleton-aware convolution, pooling, and unpooling operators to extract deep temporal features from joint hierarchies.
Experiments, including a quantitative evaluation on a synthetic dataset of paired motions, show the method outperforming existing approaches in producing accurate, visually plausible retargeted motion.

Skeleton-Aware Networks for Deep Motion Retargeting

The paper "Skeleton-Aware Networks for Deep Motion Retargeting" introduces a deep learning framework for data-driven motion retargeting between skeletons with different structures. It addresses a basic problem in working with motion capture data: when the source and target skeletons differ in structure, a retargeting method has to account for those differences explicitly.

Key Contributions

Skeletal Pooling and Common Primal Skeleton: The authors introduce the concept of skeletal pooling, which involves reducing different homeomorphic skeletons into a common primal skeleton. This is achieved through a series of edge merging operations, allowing for a shared latent space across skeletons.
Differentiable Operators: The paper presents new skeleton-aware convolution, pooling, and unpooling operators that account for the hierarchical structure and joint adjacency of skeletons. These differentiable operators transform motion into deep temporal features associated with the joints of the primal skeleton.
Motion Processing Framework: The operators are combined into a framework that embeds motion into a latent space shared by a collection of homeomorphic skeletons, so retargeting amounts to encoding a motion from the source skeleton into this space and decoding it for the target skeleton.
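The edge-merging idea behind skeletal pooling can be illustrated with a minimal sketch. It is not the authors' implementation: the function name skeletal_pool, the (parent, child) edge representation, the choice to merge consecutive pairs of edges along each chain of degree-2 joints, and the use of averaging to combine features are all assumptions made for illustration.

```python
import numpy as np

def skeletal_pool(edges, feats):
    """Merge consecutive edge pairs along each kinematic chain (illustrative).

    edges: list of (parent_joint, child_joint) tuples forming a tree.
    feats: array of shape (num_edges, channels), one feature row per edge.
    Returns the pooled edge list and the averaged features per pooled edge.
    """
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)
        children.setdefault(child, [])
    index = {edge: i for i, edge in enumerate(edges)}
    child_joints = {c for _, c in edges}

    new_edges, groups = [], []
    for joint in sorted(children):
        is_root = joint not in child_joints
        if not is_root and len(children[joint]) == 1:
            continue  # interior joint of a chain; reached from the chain start
        for child in children[joint]:
            # Follow the chain until a branching joint or an end effector.
            chain, tip = [(joint, child)], child
            while len(children[tip]) == 1:
                nxt = children[tip][0]
                chain.append((tip, nxt))
                tip = nxt
            # Assumption: merge edges two at a time; an odd edge is kept as is.
            for i in range(0, len(chain), 2):
                group = chain[i:i + 2]
                new_edges.append((group[0][0], group[-1][1]))
                groups.append([index[e] for e in group])

    pooled = np.stack([feats[g].mean(axis=0) for g in groups])
    return new_edges, pooled

# Hypothetical toy skeleton: a two-edge spine, two two-edge arms, a three-edge leg.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (2, 5), (5, 6), (0, 7), (7, 8), (8, 9)]
feats = np.arange(9, dtype=float).reshape(9, 1)

pooled_edges, pooled = skeletal_pool(edges, feats)
print(pooled_edges)  # -> [(0, 2), (0, 8), (8, 9), (2, 4), (2, 6)]

# Pooling again collapses the remaining two-edge chain in the leg.
primal_edges, primal = skeletal_pool(pooled_edges, pooled)
print(primal_edges)  # -> [(0, 2), (0, 9), (2, 4), (2, 6)]
```

In this toy case, a skeleton whose leg had two or four edges instead of three would reach the same four-edge topology after repeated pooling, which is the sense in which differently sampled chains can share one primal skeleton.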

Experimental Results

The authors report experiments showing the framework's effectiveness for motion retargeting, and for motion processing more generally, compared to existing approaches. Training requires no explicit pairing between motions. For quantitative evaluation, however, the method is tested on a synthetic dataset containing pairs of motions applied to different skeletons, which supplies reference motions against which the retargeted output can be measured. The numerical evaluation shows that the approach outperforms existing methods in delivering accurate and visually plausible motion sequences.
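To make the role of the paired synthetic data concrete, the sketch below scores a retargeted motion against its paired reference. The metric is an assumption: this summary does not state which error measure the paper uses, so a plain mean squared error over joint positions is shown, and the function name retargeting_error and the (frames, joints, 3) array layout are hypothetical.

```python
import numpy as np

def retargeting_error(predicted, reference):
    """Mean squared distance between matching joints (assumed metric).

    predicted, reference: arrays of shape (frames, joints, 3) holding joint
    positions of the retargeted motion and of the paired reference motion.
    """
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    if predicted.shape != reference.shape:
        raise ValueError("motions must have the same frames and joints")
    # Squared Euclidean distance per joint per frame, averaged over both.
    return float(np.mean(np.sum((predicted - reference) ** 2, axis=-1)))

# Hypothetical data: a reference motion and an output offset by 1 on every axis.
reference = np.zeros((2, 3, 3))
predicted = reference + 1.0
error = retargeting_error(predicted, reference)
print(error)  # -> 3.0
```

A score like this can only be computed when a reference motion exists for the target skeleton, which is why paired data is needed for evaluation even though it is not needed for training.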

Implications and Future Directions

The implications of this work are substantial in the field of motion capture and animation. The ability to retarget motions across differently structured skeletons without manual intervention or paired data paves the way for more versatile animation techniques and cross-platform applications in virtual environments. Theoretically, this framework could be expanded to accommodate non-homeomorphic structures by identifying more complex reduction patterns or integrating alternative pooling strategies.

Future developments in AI might focus on further refining these differentiable operators to enhance efficiency or exploring the potential of this framework in real-time applications. Additionally, scaling the model to encompass a broader range of motion capture scenarios, including non-human or hybrid skeletons, remains an exciting research avenue.

Overall, the paper presents a framework for retargeting motion between homeomorphic skeletons without paired training data, and its skeleton-aware operators provide building blocks for further work on motion processing and skeleton-independent motion representation.
