Robust Motion In-betweening (2102.04942v1)

Published 9 Feb 2021 in cs.CV, cs.GR, and cs.LG

Abstract: In this work we present a novel, robust transition generation technique that can serve as a new tool for 3D animators, based on adversarial recurrent neural networks. The system synthesizes high-quality motions that use temporally-sparse keyframes as animation constraints. This is reminiscent of the job of in-betweening in traditional animation pipelines, in which an animator draws motion frames between provided keyframes. We first show that a state-of-the-art motion prediction model cannot be easily converted into a robust transition generator when only adding conditioning information about future keyframes. To solve this problem, we then propose two novel additive embedding modifiers that are applied at each timestep to latent representations encoded inside the network's architecture. One modifier is a time-to-arrival embedding that allows variations of the transition length with a single model. The other is a scheduled target noise vector that allows the system to be robust to target distortions and to sample different transitions given fixed keyframes. To qualitatively evaluate our method, we present a custom MotionBuilder plugin that uses our trained model to perform in-betweening in production scenarios. To quantitatively evaluate performance on transitions and generalizations to longer time horizons, we present well-defined in-betweening benchmarks on a subset of the widely used Human3.6M dataset and on LaFAN1, a novel high quality motion capture dataset that is more appropriate for transition generation. We are releasing this new dataset along with this work, with accompanying code for reproducing our baseline results.

Citations (219)

Summary

  • The paper presents a robust motion in-betweening method that uses a time-to-arrival embedding and scheduled target noise to overcome motion gaps and stalling artifacts.
  • It leverages adversarial recurrent neural networks to integrate varying transition lengths, enabling the synthesis of diverse and high-quality animations.
  • Quantitative evaluations on Human3.6M and LaFAN1 datasets confirm improved performance with smoother, more accurate transitions compared to traditional interpolation methods.

Overview of "Robust Motion In-betweening"

The paper "Robust Motion In-betweening" presents an innovative approach to transition generation in 3D animation by leveraging adversarial recurrent neural networks. The proposed system aims to synthesize high-quality motion animations using temporally-sparse keyframes as constraints, an advancement akin to traditional in-betweening in animation pipelines. This paper outlines the challenges involved in converting state-of-the-art motion prediction models into robust transition generators by merely adding future keyframe conditioning.

Key Methodological Contributions

The authors introduce two significant contributions to address the identified challenges in transition generation:

  1. Time-to-Arrival Embedding Modifier: This additive embedding gives the recurrent model a continuous, dense representation of the time remaining before the target keyframe. It lets a single network handle transitions of varying lengths while avoiding artifacts such as motion gaps or stalling near the target.
  2. Scheduled Target Noise: An additive noise vector, scaled down as the character approaches the keyframe, forces the generator to be robust to target distortions and enables sampling of different transitions for the same fixed keyframes. A sketch of both modifiers follows this list.
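
The summary contains no code, so the following is a minimal NumPy sketch of how these two modifiers could be implemented, assuming a transformer-style sinusoidal encoding for the time-to-arrival signal and a linear ramp-down schedule for the noise; the constants and names (`basis`, `t_zero`, `t_period`, the embedding size) are illustrative, not the authors' exact values:

```python
import numpy as np

def tta_embedding(tta, dim, basis=10000.0):
    """Sinusoidal time-to-arrival embedding: a transformer-style positional
    encoding indexed by the number of frames left before the target keyframe."""
    i = np.arange(dim // 2)
    freqs = 1.0 / basis ** (2.0 * i / dim)
    angles = tta * freqs
    emb = np.empty(dim)
    emb[0::2] = np.sin(angles)
    emb[1::2] = np.cos(angles)
    return emb

def noise_scale(tta, t_zero=5, t_period=30):
    """Linear ramp-down: full noise far from the target, zero noise within
    the last t_zero frames. Constants here are illustrative."""
    return float(np.clip((tta - t_zero) / (t_period - t_zero), 0.0, 1.0))

dim = 256                                   # assumed latent width
rng = np.random.default_rng(0)
z_noise = rng.normal(scale=0.5, size=dim)   # sampled once per transition
for tta in range(40, 0, -1):                # frames remaining to the keyframe
    z_tta = tta_embedding(tta, dim)         # added to latent codes each step
    z_target = noise_scale(tta) * z_noise   # added to the target embedding
```

In the paper's architecture both vectors are added to latent representations inside the recurrent generator at every timestep; the sketch only shows how the vectors themselves can be produced.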

Evaluation and Results

The paper conducts both qualitative and quantitative evaluations:

  • Qualitative Evaluation: A custom MotionBuilder plugin uses the trained model to perform in-betweening in production scenarios, showing that the system integrates into existing animation workflows and reduces manual authoring effort.
  • Quantitative Evaluation: The method is benchmarked on the Human3.6M and LaFAN1 datasets. The proposed modifiers markedly improve the quality of generated transitions, even when generalizing to longer time horizons, and metrics such as the L2 distances of global quaternions and global positions confirm the model's advantage over simpler interpolation baselines (a sketch of such metrics follows this list).
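
As a hedged illustration of how such L2 benchmark metrics are commonly computed (the shapes, names, and toy data below are assumptions, not the paper's evaluation code):

```python
import numpy as np

def l2_metric(pred, gt):
    """Mean per-frame L2 distance between prediction and ground truth.
    pred, gt: (frames, features) arrays holding flattened global joint
    quaternions (an L2Q-style metric) or global positions (L2P-style)."""
    return float(np.mean(np.linalg.norm(pred - gt, axis=-1)))

# Toy usage with assumed shapes: a 30-frame transition over 22 joints.
rng = np.random.default_rng(0)
gt = rng.normal(size=(30, 22 * 4))                 # global quaternions
pred = gt + rng.normal(scale=0.01, size=gt.shape)  # a close prediction
print(l2_metric(pred, gt))                         # lower is better
```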

Practical and Theoretical Implications

The implications of this research are twofold:

  • Practical Impact: Integrated with keyframe animation workflows, the transition generation tool can reduce the time and effort required to author complex character animations in the video game and film industries.
  • Theoretical Advancement: The paper provides a comprehensive analysis of recurrent architectures augmented with latent space modifiers, contributing to the broader understanding of temporally-aware and stochastic prediction models in motion synthesis.

Future Directions

The research points to several directions for future exploration:

  • Enhanced Motion Style Control: Extending the model to control motion style variations explicitly could broaden the system's utility across different animation contexts.
  • Handling Out-of-Domain Conditions: Developing mechanisms to better handle keyframe conditions outside the range experienced during training remains a critical challenge for future work.

In conclusion, the paper presents a robust architecture for motion in-betweening that combines adversarial training with novel latent-space modifiers. The work promises faster and more adaptive animation authoring built on neural network methods, and the released LaFAN1 dataset and accompanying benchmarks give the community a reproducible basis for future transition-generation research.