DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills
In the field of character animation, DeepMimic proposes a notable methodology that marries the precision and style of motion capture data with the adaptability and robustness of physics-based simulation. The paper, authored by Peng et al., uses deep reinforcement learning (RL) to train control policies that imitate example motion clips while satisfying additional task objectives, and that remain usable when the character's morphology or environment changes.
Methodological Insights
At its core, DeepMimic bridges the gap between data-driven motion specification and physically simulated character execution. It achieves this by adapting deep RL techniques to produce control policies capable of robustly imitating diverse example motions—including dynamic actions like flips and spins—while also responding intelligently to environmental perturbations.
The approach optimizes a dual-objective reward: a motion-imitation term and a task-specific term, combined as a weighted sum. This dual framework enables simulated characters to maintain stylistic fidelity to motion capture data while pursuing broader goals, such as direction-oriented locomotion or object interaction tasks.
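To make the dual objective concrete, the sketch below shows the weighted reward combination in Python. The imitation-term weights and error scales follow values reported in the paper, while the 0.7/0.3 imitation/task split and the precomputed scalar error inputs are illustrative assumptions.

```python
import numpy as np

def imitation_reward(pose_err, vel_err, end_eff_err, com_err):
    # Exponentiated tracking errors against the reference clip; the weights
    # and scales follow the paper's reported defaults for the pose, velocity,
    # end-effector, and center-of-mass terms.
    return (0.65 * np.exp(-2.0 * pose_err)
            + 0.10 * np.exp(-0.1 * vel_err)
            + 0.15 * np.exp(-40.0 * end_eff_err)
            + 0.10 * np.exp(-10.0 * com_err))

def total_reward(r_imitate, r_task, w_imitate=0.7, w_task=0.3):
    # Dual objective: a weighted sum of the imitation and task rewards.
    # The split between the two terms is task-dependent; 0.7/0.3 here is
    # illustrative, not a value fixed by the paper for every task.
    return w_imitate * r_imitate + w_task * r_task
```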
Key components of the methodology include:
- State and Action Representation: Character states capture link positions, rotations, and velocities in a coordinate frame local to the root, plus a phase variable tracking progress through the reference clip; actions specify target orientations for proportional-derivative (PD) controllers at each joint (a minimal PD sketch follows this list).
- Multiple Clip Integration: Techniques to incorporate and transition between multiple motion clips, enhancing the breadth of skills learned by the agent (see the composite-reward sketch below).
- Reinforcement Learning Framework: Training with proximal policy optimization (PPO), supplemented by reference state initialization (RSI) and early termination (ET) to improve exploration and learning efficiency (see the episode-loop sketch below).
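As noted above, actions are target joint orientations tracked by PD controllers. The sketch below treats each joint as a single degree of freedom for simplicity; the paper applies stable PD control to spherical joints with quaternion targets, so take this as a schematic rather than the exact formulation.

```python
import numpy as np

def pd_torque(q_target, q, q_dot, kp, kd):
    # Per-joint PD control: the policy emits target orientations q_target at
    # a low rate (around 30 Hz in the paper), while this torque computation
    # runs at the much higher simulation frequency.
    return kp * (q_target - q) - kd * q_dot
```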
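For multi-clip integration, one mechanism the paper describes is a composite objective that scores the character against every reference clip and keeps the best match (a skill-selector input to the policy is another). A one-line sketch:

```python
def multi_clip_imitation_reward(clip_rewards):
    # Composite multi-clip objective: reward the best-matching reference clip
    # at each step, letting the character transition freely between motions.
    return max(clip_rewards)
```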
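Reference state initialization and early termination can be sketched as a single episode loop. The helpers `env.reset_to_reference` and `env.root_height`, along with the specific thresholds, are hypothetical names for illustration, not APIs from the paper's codebase.

```python
import random

def run_episode(env, policy, clip_length, fall_height=0.3, max_steps=600):
    # Reference State Initialization (RSI): start each episode at a random
    # phase of the reference clip, so difficult late-phase states (e.g.,
    # mid-backflip) are visited from the start of training.
    t0 = random.uniform(0.0, clip_length)
    state = env.reset_to_reference(t0)  # hypothetical helper
    episode_return = 0.0
    for _ in range(max_steps):
        state, reward, done = env.step(policy(state))
        episode_return += reward
        # Early Termination (ET): end the episode when the character falls,
        # so unrecoverable states stop accumulating reward.
        if done or env.root_height() < fall_height:  # hypothetical helper
            break
    return episode_return
```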
Experimental Validation
The research demonstrates the framework's effectiveness through a variety of simulated characters, including humanoids and non-humanoid figures such as a bipedal dinosaur and a dragon. The characters exhibit a wide range of skills—from locomotion and martial arts to complex acrobatics—showcasing the framework's versatility.
Notable results include:
- Complex Motion Learning: Successful training of characters on highly dynamic skills such as cartwheels and flips, with robustness to external perturbations.
- Task Integration: Achievement of task-oriented objectives such as striking targets and directional walking while maintaining motion quality.
- Environment and Character Retargeting: Adaptation of motion capture data to different characters and environmental settings, indicating strong generalization capabilities.
Implications and Future Directions
DeepMimic's integration of motion data with physics-based character control has implications for broader applications in animation, gaming, and robotics. By offering a method that combines realism with adaptability, it presents a path forward for developing more lifelike and responsive virtual characters.
Moving ahead, areas of exploration include:
- Scalability to Larger Motion Libraries: Techniques for handling extensive motion datasets while maintaining computational efficiency.
- Richer Environmental Interactions: Expanding the framework to simulate more complex interactions in diverse environments.
- Deployment in Physical Systems: Adapting learned policies for real-world robotic systems to handle dynamic tasks and unpredictable environments.
Peng et al.'s work marks a significant step toward more integrated and versatile character animation systems, and it suggests many avenues for development in both virtual and physical domains.