Learning human behaviors from motion capture by adversarial imitation (1707.02201v2)

Published 7 Jul 2017 in cs.RO, cs.LG, and cs.SY

Abstract: Rapid progress in deep reinforcement learning has made it increasingly feasible to train controllers for high-dimensional humanoid bodies. However, methods that use pure reinforcement learning with simple reward functions tend to produce non-humanlike and overly stereotyped movement behaviors. In this work, we extend generative adversarial imitation learning to enable training of generic neural network policies to produce humanlike movement patterns from limited demonstrations consisting only of partially observed state features, without access to actions, even when the demonstrations come from a body with different and unknown physical parameters. We leverage this approach to build sub-skill policies from motion capture data and show that they can be reused to solve tasks when controlled by a higher level controller.

Authors (8)
  1. Josh Merel (31 papers)
  2. Yuval Tassa (31 papers)
  3. Dhruva TB (5 papers)
  4. Sriram Srinivasan (23 papers)
  5. Jay Lemmon (3 papers)
  6. Ziyu Wang (137 papers)
  7. Greg Wayne (33 papers)
  8. Nicolas Heess (139 papers)
Citations (196)

Summary

  • The paper extends GAIL to train humanoid controllers from partial motion capture data without needing explicit action labels.
  • It overcomes reward engineering challenges by using adversarial imitation to replicate fluid and complex human movements.
  • Empirical evaluations validate smooth behavior transitions and robust transfer across different body structures and dynamic tasks.

Analysis of "Learning Human Behaviors from Motion Capture by Adversarial Imitation"

The paper "Learning Human Behaviors from Motion Capture by Adversarial Imitation" by Merel et al. explores the application of generative adversarial imitation learning (GAIL) to train neural network policies that exhibit humanlike behaviors using motion capture data. This approach marks a significant stride in addressing limitations inherent in standard reinforcement learning (RL) methods for controlling humanoid bodies, which often result in behaviors that do not exhibit the fluidity and naturalness characteristic of human movement.

The major contribution of the paper lies in extending GAIL to handle limited demonstrations consisting only of partially observed state features without access to corresponding actions. This proves particularly advantageous when movement data is derived from bodies with different or unknown physical parameters, as frequently encountered with motion capture datasets. Using this method, the authors developed low-level controllers capable of handling complex and dynamically inconsistent motion data.

Methodological Advancements

The proposed approach leverages GAIL to develop humanoid behaviors, obviating the need for explicitly designed reward functions. Standard RL techniques for humanoid control often require meticulous reward engineering and domain expertise, which limits their scalability and flexibility. The authors largely circumvent this by using GAIL to create a pipeline for learning and embedding humanlike sub-skills from motion capture for diverse tasks. Key extensions to the GAIL framework include the following (a minimal sketch of the state-only discriminator appears after the list):

  • Partial State Featurizations: Demonstrations containing only partially observed state features, with no action labels, suffice for training viable control policies, simplifying the data requirements for adversarial imitation.
  • Body Transfer Invariance: GAIL effectively generalizes across different body structures without needing matching dynamics, enabling the re-targeting of motion features.
  • Robust Multi-Behavior Policies: The approach facilitates smooth transitions between learned behaviors, a critical feature for task composition in complex environments.
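
The central extension, training the discriminator on state features alone so that no action labels are needed, can be illustrated with a short sketch. The snippet below is a hypothetical PyTorch rendering, not the authors' code: the `Discriminator` architecture, feature dimension, and optimizer are assumptions, and the reward uses the common GAIL-style surrogate -log(1 - D(s)).

```python
import torch
import torch.nn as nn

# Hypothetical state-only GAIL discriminator: it scores feature vectors
# derived from states alone, so no action labels are required.
class Discriminator(nn.Module):
    def __init__(self, feat_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),  # logit: demonstration vs. policy
        )

    def forward(self, feats):
        return self.net(feats)

def discriminator_step(disc, opt, demo_feats, policy_feats):
    """One adversarial update: push demonstration features toward
    label 1 and policy-generated features toward label 0."""
    bce = nn.BCEWithLogitsLoss()
    loss = (bce(disc(demo_feats), torch.ones(len(demo_feats), 1))
            + bce(disc(policy_feats), torch.zeros(len(policy_feats), 1)))
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

def imitation_reward(disc, feats):
    """GAIL-style reward for the policy's RL update: high when the
    discriminator mistakes policy features for demonstrations."""
    with torch.no_grad():
        d = torch.sigmoid(disc(feats))
    return -torch.log(1.0 - d + 1e-8)  # -log(1 - D(s))
```

Because only state features enter the discriminator, the same demonstration set can drive imitation even when the demonstrator's body and action space differ from the learner's, which is precisely what makes raw motion capture data usable here.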

Empirical Evaluation

The empirical results underscore the effectiveness of the extended GAIL framework across several experimental setups. Initial validation on simpler bodies, such as planar arms and bipedal walkers, demonstrated the feasibility of learning from partial observations and of transferring learned behaviors across different body morphologies. These setups allowed performance to be quantified against known baseline RL policies, highlighting the robustness of the imitation-derived policies.

Further experiments focused on recreating humanoid behaviors from the CMU Graphics Lab Motion Capture Database in a physically simulated environment. Noteworthy results included:

  • Training policies to imitate specific behaviors such as walking and getting up off the ground, demonstrating robustness to the noise and dynamic inconsistencies typical of raw motion capture data (a hypothetical featurization sketch follows this list).
  • Conducting experiments to learn stylized movements from limited demonstrations, which revealed the model's capacity to discern and replicate unique gait characteristics from minimal data inputs.
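
To make "partially observed state features" concrete, here is a hypothetical featurizer in the style of a MuJoCo humanoid, where the configuration vector begins with a free-joint root (3 position plus 4 quaternion entries). The function name, index layout, and the choice to drop global root position are illustrative assumptions, not the paper's exact feature set.

```python
import numpy as np

def demo_features(qpos, qvel, use_velocities=False):
    """Hypothetical partial-state featurizer for mocap imitation:
    keep egocentric joint angles, drop the global root position so
    the discriminator cannot key on absolute location, and optionally
    append velocities (mocap clips often lack reliable velocities)."""
    joint_angles = qpos[7:]          # skip free-joint root (3 pos + 4 quat)
    feats = [joint_angles]
    if use_velocities:
        feats.append(qvel[6:])       # skip root linear/angular velocity
    return np.concatenate(feats)
```

Dropping the absolute root position keeps the discriminator from rewarding the policy merely for standing where the demonstrator stood, a property that matters when the demonstrator's body differs from the learner's.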

Practical and Theoretical Implications

The research provides compelling insights into scalable methodologies for robotic control. The demonstrated ability to learn and synthesize complex motor behaviors from motion capture data without extensive manual preprocessing highlights the potential for autonomous agents to exhibit realistic and adaptive behaviors. Furthermore, it charts a path toward general-purpose humanoid robots that leverage vast datasets of human activity, ultimately enhancing their integration into human environments and interactions.

In theory, this advancement aligns with broader goals in AI of achieving naturalistic and adaptable behavior without explicitly coding every conceivable action or scenario. Transitioning between learned behaviors and reusing sub-skills for unseen tasks foreshadows the development of highly adaptable humanoid systems; a minimal sketch of such a hierarchical arrangement follows.
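
The sketch below illustrates one way a higher-level controller could select among frozen sub-skill policies, in the spirit of the paper's task-composition experiments. The class, the argmax switching rule, and the fixed switching period are hypothetical simplifications, not the authors' implementation.

```python
import torch

class HierarchicalController:
    """Hypothetical wrapper: a high-level policy picks which frozen
    sub-skill (e.g., walk, get-up) acts for the next k control steps."""

    def __init__(self, high_level, sub_skills, switch_every=10):
        self.high_level = high_level      # maps obs -> scores over sub-skills
        self.sub_skills = sub_skills      # list of trained low-level policies
        self.switch_every = switch_every  # re-select a skill every k steps
        self.step_count = 0
        self.active = 0

    def act(self, obs):
        # Re-query the high-level controller at a coarser timescale.
        if self.step_count % self.switch_every == 0:
            with torch.no_grad():
                scores = self.high_level(obs)
                self.active = int(torch.argmax(scores))
        self.step_count += 1
        # The selected sub-skill produces the low-level motor action.
        with torch.no_grad():
            return self.sub_skills[self.active](obs)
```

Keeping the sub-skills frozen and switching at a coarse timescale is one way to turn the smooth transitions reported in the paper into composed, task-level behavior.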

Future Directions

The research sets a foundation for future work in several key areas:

  • Scaling and Generalization: Expanding the repertoire of behaviors and refining them across larger, more varied datasets could enhance this approach's applicability.
  • Advanced Contextual Modulation: Furthering the development of high-level controllers capable of seamlessly integrating various sub-skills to tackle increasingly complex tasks autonomously.
  • Integration with Other Modalities: Incorporating additional sensory modalities such as vision and auditory inputs can enhance task performance and agent-environment interaction, making the learned behaviors more holistic.

In summary, this paper contributes a disciplined approach to advancing the field of humanoid robotics through the extension and application of GAIL. The exploration of adversarial imitation from partial observations paves the way for developing advanced controller architectures capable of learning intricate and humanlike forms of movement with minimal human intervention.
