
Advanced Skills through Multiple Adversarial Motion Priors in Reinforcement Learning (2203.14912v1)

Published 23 Mar 2022 in cs.RO, cs.AI, and cs.LG

Abstract: In recent years, reinforcement learning (RL) has shown outstanding performance for locomotion control of highly articulated robotic systems. Such approaches typically involve tedious reward function tuning to achieve the desired motion style. Imitation learning approaches such as adversarial motion priors aim to reduce this problem by encouraging a pre-defined motion style. In this work, we present an approach to augment the concept of adversarial motion prior-based RL to allow for multiple, discretely switchable styles. We show that multiple styles and skills can be learned simultaneously without notable performance differences, even in combination with motion data-free skills. Our approach is validated in several real-world experiments with a wheeled-legged quadruped robot showing skills learned from existing RL controllers and trajectory optimization, such as ducking and walking, and novel skills such as switching between a quadrupedal and humanoid configuration. For the latter skill, the robot is required to stand up, navigate on two wheels, and sit down. Instead of tuning the sit-down motion, we verify that a reverse playback of the stand-up movement helps the robot discover feasible sit-down behaviors and avoids tedious reward function tuning.
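The reverse-playback idea mentioned at the end of the abstract can be illustrated with a short sketch: instead of hand-tuning a sit-down reward, the recorded stand-up trajectory is played backwards and used as reference motion for the opposite skill. The snippet below is a minimal illustration under an assumed data layout (a (T, D) array of positions at a fixed timestep); it is not the authors' implementation.

```python
import numpy as np

def reverse_motion_clip(positions, dt):
    """Turn a recorded stand-up clip into sit-down reference data by
    playing it backwards. `positions` is assumed to be a (T, D) array of
    joint/base positions sampled at a fixed timestep `dt`."""
    reversed_positions = positions[::-1].copy()
    # Velocities are recomputed from the reversed frames so they stay
    # consistent with the new frame ordering.
    reversed_velocities = np.gradient(reversed_positions, dt, axis=0)
    return reversed_positions, reversed_velocities
```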

Citations (62)

Summary

  • The paper introduces the Multi-AMP algorithm, which uses multiple adversarial discriminators to guide policy learning for diverse locomotion styles.
  • The paper demonstrates simultaneous multi-task learning and dynamic style switching validated through real-world experiments on wheeled-legged robots.
  • The paper shows flexibility in training both data-driven and data-free skills, reducing extensive reward engineering in reinforcement learning.

Advanced Skills through Multiple Adversarial Motion Priors in Reinforcement Learning

The paper "Advanced Skills through Multiple Adversarial Motion Priors in Reinforcement Learning" presents a novel reinforcement learning (RL) framework for the acquisition and execution of advanced locomotion and maneuvering skills in robotic systems. This framework introduces Multi-AMP (Multiple Adversarial Motion Priors), a technique that extends the concept of adversarial motion priors to enable robots with highly articulated forms to learn a wide range of skills simultaneously and allows for dynamic style switching.

Context and Motivation

Reinforcement learning has shown significant promise for robotic locomotion but often requires extensive tuning of reward functions to achieve specific motion styles. Traditional methods rely on carefully designed objective functions or heuristics for motion selection, which makes them difficult to apply across diverse tasks. Adversarial motion priors (AMP) aim to alleviate this by encoding a motion style through imitation: a discriminator learns to distinguish policy-generated transitions from transitions sampled from reference motion data, and its output serves as a style reward.
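As a rough illustration of this mechanism, the sketch below shows a discriminator that scores state transitions and a style reward derived from its output. The network shape, layer sizes, and the least-squares reward mapping are assumptions drawn from the general AMP literature, not from this paper's code.

```python
import torch
import torch.nn as nn

class StyleDiscriminator(nn.Module):
    """Scores state transitions (s, s') as reference-like vs policy-like.
    Architecture and sizes are illustrative, not taken from the paper."""
    def __init__(self, transition_dim, hidden_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(transition_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, transition):
        return self.net(transition)

def style_reward(discriminator, transition):
    """One common AMP-style reward shaping: a least-squares GAN score
    mapped into [0, 1]; the exact form used in the paper may differ."""
    with torch.no_grad():
        d = discriminator(transition)
        return torch.clamp(1.0 - 0.25 * (d - 1.0) ** 2, min=0.0)
```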

Contributions of the Paper

  1. Multi-AMP Framework: The paper introduces the Multi-AMP algorithm, which trains a separate adversarial discriminator for each motion style and uses their outputs as style rewards during policy training. Each discriminator distinguishes state transitions from policy roll-outs from transitions sampled from the motion dataset representative of its style (a minimal sketch of how these rewards could be combined appears after this list).
  2. Simultaneous Multi-Task Learning: The framework allows policies to learn multiple skills concurrently. This contrasts with previous approaches that typically required separate policies or extensive post-processing for each skill or style.
  3. Style Switching: Multi-AMP enables deliberate switching between motion styles, which is particularly useful in real-world applications where a robot must adjust its gait or maneuvers in response to environmental or task-related changes.
  4. Validation through Real-World Experiments: The approach is validated on a wheeled-legged robot platform, demonstrating complex behaviors such as transitioning between quadrupedal and bipedal modes, ducking under obstacles, and performing various forms of locomotion without significant performance losses while switching styles.
  5. Data-Free Skills: Besides learning skills assisted by motion datasets, the Multi-AMP framework supports training data-free skills by setting style rewards to zero, demonstrating flexibility in skill acquisition.
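A compact sketch of how the per-style rewards could be combined is given below. The discriminator interface, the one-hot skill selector, the reward weights, and the handling of data-free skills are all assumptions made for illustration; they are not the authors' published code.

```python
import torch

def style_one_hot(style_id, num_styles):
    """One-hot skill selector appended to the policy observation so a
    single policy can be asked to produce different styles at run time."""
    onehot = torch.zeros(num_styles)
    onehot[style_id] = 1.0
    return onehot

def multi_amp_reward(discriminators, style_id, transition,
                     task_reward, w_task=0.5, w_style=0.5):
    """Combine a task reward with the style reward of the selected skill.
    `discriminators[i]` is the discriminator trained on the motion data of
    style i; a `None` entry marks a data-free skill trained from the task
    reward alone. Weights and the reward mapping are illustrative."""
    disc = discriminators[style_id]
    if disc is None:
        style_r = torch.zeros_like(task_reward)  # data-free skill
    else:
        with torch.no_grad():
            d = disc(transition).squeeze(-1)
            style_r = torch.clamp(1.0 - 0.25 * (d - 1.0) ** 2, min=0.0)
    return w_task * task_reward + w_style * style_r
```

In this sketch the same policy serves all skills; only the selector vector and the active discriminator change when the style is switched.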

Implications and Future Work

The extension of adversarial motion priors to a multi-style environment could simplify the RL training process by reducing the need for extensive reward function engineering while allowing the synthesis of complex styles. Future applications could expand this technique to other domains of robot control and autonomous behavior where dynamic adaptability and nuanced style differences are advantageous. Moreover, combining Multi-AMP with advanced simulation environments could expedite the transition from simulated learning to real-world deployment, which remains a challenging aspect in robotics.

Future research might investigate optimizing the balance between policy and discriminator updates to enhance learning efficiency. Additionally, expanding the pool of motion prior styles, possibly leveraging large-scale crowdsourced or simulation-generated datasets, could yield robots with even richer behavioral repertoires and greater autonomy across diverse operation scenarios.

Overall, this paper provides a robust framework for achieving flexible, adaptive, and efficient robotic behavior through advanced reinforcement learning techniques, paving the way for future exploration and enhancement of multi-style controllers in robotics.
