- The paper demonstrates a contact-adaptive controller that fuses reinforcement learning for high-level decision-making with quadratic programming for precise low-level control.
- It achieves up to an 85% improvement in energy efficiency over traditional gaits and adapts zero-shot to dynamic test conditions.
- The framework simplifies sim-to-real transfer by enabling successful deployment on the Unitree Laikago robot under challenging, real-world scenarios.
The paper by Da et al. presents a novel approach to legged locomotion control in quadruped robots, specifically the Unitree Laikago. The proposed hierarchical framework integrates model-based control with reinforcement learning (RL) to create a contact-adaptive controller capable of handling dynamic and challenging environments. By focusing on adaptability to real-world scenarios, the work improves both the robustness and the energy efficiency of locomotion.
Technical Overview
The system designed by the authors features a two-tier controller architecture. At the high level, a controller uses RL to select locomotion primitives based on environmental interactions. This decision-making is informed by the current robot state and a history of primitives used. The low-level controller applies traditional model-based control to execute these primitives by determining ground reaction forces through quadratic programming, allowing precise control over base pose and individual foot placements.
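The low-level force-distribution step described above can be sketched as follows. This is a minimal illustration of the general idea, not the authors' exact formulation: given a desired net wrench on the base and the stance-foot positions, solve for per-foot ground reaction forces. For simplicity the sketch uses regularized least squares; the paper's controller solves a quadratic program, which additionally imposes friction-cone inequality constraints. All masses, foot positions, and the regularization weight below are illustrative assumptions.

```python
import numpy as np

def skew(p):
    """3x3 cross-product matrix so that skew(p) @ f == np.cross(p, f)."""
    return np.array([[0.0, -p[2], p[1]],
                     [p[2], 0.0, -p[0]],
                     [-p[1], p[0], 0.0]])

def distribute_forces(foot_positions, desired_wrench, alpha=1e-3):
    """Distribute a desired base wrench over the stance feet.

    foot_positions: list of 3-vectors (stance feet, base frame).
    desired_wrench: 6-vector [F_des; tau_des].
    Returns per-foot 3D ground reaction forces, shape (n_feet, 3).
    """
    n = len(foot_positions)
    A = np.zeros((6, 3 * n))
    for i, p in enumerate(foot_positions):
        A[0:3, 3 * i:3 * i + 3] = np.eye(3)            # forces sum to net force
        A[3:6, 3 * i:3 * i + 3] = skew(np.asarray(p))  # moments sum to net torque
    # Regularized least squares: min ||A f - b||^2 + alpha * ||f||^2.
    # A real controller adds friction-cone constraints, making this a QP.
    H = A.T @ A + alpha * np.eye(3 * n)
    f = np.linalg.solve(H, A.T @ desired_wrench)
    return f.reshape(n, 3)

# Example: four stance feet supporting body weight (assumed mass 25 kg).
feet = [[0.2, 0.15, -0.4], [0.2, -0.15, -0.4],
        [-0.2, 0.15, -0.4], [-0.2, -0.15, -0.4]]
wrench = np.array([0.0, 0.0, 25 * 9.81, 0.0, 0.0, 0.0])
forces = distribute_forces(feet, wrench)
```

With a symmetric stance, the solver spreads the weight nearly evenly across the four feet; the regularization term keeps the problem well-posed when the feet are fewer than four.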
Training Protocol and Scenarios: The high-level controller's RL training is performed in simulated environments using the Isaac Gym platform. During training, the controller adapts to variations in treadmill speed and robot orientation, leading to a wide exploration of possible dynamic interactions. Notably, the RL framework learns contact sequences that improve adaptability while eliminating unnecessary limb movements, reducing energy consumption.
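The high-level decision step, conditioning on the robot state plus a history of previously chosen primitives, can be sketched like this. The primitive names, feature set, history length, and the epsilon-greedy linear scorer below are illustrative assumptions standing in for the paper's trained policy network:

```python
import numpy as np

PRIMITIVES = ["trot_step", "pace_step", "stand"]  # hypothetical primitive set
HISTORY_LEN = 3

def build_observation(robot_state, primitive_history):
    """Concatenate proprioceptive state with a one-hot encoding of the
    last HISTORY_LEN primitive choices, since the high-level policy
    conditions on both (exact feature set is an assumption)."""
    hist = np.zeros((HISTORY_LEN, len(PRIMITIVES)))
    for i, a in enumerate(primitive_history[-HISTORY_LEN:]):
        hist[i, a] = 1.0
    return np.concatenate([np.asarray(robot_state, dtype=float), hist.ravel()])

def select_primitive(obs, weights, epsilon=0.1, rng=None):
    """Epsilon-greedy choice over a linear score per primitive
    (a stand-in for the learned RL policy)."""
    rng = rng if rng is not None else np.random.default_rng(0)
    if rng.random() < epsilon:
        return int(rng.integers(len(PRIMITIVES)))   # explore
    return int(np.argmax(weights @ obs))            # exploit

# Example rollout step with made-up state features and random weights.
state = [0.0, 0.3, -0.02, 0.0]   # e.g. height error, fwd velocity, roll, pitch
history = [0, 0, 1]              # indices into PRIMITIVES
obs = build_observation(state, history)
w = np.random.default_rng(42).normal(size=(len(PRIMITIVES), obs.size))
action = select_primitive(obs, w, epsilon=0.0)
```

The chosen index is then handed to the low-level model-based controller, which executes that primitive for the next control interval.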
Results and Key Findings
The proposed framework demonstrates improvements in energy efficiency of up to 85% compared to baseline methods such as trotting and pacing gaits. Quantitative evaluations highlight the capability of the learned controller to adapt quickly and effectively to unseen test conditions, termed "zero-shot" adaptation. Notable test scenarios include uneven treadmill operation and conditions where friction is unexpectedly low (the so-called "banana peel" test), both handled adeptly by the controller without additional training data or model adjustments.
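Energy-efficiency comparisons of this kind are commonly grounded in the mechanical cost of transport: energy expended per unit weight per unit distance traveled. The paper does not specify its exact metric, so the following is a generic sketch assuming logged joint torques and velocities:

```python
import numpy as np

def cost_of_transport(torques, joint_vels, dt, mass, distance, g=9.81):
    """Mechanical cost of transport from logged joint data.

    torques, joint_vels: arrays of shape (T, n_joints).
    Counts only positive mechanical power (no regeneration assumed).
    """
    power = np.clip(torques * joint_vels, 0.0, None).sum(axis=1)  # W per step
    energy = power.sum() * dt                                     # J
    return energy / (mass * g * distance)                         # dimensionless

# Toy example: 12 joints, constant 1 Nm at 1 rad/s for 1 s, 1 m traveled.
tau = np.ones((100, 12))
vel = np.ones((100, 12))
cot = cost_of_transport(tau, vel, dt=0.01, mass=25.0, distance=1.0)
```

A lower cost of transport at the same speed indicates a more energy-efficient gait, which is how contact-adaptive behavior (fewer unnecessary leg swings) would show up in such a comparison.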
Furthermore, the simplicity and flexibility of the hierarchical structure ease sim-to-real transfer. The researchers successfully deployed the controller on a physical Unitree Laikago robot with minimal modifications, overcoming the simulation-to-reality gap that remains a significant barrier in robotics research.
Implications and Future Directions
This work underlines the growing utility of combining model-free and model-based approaches in robotics. The use of RL for locomotion tasks in stochastic environments opens the door to broader applications where real-time adaptability and energy efficiency are critical. Future developments might explore the integration of this hierarchical framework with additional high-level tasks, such as navigation modules, enhancing the robot's autonomy in more complex scenarios.
The implications for further research are broad: extending these methodologies to other classes of robots, improving RL algorithms for faster training, and developing controllers that leverage other forms of environmental feedback, such as visual or auditory data. Additionally, applying this framework to robots with other leg configurations and to more intricate tasks could provide deeper insight into the biomechanics of robot gait and energy dynamics.
In summary, the paper contributes substantial advancements in the field of robotics by demonstrating how RL can be combined effectively with traditional control paradigms to produce powerful, adaptable, and efficient locomotion controllers. These insights pave the way for future explorations into adaptive robotic systems capable of more complex and autonomous operations in diverse environments.