Learning Spring Mass Locomotion: An Overview
The paper "Learning Spring Mass Locomotion: Guiding Policies with a Reduced-Order Model" presents an innovative approach in the domain of bipedal robot locomotion, specifically focusing on integrating reduced-order models with reinforcement learning to achieve dynamic legged locomotion. The authors, Kevin Green and collaborators, leverage the actuated Spring Loaded Inverted Pendulum (SLIP) model as a reduced-order model to create a control framework that guides a real-world bipedal robot, Cassie, in emulating dynamic walking patterns.
Methodology
The paper develops a hierarchical control system in which high-level behaviors are captured by reduced-order models and translated into executable motions through learned policies. This division of labor exploits simple, predictive models for planning while leaving real-world dynamics and uncertainty to the learned component. The main novelty lies in using a library of precomputed SLIP model motions, rather than hand-designed or motion-captured reference trajectories for the full robot, to guide the low-level control policy, as sketched below.
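As a rough illustration of this division of labor, the Python sketch below shows how a high-level reference drawn from a precomputed gait library might be fed to a low-level learned policy at each control step. The names here (SlipGaitLibrary, control_step, policy) are hypothetical and not taken from the paper's implementation.

```python
import numpy as np

class SlipGaitLibrary:
    """Hypothetical lookup table of precomputed SLIP gaits, indexed by speed."""
    def __init__(self, gaits):
        # gaits: dict mapping commanded speed (m/s) -> reference trajectory,
        # each trajectory an array of task-space targets over one gait cycle.
        self.gaits = gaits
        self.speeds = np.array(sorted(gaits.keys()))

    def reference(self, speed, phase):
        """Return the task-space reference for the nearest precomputed speed
        at the given gait phase in [0, 1]."""
        nearest = self.speeds[np.argmin(np.abs(self.speeds - speed))]
        traj = self.gaits[nearest]
        idx = int(phase * (len(traj) - 1))
        return traj[idx]

def control_step(library, policy, robot_state, commanded_speed, phase):
    """One tick of the hierarchy: reduced-order reference in, motor commands out.

    robot_state and the returned reference are assumed to be 1-D arrays;
    policy is any callable mapping observations to motor commands.
    """
    slip_ref = library.reference(commanded_speed, phase)   # high level: SLIP plan
    obs = np.concatenate([robot_state, slip_ref])          # policy observation
    return policy(obs)                                      # low level: learned policy
```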
Reduced-Order Model
Reduced-order modeling, particularly the SLIP model, is central to this work. This model efficiently captures the essential dynamics of legged locomotion using a simplified representation involving a point mass and spring-like legs. The authors optimized this model for various speeds using direct collocation, generating a suite of energetically optimal gaits. These gaits serve as the primary reference for the policy training process. Such models are advantageous due to their computational simplicity and their ability to generate meaningful, albeit high-level, motion trajectories.
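To make the reduced-order model concrete, the sketch below writes out the stance-phase dynamics of a planar SLIP: a point mass attached to a massless spring leg pinned at the toe. This is the standard passive form; the paper's actuated variant additionally allows work to be done along the leg, and the parameter values here are placeholders rather than Cassie's.

```python
import numpy as np

def slip_stance_dynamics(state, foot, m=32.0, k=8000.0, l0=1.0, g=9.81):
    """Planar SLIP stance dynamics.

    state: [x, z, vx, vz] -- center-of-mass position and velocity
    foot:  [fx, fz]       -- fixed toe position during stance
    Returns d(state)/dt. Parameter values are illustrative placeholders.
    """
    x, z, vx, vz = state
    leg = np.array([x - foot[0], z - foot[1]])   # vector from toe to mass
    l = np.linalg.norm(leg)                      # current leg length
    force = k * (l0 - l) * leg / l               # spring force along the leg
    ax = force[0] / m
    az = force[1] / m - g
    return np.array([vx, vz, ax, az])
```

Flight phase is simply ballistic motion of the point mass, with touchdown and liftoff events switching between the two phases; direct collocation optimizes trajectories of the actuated version of these dynamics over a gait cycle to produce the gait library.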
Reinforcement Learning and Control
Reinforcement Learning (RL) is employed to derive policies that transform high-level motion patterns from the SLIP model into motor commands for the full robot. The algorithm used is Proximal Policy Optimization (PPO), known for its robustness and effectiveness in simulated training. The policy's input combines estimated robot states with the desired task-space configuration drawn from the SLIP reference. The reward structure heavily penalizes deviations from the SLIP model trajectories, thereby aligning the learned policies with the intended spring-mass dynamics.
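The paper's exact reward terms and weights are not reproduced here; the snippet below only illustrates the general shape of such a tracking reward, using exponential kernels on the error between the robot's task-space state and the SLIP reference. The field names and weights are assumptions made for the example.

```python
import numpy as np

def tracking_reward(robot_task_state, slip_reference, weights=(0.5, 0.3, 0.2)):
    """Illustrative tracking reward: exponential kernels on deviation from the SLIP reference.

    robot_task_state / slip_reference: dicts with 'com_pos', 'com_vel', 'foot_pos'
    arrays in task space. Weights and kernel scales are placeholders, not the paper's.
    """
    w_pos, w_vel, w_foot = weights
    pos_err = np.linalg.norm(robot_task_state['com_pos'] - slip_reference['com_pos'])
    vel_err = np.linalg.norm(robot_task_state['com_vel'] - slip_reference['com_vel'])
    foot_err = np.linalg.norm(robot_task_state['foot_pos'] - slip_reference['foot_pos'])
    # Exponential kernels keep each term in (0, 1] and penalize deviations smoothly.
    return (w_pos * np.exp(-5.0 * pos_err**2)
            + w_vel * np.exp(-2.0 * vel_err**2)
            + w_foot * np.exp(-5.0 * foot_err**2))
```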
Implementation and Results
The authors implemented this approach on Cassie, a bipedal robot designed by Agility Robotics. In simulation and hardware experiments, the learned policies tracked a wide range of speeds and transitioned smoothly between them. Notably, the robot exhibited dynamic behaviors, such as velocity oscillations consistent with the SLIP model's predictions. The experimental results show that the derived controller achieves stable bipedal locomotion at speeds up to 1.2 m/s, supporting the approach's applicability to real-world hardware.
Implications and Future Work
This research provides an effective framework for exploiting reduced-order models in dynamic locomotion control, emphasizing the potential for integrating model-based planning with model-free learning techniques. The success of this method suggests several practical applications, including real-time control of bipedal robots in complex environments. The control hierarchy not only simplifies the learning task but also provides flexibility for future developments, where higher-level planners might dynamically generate new motion references based on environmental feedback.
Conclusion
The control scheme outlined in this paper demonstrates a promising pathway toward scalable and adaptable locomotion in robots, marrying the simplicity of reduced-order models with the adaptability of reinforcement learning. Going forward, extending this approach to real-time adaptation across a broader range of environmental conditions and tasks will be crucial. Additionally, exploring the integration of these strategies in robots with different morphologies could lead to more generalized solutions in dynamic robotic locomotion.