Hierarchical Learning Framework for Robust Bipedal Locomotion
The paper, "Template Model Inspired Task Space Learning for Robust Bipedal Locomotion," presents a hierarchical control strategy integrating reinforcement learning with model-based control to achieve robust and efficient bipedal robot locomotion. This work offers a significant contribution by demonstrating the applicability of a hierarchical learning framework to both underactuated and fully actuated robots and presenting a unique structural design that enhances the understanding and performance of bipedal locomotion controllers.
The proposed approach leverages a two-level control hierarchy: a high-level (HL) reinforcement learning policy generates task-space commands from a reduced-order state inspired by the Angular Momentum-based Linear Inverted Pendulum (ALIP) model, and a low-level (LL) model-based controller tracks the resulting trajectories. Unlike typical end-to-end learning, the approach embeds template-model insight directly into the policy's state and action spaces, yielding a more interpretable, sample-efficient, and robust control framework.
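For reference, the planar ALIP template that motivates this reduced state models the robot as a point mass of mass $m$ held at constant height $H$ above the stance contact point. With $x_c$ the horizontal center-of-mass position relative to that point and $L$ the angular momentum about it, the template dynamics are linear (this is the standard form of the ALIP model, not a formula quoted from the paper itself):

$$
\dot{x}_c = \frac{L}{mH}, \qquad \dot{L} = m g \, x_c
$$

Because these dynamics are linear, desired foot placements can be computed in closed form at each step, which is exactly the structure the HL policy inherits through its state and action design.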
Key Aspects:
- Reduced-State and Action Space:
- The HL policy operates on a reduced, ALIP-inspired state comprising the robot's base position, angular momentum, and velocity tracking errors. This compact state keeps the learning problem tractable while still capturing the dynamics that govern balance and foot placement.
- The action space consists of task-space commands, chiefly step length and torso orientation, which characterize the gait far more compactly than joint-level actions and make the learned policy easier to interpret.
- Decoupled Control Structure:
- The hierarchical framework cleanly separates the learning-based HL planner from the LL feedback controller, improving both sample efficiency and the interpretability of the learned policies.
- This design keeps the HL policy agnostic to the type of LL controller, so different control strategies can be swapped in without retraining the HL planner; a minimal sketch of this decoupled interface follows the list.
- Robust Performance Across Diverse Conditions:
- Testing across multiple robots, including the planar Rabbit and Walker2D and the 3D humanoid Digit, demonstrates the generality of the proposed control framework.
- The learned policies remain robust and stable across a range of scenarios, including varying walking speeds, external disturbances, and inclined terrain.
- Comparison and Results:
- The hierarchical framework outperforms both purely model-based methods and end-to-end learning frameworks in speed tracking, robustness to disturbances, and sample efficiency.
- Notably, Digit reaches walking speeds of up to 1.5 m/s in simulation, indicating the potential for agile, human-like locomotion.
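The following minimal Python sketch illustrates the decoupled structure described above. All names here (`HighLevelPolicy`, `LowLevelController`, `reduced_state`, the PD gains, the update rates) are hypothetical stand-ins, not the paper's implementation: the HL policy is a placeholder linear map where the paper trains an RL network, and the LL tracker is a toy PD law where the paper uses a model-based controller.

```python
import numpy as np

def reduced_state(base_pos, ang_momentum, vel_error):
    """ALIP-inspired reduced state: base position, angular momentum,
    and velocity tracking error, concatenated into one vector."""
    return np.concatenate([base_pos, ang_momentum, vel_error])

class HighLevelPolicy:
    """Placeholder for the learned HL policy. Maps the reduced state to
    task-space commands (step length, torso orientation); a random
    linear map stands in for the trained RL network."""
    def __init__(self, state_dim, action_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.W = 0.01 * rng.standard_normal((action_dim, state_dim))

    def act(self, state):
        # Returns e.g. [step_length, torso_pitch, torso_roll]
        return self.W @ state

class LowLevelController:
    """Interface the HL policy is agnostic to: any controller that can
    track task-space commands may be plugged in without retraining."""
    def compute_torques(self, task_cmd, robot_state):
        raise NotImplementedError

class PDTaskSpaceTracker(LowLevelController):
    """Toy PD tracker; a real system would map task-space errors to
    joint torques through the robot's dynamics model."""
    def __init__(self, kp=50.0, kd=5.0):
        self.kp, self.kd = kp, kd

    def compute_torques(self, task_cmd, robot_state):
        pos_err = task_cmd - robot_state["task_pos"]
        return self.kp * pos_err - self.kd * robot_state["task_vel"]

# Two-rate loop: the HL policy replans at a low rate (e.g., once per
# footstep), while the LL controller tracks at a high rate.
hl = HighLevelPolicy(state_dim=6, action_dim=3)
ll = PDTaskSpaceTracker()  # swappable: any LowLevelController works
robot_state = {"task_pos": np.zeros(3), "task_vel": np.zeros(3)}
task_cmd = np.zeros(3)
for t in range(1000):           # LL ticks
    if t % 250 == 0:            # HL replans every 250 ticks
        s = reduced_state(np.zeros(2), np.zeros(2), np.zeros(2))
        task_cmd = hl.act(s)
    torques = ll.compute_torques(task_cmd, robot_state)
```

The key design point is that `HighLevelPolicy.act` sees only the reduced state and emits only task-space commands, so the tracker behind `compute_torques` can be replaced without touching the policy.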
Implications and Future Directions:
This work has significant implications for robotics research, particularly for dynamic and agile bipedal locomotion. Integrating reinforcement learning with model-based control addresses long-standing challenges in real-time trajectory planning and execution, offering a scalable, adaptable solution for complex robots. The approach also reduces reliance on computationally expensive full-order models in favor of compact, learned control policies that scale across different robotic platforms.
Looking forward, hardware experiments and more complex locomotion tasks, such as stairs and uneven terrain, would further validate the framework's utility and robustness. Integrating perceptual feedback and higher-level behavioral strategies could also enhance the adaptability and autonomy of bipedal robots in real-world environments. Such advances support the broader goal of robots that operate seamlessly in human environments, with agility and stability comparable to human locomotion.