Whole-Body Constrained Learning for Legged Locomotion via Hierarchical Optimization (2506.05115v1)

Published 5 Jun 2025 in cs.RO

Abstract: Reinforcement learning (RL) has demonstrated impressive performance in legged locomotion over various challenging environments. However, due to the sim-to-real gap and lack of explainability, unconstrained RL policies deployed in the real world still suffer from inevitable safety issues, such as joint collisions, excessive torque, or foot slippage in low-friction environments. These problems limit its usage in missions with strict safety requirements, such as planetary exploration, nuclear facility inspection, and deep-sea operations. In this paper, we design a hierarchical optimization-based whole-body follower, which integrates both hard and soft constraints into RL framework to make the robot move with better safety guarantees. Leveraging the advantages of model-based control, our approach allows for the definition of various types of hard and soft constraints during training or deployment, which allows for policy fine-tuning and mitigates the challenges of sim-to-real transfer. Meanwhile, it preserves the robustness of RL when dealing with locomotion in complex unstructured environments. The trained policy with introduced constraints was deployed in a hexapod robot and tested in various outdoor environments, including snow-covered slopes and stairs, demonstrating the great traversability and safety of our approach.

Summary

The paper presents a novel framework that integrates model-based constraints with reinforcement learning to enhance safety in legged locomotion.
It employs a hierarchical optimization approach with a teacher-student architecture to manage dynamic, kinematic, and torque constraints in real time.
Extensive experiments on hexapod robots demonstrate significant reductions in collisions, torque exceedances, and foot slippage compared to baseline methods.

Whole-Body Constrained Learning for Legged Locomotion via Hierarchical Optimization

The paper "Whole-Body Constrained Learning for Legged Locomotion via Hierarchical Optimization" presents a framework that improves the safety and adaptability of reinforcement learning (RL)-based locomotion policies for legged robots. Unconstrained RL policies, though robust and agile across challenging environments, remain unsafe in real-world deployment because of the sim-to-real gap and a lack of explainability; typical failures include joint collisions, excessive torque, and foot slippage on low-friction ground. This work integrates model-based constraints into the RL framework to mitigate these issues.

Methodology and Framework

The proposed framework uses a hierarchical optimization-based whole-body follower as the low-level controller within the RL pipeline. Because the follower is model-based, both hard and soft constraints can be defined during training or at deployment, which keeps the robot within its physical limits, supports policy fine-tuning, and eases sim-to-real transfer. The RL policy is trained to generate desired joint trajectories, which the whole-body follower then executes according to the principles of Whole-Body Control (WBC).
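The difference between a hard and a soft constraint can be written as a generic quadratic program. This is a textbook sketch, not the paper's exact formulation: the decision variable x, the task matrices A and b, the constraint matrices C_h, d_h, C_s, d_s, the slack s, and the penalty weight rho are all assumed symbols introduced here for illustration.

```latex
\begin{aligned}
\min_{x,\, s}\quad & \tfrac{1}{2}\,\lVert A x - b \rVert^2 + \tfrac{\rho}{2}\,\lVert s \rVert^2 \\
\text{s.t.}\quad & C_h x \le d_h && \text{(hard: must always hold)} \\
& C_s x \le d_s + s,\quad s \ge 0 && \text{(soft: violation } s \text{ is penalized)}
\end{aligned}
```

A hard constraint shrinks the feasible set, while a soft constraint only adds cost when it is violated, so the solver can trade a small violation against tracking accuracy.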

Key components of this methodology involve:

Hierarchical Optimization: Solving a sequence of quadratic programs, ordered by priority, that enforce dynamic consistency, kinematic limits, and torque limits.
Foot-Terrain Interaction Modeling: Ensuring stability and avoiding foot slippage by modeling interactions with various types of terrain.
Teacher-Student Architecture: Parallel training of RL policies using privileged information to facilitate robust locomotion skills acquisition.
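The idea of strict task priorities behind hierarchical optimization can be sketched with a small null-space least-squares solver. This is a simplified illustration under stated assumptions, not the paper's controller: it handles equality tasks only (inequality limits such as torque bounds would need a QP solver), and the function name `solve_hierarchy` and the toy tasks are hypothetical.

```python
import numpy as np


def solve_hierarchy(tasks, n_vars):
    """Solve prioritized equality tasks A x = b, highest priority first.

    Each lower-priority task is solved only inside the null space of all
    higher-priority tasks, so it cannot disturb them.
    """
    x = np.zeros(n_vars)
    N = np.eye(n_vars)  # projector onto the directions still free to use
    for A, b in tasks:
        A = np.atleast_2d(np.asarray(A, dtype=float))
        b = np.atleast_1d(np.asarray(b, dtype=float))
        AN = A @ N
        AN_pinv = np.linalg.pinv(AN)
        # Least-squares correction restricted to the remaining free directions.
        x = x + N @ (AN_pinv @ (b - A @ x))
        # Remove the directions this task has consumed.
        N = N @ (np.eye(n_vars) - AN_pinv @ AN)
    return x


if __name__ == "__main__":
    # Priority 1: x0 + x1 = 1 (must hold exactly).
    # Priority 2: x as close as possible to [2, 0] without breaking priority 1.
    tasks = [
        (np.array([[1.0, 1.0]]), np.array([1.0])),
        (np.eye(2), np.array([2.0, 0.0])),
    ]
    print(solve_hierarchy(tasks, n_vars=2))
```

In the toy problem the second task cannot be met exactly, so the solver returns the point on the line x0 + x1 = 1 that is closest to the lower-priority target.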

Experimental Results

Simulation and real-world experiments validate the framework. A hexapod robot traverses varied terrain, including snow-covered slopes, stairs, and icy surfaces, with better safety and performance than baselines such as unconstrained RL and classical whole-body control. The simulations show significant reductions in foot slippage, collisions, and torque exceedances.
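One way such safety events could be counted from logged data is sketched below. This is a hypothetical metric, not the paper's evaluation code: the function `count_safety_events`, its arguments, and the Coulomb-friction slip test (tangential force greater than mu times normal force) are assumptions made for illustration.

```python
import numpy as np


def count_safety_events(torques, torque_limit, f_tangential, f_normal, mu):
    """Count time steps with a torque exceedance or a slip-risk contact.

    torques:      array of shape (T, n_joints), joint torques per time step
    torque_limit: scalar limit applied to every joint (assumed symmetric)
    f_tangential: array of shape (T,), tangential contact force magnitude
    f_normal:     array of shape (T,), normal contact force (0 = no contact)
    mu:           assumed friction coefficient of the terrain
    """
    torques = np.asarray(torques, dtype=float)
    f_t = np.asarray(f_tangential, dtype=float)
    f_n = np.asarray(f_normal, dtype=float)

    # A time step counts once if any joint exceeds the limit.
    torque_events = int(np.any(np.abs(torques) > torque_limit, axis=1).sum())

    # Coulomb friction: a foot in contact risks slipping when the tangential
    # force leaves the friction cone, i.e. exceeds mu times the normal force.
    in_contact = f_n > 0.0
    slip_events = int(np.sum(in_contact & (f_t > mu * f_n)))

    return {"torque_exceedances": torque_events, "slip_risk": slip_events}


if __name__ == "__main__":
    log = count_safety_events(
        torques=[[1.0, 2.0], [5.0, 0.0], [-6.0, 1.0]],
        torque_limit=4.0,
        f_tangential=[1.0, 3.0, 2.0],
        f_normal=[10.0, 5.0, 0.0],
        mu=0.5,
    )
    print(log)
```

Counting per time step rather than per joint keeps the numbers comparable between robots with different joint counts; a per-joint count would be an equally reasonable choice.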

Implications and Future Directions

By placing a hierarchical optimization layer beneath the learned policy, the paper improves the safety of RL-driven legged locomotion. Explicit constraints mitigate, rather than eliminate, the challenges of sim-to-real transfer, and because they can be redefined at training or deployment time, the approach is adaptable to different environmental conditions. These properties matter for deploying legged robots in safety-critical missions such as planetary exploration, nuclear facility inspection, and deep-sea operations.

Future work could focus on identifying terrain parameters online to further improve environmental adaptability. Deploying the framework on other robot configurations would also be needed to show that the findings generalize beyond the hexapod platform.