- The paper introduces a Zero Dynamics Policy that reduces the control problem’s dimensionality while ensuring robust agility in underactuated systems.
- It demonstrates rapid, stable locomotion through over 3000 hops on the ARCHER platform across varied and uneven terrains.
- Rigorous proofs built on the optimality of the discrete-time zero dynamics certify exponential stability, while the reduced-dimensional formulation significantly lowers computational overhead.
Robust Agility via Learned Zero Dynamics Policies
The paper "Robust Agility via Learned Zero Dynamics Policies" presents a novel approach to control the dynamics of hybrid underactuated systems, specifically targeting rapid and stable locomotion. The authors introduce an innovative framework that leverages the intrinsic zero dynamics of underactuated systems to create controllers that maintain stability and robustness, particularly in complex scenarios such as 3D hopping on uneven terrains.
Underactuated systems, like legged robots, are challenging to control due to their complex dynamics and constraints. Traditional strategies such as Model Predictive Control (MPC), although effective, require extensive online computation, and hardware limits on solve rates and horizon length constrain their predictive accuracy. The Zero Dynamics Policy (ZDP) proposed in this work significantly reduces the dimensionality of the control problem without sacrificing stability, enabling efficient real-time performance.
The methodology divides the control design into two main tasks. First, a mapping from the unactuated (zero dynamics) coordinates to desired actuated coordinates is learned via optimal control, so that the surface it defines is rendered invariant by the closed loop. Then, a tracking controller stabilizes the actuated coordinates to the mapping's output. Because planning happens only over the reduced-dimensional unactuated states, computational complexity drops considerably, increasing the framework's practicality; a sketch of this two-stage structure follows.
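To make the two-stage structure concrete, here is a minimal Python sketch under stated assumptions: `psi` stands in for the learned zero dynamics mapping (a trained network in practice), the state dimensions and PD gains are illustrative, and this is not the paper's implementation.

```python
import numpy as np

# Hypothetical learned mapping psi: unactuated (zero dynamics) state ->
# desired actuated coordinates. In the paper this surface is learned so
# that it is invariant under the optimal closed-loop dynamics; here a
# placeholder linear map stands in for a trained model.
def psi(z):
    W = np.zeros((2, 4))  # illustrative weights; a trained network in practice
    return W @ z

def zdp_controller(q_act, dq_act, z, kp=100.0, kd=10.0):
    """Stabilize the actuated coordinates to the mapping output psi(z).

    q_act, dq_act: actuated positions and velocities, shape (2,)
    z:             unactuated (zero dynamics) state, shape (4,)
    Returns PD feedback driving the tracking error e = q_act - psi(z)
    to zero (the derivative of psi along z is omitted for brevity).
    """
    e = q_act - psi(z)
    return -kp * e - kd * dq_act
```

The appeal of this split is that the fast feedback loop (`zdp_controller`) is cheap to evaluate at the control rate, while any optimization or learning happens only over the reduced unactuated coordinates.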
The paper's key contribution is demonstrating the stability and efficiency of ZDP through extensive experiments on the ARCHER platform, a 3D hopping robot. Over more than 3000 hops, the framework maintained stable hopping and rejected disturbances across varied terrains. This empirical evidence supports the theoretical claim that controlling the zero dynamics yields effective stabilization and agile locomotion in underactuated systems.
Importantly, the paper provides rigorous mathematical proofs of stability. The authors exploit the optimality of the discrete-time zero dynamics to certify exponential stability of the system. This theoretical foundation positions ZDP as a viable option for future robots requiring agile and adaptive locomotion, and it highlights a merging of control theory and machine learning: learning the zero dynamics mapping through optimal control opens avenues for adaptable yet certifiable control systems.
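For reference, a generic statement of the kind of discrete-time exponential stability certified here, written in standard notation rather than the paper's own:

```latex
% Exponential stability of the discrete-time zero dynamics
% z_{k+1} = f(z_k): there exist M >= 1 and lambda in (0, 1) such that
\[
  \lVert z_k \rVert \;\le\; M \,\lambda^{k}\, \lVert z_0 \rVert
  \qquad \forall\, k \ge 0 .
\]
% Combined with exponential tracking of the actuated coordinates to
% psi(z), composite arguments yield stability of the full hybrid system.
```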
These findings point toward designing agile robots for applications that demand robust performance in uneven and unpredictable environments, such as search-and-rescue operations or planetary exploration. Furthermore, the demonstrated reduction in computational overhead suggests these techniques are feasible in settings with strict real-time execution constraints.
In future efforts, this methodology might be extended to more diverse robotic platforms or integrated with reinforcement learning; learning-based refinements of ZDP could further enhance adaptability in dynamic environments. This cross-disciplinary blend of optimal control and machine learning shows that exploring and stabilizing the zero dynamics can foster advances in underactuated robotic systems, striking a balance between theoretical rigor and practical application.