Control Lyapunov Function (CLF) Overview
Updated 22 July 2025
- A Control Lyapunov Function (CLF) is an energy-like function of the system state; stability is guaranteed by choosing control inputs that make this function decrease over time.
- Experimental results on nonlinear and satellite systems demonstrate that a tailored LQR-designed CLF lowers cost metrics, improves convergence, and focuses exploration on safer trajectories.
- Integrating adaptive constraint strength and vibration-dampening terms in the CLF framework enhances robustness by reducing control input oscillations and maintaining safety under disturbances.
… ) + 1/(β+1) u_RL(t)
- The safety constraint guarantees that the time derivative of the CLF remains negative, thereby enforcing stability.
Experimental Results
- The framework was tested on two systems: a classical 2D nonlinear system (NCT system) and a satellite attitude control problem.
- In the NCT system experiments, the authors compared their customized CLF (designed via LQR) against a unit-matrix CLF. Both yielded safe behavior, but the tailored CLF focused exploration on safer trajectories and produced lower cost metrics over repeated simulations, indicating faster convergence and reduced variability.
- In the satellite attitude control experiments, the customized CLF had a particularly pronounced effect. Here, the system's complex dynamics benefit considerably from a CLF that properly weights the state errors. Cost functions based on the LQR-designed CLF achieved lower values and improved stability compared to those using a simple unit matrix. This indicates fewer deviations from the desired attitude and better compliance with the safety constraints, even under disturbances.
- Moreover, experiments validating adaptive constraint strength showed that, under disturbances, dynamically adjusting η(t) gave a tighter safety envelope (lower cost variability and more consistent performance) than a constant constraint strength.
- Similarly, the inclusion of the vibration-dampening term was found to significantly reduce oscillations in the control input. Both in the NCT system and in satellite control, smoother command profiles led to improved robustness and reduced actuator wear without sacrificing safety or performance.
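The difference between an LQR-designed P and a unit matrix can be illustrated on a toy problem. The sketch below is not taken from the experiments above: the double-integrator plant, the weights Q = I and R = 1, and the helper names `lqr_clf_matrix` and `vdot_terms` are all assumptions made for illustration. It computes P from the continuous-time Riccati equation, then evaluates the rate of change of V(e) = eᵀPe at a state where the control input has no influence on that rate.

```python
import numpy as np

def lqr_clf_matrix(A, B, Q, R):
    """Return P solving A'P + PA - P B R^-1 B' P + Q = 0 (continuous-time Riccati).

    Uses the stable invariant subspace of the Hamiltonian matrix,
    so only NumPy is required.
    """
    n = A.shape[0]
    H = np.block([[A, -B @ np.linalg.inv(R) @ B.T],
                  [-Q, -A.T]])
    eigvals, eigvecs = np.linalg.eig(H)
    stable = eigvecs[:, eigvals.real < 0]      # the n eigenvectors with Re(lambda) < 0
    X1, X2 = stable[:n, :], stable[n:, :]
    P = np.real(X2 @ np.linalg.inv(X1))
    return 0.5 * (P + P.T)                     # symmetrize against round-off

def vdot_terms(e, A, B, P):
    """Split V_dot = drift + gain @ u for V(e) = e' P e and e_dot = A e + B u."""
    drift = e @ (A.T @ P + P @ A) @ e
    gain = 2.0 * e @ P @ B
    return drift, gain

# Hypothetical stand-in plant: double integrator (position error, velocity error).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
P_lqr = lqr_clf_matrix(A, B, Q=np.eye(2), R=np.eye(1))

# Unit-matrix candidate: at e = (1, 0) the input gain is zero and so is the
# drift, so no input can make V strictly decrease at that state.
drift_unit, gain_unit = vdot_terms(np.array([1.0, 0.0]), A, B, np.eye(2))

# LQR candidate: choose the direction where its own input gain vanishes;
# the drift alone is already negative there.
e_blind = np.array([P_lqr[1, 1], -P_lqr[0, 1]])
drift_lqr, gain_lqr = vdot_terms(e_blind, A, B, P_lqr)

print("unit matrix:", drift_unit, gain_unit)
print("LQR design: ", drift_lqr, gain_lqr)
```

On this toy plant the unit matrix does not give a strict decrease condition everywhere, while the LQR-derived P does. This is only one possible reading of why a system-specific P helps; it does not reproduce the reported cost comparisons.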
In summary, SAC-CLF enhances standard Soft Actor-Critic (SAC) by embedding CLF-based safety constraints within a quadratic program (QP) formulation. This integration:
- Uses a system-specific quadratic CLF, V(e) = eᵀPe, designed from linearization and LQR.
- Dynamically adjusts the constraint strength (via η(t), modulated by δ(t)) to compensate for model uncertainties.
- Smooths the control inputs by penalizing abrupt changes.
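The three ingredients listed above can be combined in a small safety filter. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes linear error dynamics ė = Ae + Bu, a single CLF constraint V̇ ≤ −ηV with η held constant (the adaptive rule for η(t) is not reproduced here), and an objective of the assumed form ‖u − u_RL‖² + β‖u − u_prev‖². With one linear constraint, the QP reduces to a projection onto a half-space, so no QP solver is needed. The name `clf_qp_filter` and all numeric values are hypothetical.

```python
import numpy as np

def clf_qp_filter(e, u_rl, u_prev, A, B, P, eta, beta):
    """Minimize |u - u_rl|^2 + beta*|u - u_prev|^2 subject to V_dot <= -eta*V."""
    # Completing the square turns the two-term objective into a distance
    # to a single blended reference input.
    u_ref = (beta * u_prev + u_rl) / (beta + 1.0)
    V = e @ P @ e
    a = e @ (A.T @ P + P @ A) @ e + eta * V    # input-independent part of the constraint
    b = 2.0 * e @ P @ B                        # sensitivity of V_dot to u
    violation = a + b @ u_ref
    if violation <= 0.0 or not np.any(b):
        # Constraint already satisfied, or u cannot influence V_dot at this
        # state (in which case the QP is infeasible if violation > 0).
        return u_ref
    return u_ref - violation * b / (b @ b)     # project onto the half-space

# Hypothetical example: double integrator with the LQR matrix for Q = I, R = 1.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
P = np.array([[np.sqrt(3.0), 1.0], [1.0, np.sqrt(3.0)]])
e = np.array([1.0, 0.0])
eta = 0.2

# An RL action that pushes the error up is corrected by the filter.
u_safe = clf_qp_filter(e, np.array([5.0]), np.array([0.0]), A, B, P, eta, beta=0.0)
# An already-safe action is only blended with the previous input.
u_pass = clf_qp_filter(e, np.array([-3.0]), np.array([1.0]), A, B, P, eta, beta=1.0)

v_dot = e @ (A.T @ P + P @ A) @ e + 2.0 * e @ P @ B @ u_safe
print(u_safe, u_pass, v_dot)
```

A larger β pulls the reference input toward the previous command, which trades responsiveness for smoother actuation; the constraint itself is unaffected by β.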
Experimental results on both a nonlinear 2D system and satellite attitude control demonstrate that SAC-CLF consistently achieves lower cost metrics (that is, better performance) along with enhanced stability and safety, outperforming traditional methods that lack these integrated safety guarantees. The framework thus provides an effective and principled way to incorporate safety into reinforcement learning for control tasks with challenging dynamics.