Distributionally Robust Policy and Lyapunov-Certificate Learning (2404.03017v2)
Abstract: This article presents novel methods for synthesizing distributionally robust stabilizing neural controllers and certificates for control systems under model uncertainty. A key challenge in designing controllers with stability guarantees for uncertain systems is the accurate determination of and adaptation to shifts in model parametric uncertainty during online deployment. We tackle this with a novel distributionally robust formulation of the Lyapunov derivative chance constraint ensuring a monotonic decrease of the Lyapunov certificate. To avoid the computational complexity involved in dealing with the space of probability measures, we identify a sufficient condition in the form of deterministic convex constraints that ensures the Lyapunov derivative constraint is satisfied. We integrate this condition into a loss function for training a neural network-based controller and show that, for the resulting closed-loop system, the global asymptotic stability of its equilibrium can be certified with high confidence, even with Out-of-Distribution (OoD) model uncertainties. To demonstrate the efficacy and efficiency of the proposed methodology, we compare it with an uncertainty-agnostic baseline approach and several reinforcement learning approaches in two control problems in simulation.
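The idea of replacing a distributionally robust chance constraint with a deterministic convex condition can be illustrated with a small sketch. It is not the paper's exact formulation: it assumes a Wasserstein-1 ambiguity ball of radius `radius` around N sampled uncertainties, a Lyapunov derivative that is Lipschitz in the uncertainty with constant `lipschitz`, and the standard conditional value-at-risk (CVaR) bound on a chance constraint. All function and parameter names are hypothetical.

```python
# Illustrative sketch only: names, the Wasserstein-1 ball, and the Lipschitz
# constant are assumptions, not taken from the paper.

def empirical_cvar(values, eps):
    """CVaR at level 1 - eps of equally weighted samples.

    Uses the Rockafellar-Uryasev form inf_t { t + E[(v - t)_+] / eps }.
    The objective is piecewise linear and convex in t, so the minimum is
    attained at one of the sample values.
    """
    n = len(values)
    return min(
        t + sum(max(v - t, 0.0) for v in values) / (eps * n)
        for t in values
    )


def dr_surrogate(vdot_samples, eps, radius, lipschitz):
    """Upper bound on the worst-case CVaR over the assumed Wasserstein-1 ball.

    vdot_samples: Lyapunov derivative evaluated at one state, once per
                  sampled uncertainty.
    If this value is <= 0, then P(vdot <= 0) >= 1 - eps holds for every
    distribution in the ball (a sufficient, not necessary, condition).
    """
    return lipschitz * radius / eps + empirical_cvar(vdot_samples, eps)


def violation_loss(vdot_samples, eps, radius, lipschitz):
    """Hinge penalty that is zero exactly when the surrogate is satisfied."""
    return max(dr_surrogate(vdot_samples, eps, radius, lipschitz), 0.0)


if __name__ == "__main__":
    samples = [-4.0, -3.0, -2.0, -1.0]
    print(violation_loss(samples, eps=0.5, radius=0.1, lipschitz=2.0))
    print(violation_loss(samples, eps=0.5, radius=0.5, lipschitz=2.0))
```

A larger radius makes the condition more conservative: the same samples satisfy the surrogate for a small ball and violate it for a large one. In a training loop, a penalty of this kind would be averaged over sampled states and minimized jointly over the controller and certificate parameters.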