Formally Verifying Deep Reinforcement Learning Controllers with Lyapunov Barrier Certificates (2405.14058v2)
Abstract: Deep reinforcement learning (DRL) is a powerful machine learning paradigm for generating agents that control autonomous systems. However, the "black box" nature of DRL agents limits their deployment in real-world safety-critical applications. A promising approach for providing strong guarantees on an agent's behavior is to use Neural Lyapunov Barrier (NLB) certificates, which are learned functions over the system whose properties indirectly imply that an agent behaves as desired. However, NLB-based certificates are typically difficult to learn and even more difficult to verify, especially for complex systems. In this work, we present a novel method for training and verifying NLB-based certificates for discrete-time systems. Specifically, we introduce a technique for certificate composition, which simplifies the verification of highly complex systems by strategically designing a sequence of certificates. When jointly verified with neural network verification engines, these certificates provide a formal guarantee that a DRL agent both achieves its goals and avoids unsafe behavior. Furthermore, we introduce a technique for certificate filtering, which significantly simplifies the process of producing formally verified certificates. We demonstrate the merits of our approach with a case study on providing safety and liveness guarantees for a DRL-controlled spacecraft.
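To make the idea of a certificate "whose properties indirectly imply that an agent behaves as desired" concrete, the following is a minimal sketch of one common discrete-time reach-while-avoid formulation; it is not necessarily the exact set of conditions used in this work. It assumes a candidate certificate V must stay at or below a threshold beta on initial states, reach at least gamma on unsafe states, and decrease by at least eps along each step taken from a non-goal state whose value is below gamma. The one-dimensional system, the sets, the thresholds, and all function names are hypothetical, chosen only for illustration. Sampling like this can only search for counterexamples; a formal guarantee requires checking the same conditions over the whole state space with a verification engine.

```python
# Illustrative sketch only: a toy 1-D closed-loop system with a hand-written
# certificate, not a learned network and not the paper's spacecraft model.

def step(x):
    # Hypothetical closed-loop dynamics: the state contracts toward the origin.
    return 0.5 * x

def certificate(x):
    # Hypothetical candidate certificate V(x) = |x|.
    return abs(x)

def find_violations(states, step_fn, beta=1.0, gamma=2.0, eps=0.01):
    """Return sampled states that violate any of the assumed conditions."""
    violations = []
    for x in states:
        v = certificate(x)
        v_next = certificate(step_fn(x))
        in_init = abs(x) <= 1.0    # assumed initial set
        in_unsafe = abs(x) >= 2.0  # assumed unsafe set
        in_goal = abs(x) <= 0.1    # assumed goal set
        if in_init and v > beta:
            violations.append(x)   # initial states must have V <= beta
        elif in_unsafe and v < gamma:
            violations.append(x)   # unsafe states must have V >= gamma
        elif not in_goal and v < gamma and v_next - v > -eps:
            violations.append(x)   # V must drop by at least eps per step
    return violations

# Evenly spaced samples over the assumed state space [-3, 3].
states = [i / 100.0 for i in range(-300, 301)]
print(len(find_violations(states, step)))
```

Passing an expanding map such as `lambda x: 1.1 * x` as `step_fn` makes the decrease condition fail on sampled states. In a train-and-verify loop, states like these are the kind of counterexamples that would be fed back to improve the certificate.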