Verified Safe Reinforcement Learning for Neural Network Dynamic Models (2405.15994v2)
Abstract: Learning reliably safe autonomous control is one of the core problems in trustworthy autonomy. However, training a controller that can be formally verified to be safe remains a major challenge. We introduce a novel approach for learning verified safe control policies in nonlinear neural dynamical systems while maximizing overall performance. Our approach targets safety in the sense of finite-horizon reachability proofs and comprises three key parts. The first is a novel curriculum learning scheme that iteratively increases the verified safe horizon. The second exploits the iterative nature of gradient-based learning to enable incremental verification, reusing information from prior verification runs. The third learns multiple verified initial-state-dependent controllers, which is especially valuable in complex domains where learning a single universal verified safe controller is extremely challenging. Our experiments on five safe control problems demonstrate that our trained controllers achieve verified safety over horizons up to an order of magnitude longer than state-of-the-art baselines, while maintaining high reward and a perfect safety record over entire episodes. Our code is available at https://github.com/jlwu002/VSRL.
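To make the first component of the scheme concrete, below is a minimal Python sketch of the horizon curriculum described in the abstract. The helpers `train_epoch` and `verify_reachability` are hypothetical stand-ins for a gradient update and a finite-horizon reachability check; they are not the authors' actual API (see the linked repository for the real implementation).

```python
# Minimal sketch of the horizon-curriculum idea from the abstract.
# `train_epoch` and `verify_reachability` are hypothetical callables
# (a gradient update and a finite-horizon reachability check), not the
# authors' actual API.

def curriculum_verified_training(policy, train_epoch, verify_reachability,
                                 target_horizon, max_epochs=1000):
    """Iteratively extend the horizon over which `policy` is verified safe.

    Train at the current horizon until the reachability proof succeeds,
    then lengthen the horizon and repeat, so each verified horizon serves
    as the starting point (curriculum stage) for the next.
    """
    horizon = 1
    for _ in range(max_epochs):
        # One optimization step on task reward plus a safety surrogate
        # that penalizes reachable-set over-approximations entering the
        # unsafe region at the current horizon.
        policy = train_epoch(policy, horizon)

        # Finite-horizon proof: propagate the initial-state set through
        # the closed loop (policy + learned dynamics) for `horizon` steps
        # and check that the reachable set avoids the unsafe region.
        if verify_reachability(policy, horizon):
            if horizon >= target_horizon:
                return policy  # verified safe over the full target horizon
            horizon += 1  # curriculum step: extend the proof horizon

    raise RuntimeError("could not verify safety up to the target horizon")
```

The design point the curriculum illustrates: proving safety directly over a long horizon is usually intractable, whereas each one-step extension starts from a policy already verified at the previous horizon, so the verifier's job stays incremental.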