
Probabilistic Reach-Avoid for Bayesian Neural Networks (2310.01951v1)

Published 3 Oct 2023 in cs.LG and cs.AI

Abstract: Model-based reinforcement learning seeks to simultaneously learn the dynamics of an unknown stochastic environment and synthesise an optimal policy for acting in it. Ensuring the safety and robustness of sequential decisions made through a policy in such an environment is a key challenge for policies intended for safety-critical scenarios. In this work, we investigate two complementary problems: first, computing reach-avoid probabilities for iterative predictions made with dynamical models, with dynamics described by a Bayesian neural network (BNN); second, synthesising control policies that are optimal with respect to a given reach-avoid specification (reaching a "target" state, while avoiding a set of "unsafe" states) and a learned BNN model. Our solution leverages interval propagation and backward recursion techniques to compute lower bounds for the probability that a policy's sequence of actions leads to satisfying the reach-avoid specification. Such computed lower bounds provide safety certification for the given policy and BNN model. We then introduce control synthesis algorithms to derive policies maximising said lower bounds on the safety probability. We demonstrate the effectiveness of our method on a series of control benchmarks characterised by learned BNN dynamics models. On our most challenging benchmark, compared to purely data-driven policies, the optimal synthesis algorithm is able to provide more than a four-fold increase in the number of certifiable states and more than a three-fold increase in the average guaranteed reach-avoid probability.
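The backward recursion the abstract describes can be illustrated on a finite-state abstraction. The sketch below is not the paper's algorithm: it assumes the interval-propagation step has already produced per-action transition-probability lower bounds over a small discretized state space, and the function name, state layout, and numbers are hypothetical. It only shows the dynamic-programming shape of the reach-avoid bound: value 1 on target states, 0 on unsafe states, and a maximizing backward sweep elsewhere.

```python
import numpy as np

def reach_avoid_lower_bound(P_low, target, unsafe, horizon):
    """Backward recursion for a lower bound on a reach-avoid probability.

    P_low:  (A, S, S) array of per-action transition-probability lower bounds
            (standing in for bounds obtained by interval propagation through
            a learned BNN dynamics model).
    target: boolean mask of "target" states to reach.
    unsafe: boolean mask of "unsafe" states to avoid.
    Returns per-state probability lower bounds and a maximizing policy.
    """
    A, S, _ = P_low.shape
    V = target.astype(float)                 # V_H = 1 on target, 0 elsewhere
    policy = np.zeros((horizon, S), dtype=int)
    for k in range(horizon - 1, -1, -1):
        Q = P_low @ V                        # (A, S): one-step lower bounds
        policy[k] = Q.argmax(axis=0)         # action maximizing the bound
        V = Q.max(axis=0)
        V[target] = 1.0                      # target already reached
        V[unsafe] = 0.0                      # unsafe states violate the spec
    return V, policy

# Toy 3-state example: state 0 = start, state 1 = target, state 2 = unsafe.
P_low = np.array([
    [[0.0, 0.8, 0.1],    # action 0 from each state
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]],
    [[0.9, 0.0, 0.05],   # action 1 from each state
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]],
])
target = np.array([False, True, False])
unsafe = np.array([False, False, True])
V, policy = reach_avoid_lower_bound(P_low, target, unsafe, horizon=3)
print(V[0])  # certified lower bound from the start state under the best action
```

Because the rows of `P_low` are lower bounds rather than exact distributions, the recursion under-approximates the true reach-avoid probability, which is what makes the resulting value a certificate.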

