Self-Improving Safety Performance of Reinforcement Learning Based Driving with Black-Box Verification Algorithms (2210.16575v3)
Abstract: In this work, we propose a self-improving artificial intelligence system to enhance the safety performance of reinforcement learning (RL)-based autonomous driving (AD) agents using black-box verification methods. RL algorithms have become popular in AD applications in recent years. However, the performance of existing RL algorithms heavily depends on the diversity of training scenarios. A lack of safety-critical scenarios during the training phase can result in poor generalization in real-world driving. We propose a novel framework in which the weaknesses of the training set are explored through black-box verification methods. After discovering AD failure scenarios, the RL agent's training is re-initiated via transfer learning to improve performance in the previously unsafe scenarios. Simulation results demonstrate that our approach efficiently discovers safety failures in the action decisions of RL-based adaptive cruise control (ACC) applications and significantly reduces the number of vehicle collisions through iterative application of our method. The source code is publicly available at https://github.com/data-and-decision-lab/self-improving-RL.
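The abstract describes an iterative loop: train an RL agent, use black-box verification to sample scenarios that expose failures, then re-initiate training on those failures via transfer learning, and repeat. The following is a minimal, self-contained sketch of that loop. All names (`simulate`, `verify_black_box`, `transfer_learn`, `self_improve`) and the toy "competence" policy model are hypothetical stand-ins for illustration, not the authors' actual API; the paper also uses more sample-efficient verification methods than the plain random search shown here.

```python
import random

def simulate(policy, scenario):
    """Stand-in simulator (hypothetical): True means the policy fails
    (e.g., a collision) in this scenario. Toy model: failure occurs
    when scenario difficulty exceeds the policy's competence level."""
    return scenario > policy["competence"]

def verify_black_box(policy, n_samples=200, seed=0):
    """Black-box falsification by random sampling of scenario parameters.
    Returns the list of scenario parameters that caused a failure."""
    rng = random.Random(seed)
    return [s for s in (rng.random() for _ in range(n_samples))
            if simulate(policy, s)]

def transfer_learn(policy, failures):
    """Re-initiate training on the discovered failure scenarios.
    Toy update: raise competence to cover the hardest failure seen."""
    if failures:
        policy["competence"] = max(policy["competence"], max(failures))
    return policy

def self_improve(policy, iterations=3):
    """Iterate verify -> retrain until no more failures are found."""
    for _ in range(iterations):
        failures = verify_black_box(policy)
        if not failures:
            break
        policy = transfer_learn(policy, failures)
    return policy

policy = {"competence": 0.5}          # initial agent fails on hard scenarios
improved = self_improve(policy)
print(len(verify_black_box(improved)))  # prints 0: no failures remain
```

In the paper, the verification step and the retraining step are far richer (Bayesian optimization or adaptive sampling over scenario parameters, and PPO retraining in highway-env), but the control flow of the self-improvement loop has this shape.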