
Self-Improving Safety Performance of Reinforcement Learning Based Driving with Black-Box Verification Algorithms (2210.16575v3)

Published 29 Oct 2022 in cs.AI, cs.LG, and cs.RO

Abstract: In this work, we propose a self-improving artificial intelligence system to enhance the safety performance of reinforcement learning (RL)-based autonomous driving (AD) agents using black-box verification methods. RL algorithms have become popular in AD applications in recent years. However, the performance of existing RL algorithms heavily depends on the diversity of training scenarios. A lack of safety-critical scenarios during the training phase could result in poor generalization performance in real-world driving applications. We propose a novel framework in which the weaknesses of the training set are explored through black-box verification methods. After discovering AD failure scenarios, the RL agent's training is re-initiated via transfer learning to improve its performance on previously unsafe scenarios. Simulation results demonstrate that our approach efficiently discovers safety failures of action decisions in RL-based adaptive cruise control (ACC) applications and significantly reduces the number of vehicle collisions through iterative application of our method. The source code is publicly available at https://github.com/data-and-decision-lab/self-improving-RL.
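The abstract describes an iterative loop: verify the current policy with a black-box scenario search, collect the failure scenarios it finds, then warm-start retraining from the current policy so those scenarios are handled. The sketch below illustrates that loop's control flow only; the scenario model (a single required-deceleration parameter), the safety check, and the `retrain` update are hypothetical stand-ins, not the paper's actual verification algorithms or RL training.

```python
import random

def is_safe(policy, scenario):
    # Hypothetical safety check: the policy handles a scenario if its
    # braking gain meets the scenario's required deceleration.
    return policy["braking_gain"] >= scenario["required_decel"]

def black_box_search(policy, n_samples=200, seed=0):
    # Stand-in for black-box verification: randomly sample candidate
    # scenarios and return those in which the current policy is unsafe.
    rng = random.Random(seed)
    failures = []
    for _ in range(n_samples):
        scenario = {"required_decel": rng.uniform(0.0, 1.0)}
        if not is_safe(policy, scenario):
            failures.append(scenario)
    return failures

def retrain(policy, failures):
    # Stand-in for transfer learning: start from the current policy and
    # adapt it toward the hardest discovered failure scenario.
    worst = max(f["required_decel"] for f in failures)
    return {**policy, "braking_gain": max(policy["braking_gain"], worst)}

def self_improve(policy, iterations=5):
    # The self-improvement loop: verify, collect failures, retrain, repeat
    # until the search stops finding unsafe scenarios.
    for _ in range(iterations):
        failures = black_box_search(policy)
        if not failures:
            break
        policy = retrain(policy, failures)
    return policy

if __name__ == "__main__":
    weak_policy = {"braking_gain": 0.2}
    improved = self_improve(weak_policy)
    print(len(black_box_search(improved)))  # remaining failures found
```

In the paper the search step is performed by black-box verification methods over simulated ACC scenarios and the retraining step is full RL training (PPO) warm-started from the previous agent; here both are reduced to one-line surrogates so the iteration structure is visible.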

