Safe Reinforcement Learning for Autonomous Vehicles through Parallel Constrained Policy Optimization (2003.01303v1)

Published 3 Mar 2020 in cs.LG, cs.AI, and stat.ML

Abstract: Reinforcement learning (RL) is attracting increasing interest in autonomous driving due to its potential to solve complex classification and control problems. However, existing RL algorithms are rarely applied to real vehicles for two predominant reasons: their behaviours are unexplainable, and they cannot guarantee safety under new scenarios. This paper presents a safe RL algorithm, called Parallel Constrained Policy Optimization (PCPO), for two autonomous driving tasks. PCPO extends today's common actor-critic architecture to a three-component learning framework, in which three neural networks are used to approximate the policy function, value function, and a newly added risk function, respectively. Meanwhile, a trust region constraint is added to allow large update steps without breaking the monotonic improvement condition. To ensure the feasibility of safety-constrained problems, synchronized parallel learners are employed to explore different state spaces, which accelerates learning and policy updates. Simulations of two scenarios for autonomous vehicles confirm that safety can be ensured while achieving fast learning.
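
To make the three-component framework concrete, here is a minimal sketch in PyTorch. It is not the authors' implementation: the network sizes, the standard-deviation-1 Gaussian policy, and the penalty-based handling of the risk and trust-region constraints (in place of PCPO's exact constrained trust-region step) are all illustrative assumptions, as are every hyperparameter below.

```python
# Sketch of the policy / value / risk three-network setup described in the
# abstract, with a simplified penalized surrogate update. Assumption-laden:
# PCPO itself solves a constrained trust-region problem rather than adding
# penalty terms, and the value/risk networks would be fitted to returns and
# accumulated costs respectively (those regression steps are omitted here).
import torch
import torch.nn as nn

def mlp(in_dim, out_dim, hidden=64):
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.Tanh(),
                         nn.Linear(hidden, out_dim))

obs_dim, act_dim = 8, 2
policy = mlp(obs_dim, act_dim)   # approximates the policy function
value = mlp(obs_dim, 1)          # approximates the value function
risk = mlp(obs_dim, 1)           # the newly added risk (safety) function
opt = torch.optim.Adam(policy.parameters(), lr=3e-4)

def policy_update(obs, act, adv, cost_adv, old_logp,
                  risk_limit=0.1, kl_radius=0.01,
                  risk_coef=10.0, kl_coef=10.0):
    """One simplified policy step: maximize the return surrogate while
    penalizing risk above a limit and KL divergence above a trust-region
    radius (a stand-in for the paper's constrained optimization)."""
    mean = policy(obs)
    dist = torch.distributions.Normal(mean, 1.0)
    logp = dist.log_prob(act).sum(-1)
    ratio = torch.exp(logp - old_logp)

    surrogate = (ratio * adv).mean()       # expected-return objective
    risk_est = (ratio * cost_adv).mean()   # surrogate for expected risk
    kl = (old_logp - logp).mean()          # crude KL estimate

    loss = (-surrogate
            + risk_coef * torch.relu(risk_est - risk_limit)
            + kl_coef * torch.relu(kl - kl_radius))
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Dummy batch just to show the shapes; real data would come from rollouts
# gathered by the synchronized parallel learners the abstract mentions.
obs = torch.randn(32, obs_dim)
act = torch.randn(32, act_dim)
adv, cost_adv = torch.randn(32), torch.randn(32)
with torch.no_grad():
    old_logp = torch.distributions.Normal(policy(obs), 1.0) \
                    .log_prob(act).sum(-1)
policy_update(obs, act, adv, cost_adv, old_logp)
```

In this penalized form the constraints only bind softly; the paper's trust-region formulation instead enforces the monotonic improvement condition exactly, which is why large update steps remain safe there.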

Citations (60)