Overview of Safe, Efficient, and Comfortable Velocity Control with Reinforcement Learning for Autonomous Driving
This paper presents a velocity control model for autonomous car-following based on deep reinforcement learning (RL), specifically the Deep Deterministic Policy Gradient (DDPG) algorithm. The model pursues two objectives: close imitation of human driving behavior and optimization of safety, efficiency, and comfort, achieved through a reward function that encodes these elements. It is trained and evaluated on 1,341 car-following events extracted from the NGSIM dataset, providing a substantial basis for learning and validation.
Core Contributions
- Reinforcement Learning Framework: The paper employs a deep RL framework based on DDPG, which handles the continuous action space inherent in vehicle acceleration control. The method uses an actor-critic architecture of neural networks to learn a velocity-adjustment policy from real-time driving states comprising vehicle speed, relative speed, and spacing (a network sketch follows this list).
- Reward Function Design: The reward function is the pivotal component for promoting driving behavior aligned with safety, efficiency, and comfort. It incorporates Time to Collision (TTC) for safety, headway for efficiency, and jerk for comfort, thereby guiding the RL model toward the desired behaviors (a reward sketch also follows this list).
- Comparative Analysis: The car-following behaviors generated by the model were compared against human-driver data from NGSIM. Notably, the model demonstrated superior safety performance with only 8% of simulated events showing a minimum TTC below 5 seconds, compared to 35% of human-driven events. Additionally, it produced more stable and comfortable trajectories with reduced instances of high jerk values.
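To make the actor-critic setup concrete, the sketch below outlines the kind of networks a DDPG agent for this task could use, mapping the state vector (ego speed, relative speed, spacing) to a bounded acceleration command. The layer widths, acceleration bound, and class names are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions: the state is (ego speed, relative speed, spacing)
# and the action is one continuous acceleration command. Layer widths and the
# acceleration bound are illustrative assumptions, not the paper's values.
STATE_DIM, ACTION_DIM, MAX_ACCEL = 3, 1, 3.0  # m/s^2

class Actor(nn.Module):
    """Maps a driving state to a bounded acceleration (the policy)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, ACTION_DIM), nn.Tanh(),  # squash to [-1, 1]
        )

    def forward(self, state):
        return MAX_ACCEL * self.net(state)  # scale to the physical range

class Critic(nn.Module):
    """Estimates Q(state, action), used to train the actor."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))
```

The Tanh output keeps the policy's acceleration within a fixed physical bound, which is a common choice for continuous-control actors.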
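The reward design can likewise be sketched as a weighted combination of a TTC-based safety term, a headway-based efficiency term, and a jerk-based comfort term. The weights, thresholds, and functional forms below are assumptions chosen for illustration; the paper's exact formulation may differ.

```python
import numpy as np

def reward(spacing, ego_speed, rel_speed, jerk,
           w_safe=1.0, w_eff=1.0, w_comf=1.0,
           ttc_threshold=4.0, headway_band=(1.0, 2.0), max_jerk=3.0):
    """Illustrative weighted reward; all weights and thresholds are assumptions.
    rel_speed > 0 means the ego vehicle is closing in on the leader."""
    # Safety: penalize small time-to-collision while closing in on the leader.
    ttc = spacing / rel_speed if rel_speed > 1e-6 else np.inf
    r_safe = -1.0 if ttc < ttc_threshold else 0.0

    # Efficiency: reward headways inside the desired 1-2 s band,
    # penalize deviation outside it.
    headway = spacing / max(ego_speed, 1e-6)
    lo, hi = headway_band
    if lo <= headway <= hi:
        r_eff = 1.0
    else:
        r_eff = -min(abs(headway - lo), abs(headway - hi))

    # Comfort: penalize jerk (rate of change of acceleration).
    r_comf = -min(abs(jerk) / max_jerk, 1.0)

    return w_safe * r_safe + w_eff * r_eff + w_comf * r_comf
```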
Numerical Results and Insights
- Safety: The DDPG model markedly reduces risky car-following scenarios. The drop in low-TTC events (8% of simulated events versus 35% of human-driven events) highlights the potential for such models to enhance traffic safety in autonomous driving contexts.
- Efficiency: By keeping headways more consistently within the desirable 1 to 2 second range, the model balances safety with traffic throughput.
- Comfort: The model's focus on reducing jerk reflects a direct consideration of passenger comfort, an often subjective yet critical aspect of vehicle control; a sketch of how these three metrics can be computed from a trajectory follows this list.
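A minimal sketch of how the three evaluation metrics (minimum TTC, the share of time spent at a 1 to 2 second headway, and peak jerk) can be computed from a simulated trajectory is given below. The function name, array inputs, and 0.1 s timestep are assumptions for illustration, not part of the paper's published code.

```python
import numpy as np

def trajectory_metrics(spacing, ego_speed, lead_speed, accel, dt=0.1):
    """Per-event metrics from per-timestep arrays (illustrative sketch)."""
    rel_speed = ego_speed - lead_speed            # > 0 means closing in
    closing = rel_speed > 1e-6
    # TTC is only defined while closing in; elsewhere treat it as infinite.
    ttc = np.where(closing, spacing / np.where(closing, rel_speed, 1.0), np.inf)
    headway = spacing / np.maximum(ego_speed, 1e-6)
    jerk = np.diff(accel) / dt                    # rate of change of acceleration

    return {
        "min_ttc_s": float(np.min(ttc)),                                       # safety
        "share_headway_1_to_2s": float(np.mean((headway >= 1.0) & (headway <= 2.0))),  # efficiency
        "max_abs_jerk_m_s3": float(np.max(np.abs(jerk))),                      # comfort
    }
```

Aggregating these per-event metrics over all simulated events is what enables comparisons such as the share of events with minimum TTC below 5 seconds.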
Implications and Future Directions
The introduction of a model targeting safety, efficiency, and comfort in autonomous driving presents both practical and theoretical innovations. Practically, the model could be deployed in autonomous vehicle platforms to improve navigation in dynamic, real-world driving conditions. Theoretically, this work lays a foundation for incorporating broader objectives like energy efficiency and adaptive driving styles based on user preferences.
There are several avenues for extending this work. Future research could explore more sophisticated reward functions, potentially with nonlinear formulations that capture complex dependencies between driving metrics. Integrating training mechanisms such as prioritized experience replay could also improve learning efficiency by emphasizing high-value experiences during simulation.
In conclusion, integrating reinforcement learning into velocity control for car-following marks a significant step toward fully autonomous, human-like driving. The presented model's performance supports the claim that RL can improve safety, efficiency, and comfort in autonomous driving systems, encouraging broader acceptance and deployment.