Human-Like Autonomous Car-Following Model with Deep Reinforcement Learning (1901.00569v1)

Published 3 Jan 2019 in cs.LG, cs.AI, and stat.ML

Abstract: This study proposes a framework for human-like autonomous car-following planning based on deep reinforcement learning (deep RL). Historical driving data are fed into a simulation environment where an RL agent learns from trial-and-error interactions based on a reward function that signals how much the agent deviates from the empirical data. Through these interactions, an optimal policy, or car-following model, that maps in a human-like way from speed, relative speed between a lead and following vehicle, and inter-vehicle spacing to acceleration of a following vehicle is finally obtained. The model can be continuously updated when more data are fed in. Two thousand car-following periods extracted from the 2015 Shanghai Naturalistic Driving Study were used to train the model and compare its performance with that of traditional and recent data-driven car-following models. As shown by the study's results, a deep deterministic policy gradient car-following model that uses the disparity between simulated and observed speed as the reward function and considers a reaction delay of 1 s, denoted as DDPGvRT, can reproduce human-like car-following behavior with higher accuracy than traditional and recent data-driven car-following models. Specifically, the DDPGvRT model has a spacing validation error of 18% and a speed validation error of 5%, which are less than those of other models, including the intelligent driver model, models based on locally weighted regression, and conventional neural network-based models. Moreover, the DDPGvRT demonstrates good capability of generalization to various driving situations and can adapt to different drivers by continuously learning. This study demonstrates that reinforcement learning methodology can offer insight into driver behavior and can contribute to the development of human-like autonomous driving algorithms and traffic-flow models.

Deep Reinforcement Learning for Human-Like Autonomous Car-Following

The paper presents a novel car-following model for autonomous vehicles, employing a deep reinforcement learning (RL) framework, specifically the Deep Deterministic Policy Gradient (DDPG) algorithm. The model aims to emulate human-like driving behaviors during car-following scenarios, thus enabling better interaction between autonomous and human-driven vehicles. This aligns with the transitional phases of road ecosystems where autonomous vehicles will increasingly coexist with human drivers.

Core Methodology

The proposed model leverages historical driving data fed into a simulation environment, allowing a reinforcement learning agent to iteratively learn and adjust its policy based on a reward function that penalizes deviations from observed human driving behavior. The DDPG algorithm, an off-policy actor-critic algorithm tailored for continuous action spaces, serves as the backbone for learning these car-following strategies. The model takes the following vehicle's speed, the relative speed to the lead vehicle, and the inter-vehicle spacing as inputs and outputs the follower's acceleration, with the actor and critic each represented by a neural network within the DDPG framework.
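To make this structure concrete, the following is a minimal sketch, not the authors' implementation, of an actor and critic for this state-to-acceleration mapping together with a speed-deviation reward in the spirit of DDPGvRT; the layer sizes, activations, acceleration bound, and exact reward form are illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of a DDPG actor-critic for car-following.
# State: [follower speed v, relative speed dv, spacing s]; action: acceleration a.
import torch
import torch.nn as nn

class Actor(nn.Module):
    """Maps state (v, dv, spacing) to a bounded acceleration command."""
    def __init__(self, state_dim=3, action_dim=1, max_accel=3.0):  # bound is assumed
        super().__init__()
        self.max_accel = max_accel
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, action_dim), nn.Tanh(),  # output in [-1, 1]
        )

    def forward(self, state):
        return self.max_accel * self.net(state)    # scale to physical acceleration

class Critic(nn.Module):
    """Estimates Q(state, action) for the deterministic policy."""
    def __init__(self, state_dim=3, action_dim=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

def speed_reward(v_sim, v_obs, eps=1e-6):
    """Reward penalizing the gap between simulated and observed speed
    (exact functional form in the paper may differ)."""
    return -torch.abs(v_sim - v_obs) / (torch.abs(v_obs) + eps)
```

In practice, target networks, exploration noise, and a replay buffer complete the standard DDPG training machinery around these two networks.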

Experimental Data and Training

This research utilizes data from the 2015 Shanghai Naturalistic Driving Study, which provides a rich corpus of car-following events. A training dataset of 2,000 car-following periods was used to train the model and to benchmark it against competing models. During training, the DDPG agent repeatedly replayed these periods in simulation, refining its policy to align with the observed human driving patterns.
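The sketch below illustrates, under assumed details (a 0.1 s time step, a simple kinematic update, and the 1 s reaction delay expressed as a 10-step shift of the state), how one recorded period might drive the simulation loop during training; it is not the authors' code, and the variable names are hypothetical.

```python
# Illustrative rollout of a learned policy through one empirical car-following period.
import numpy as np

def replay_period(lead_speed, obs_follow_speed, obs_spacing, policy,
                  dt=0.1, reaction_delay_steps=10):
    """lead_speed, obs_follow_speed, obs_spacing: empirical time series (m/s, m).
    policy: callable mapping state [v, dv, spacing] -> acceleration (m/s^2).
    reaction_delay_steps: 1 s delay at dt = 0.1 s, per the DDPGvRT variant."""
    v = obs_follow_speed[0]          # start from the observed initial state
    s = obs_spacing[0]
    rewards = []
    for t in range(1, len(lead_speed)):
        # The agent reacts to conditions one reaction delay earlier.
        t_obs = max(0, t - reaction_delay_steps)
        state = np.array([v, lead_speed[t_obs] - v, s])
        a = float(policy(state))     # scalar acceleration command
        # Simple kinematic update of follower speed and spacing.
        v = max(0.0, v + a * dt)
        s = s + (lead_speed[t] - v) * dt
        # Reward: negative relative deviation from the observed speed.
        rewards.append(-abs(v - obs_follow_speed[t]) / (abs(obs_follow_speed[t]) + 1e-6))
    return np.array(rewards)
```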

Results and Comparisons

The DDPG model that uses the disparity between simulated and observed speed as its reward and incorporates a 1 s reaction delay (DDPGvRT) demonstrated superior performance over traditional car-following models, such as the Intelligent Driver Model (IDM), and other data-driven approaches, including recurrent neural networks (RNN) and locally weighted regression (Loess). The DDPGvRT model achieved a spacing validation error of 18% and a speed validation error of 5%, outperforming the other models tested. The paper also highlights DDPGvRT's accuracy in reproducing observed trajectories and its generalization across driving styles, as evidenced by lower inter-driver and intra-driver validation errors.
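As a point of reference, the snippet below shows one common way such percentage errors are computed in car-following calibration, the root-mean-square percentage error over a trajectory; the paper's exact metric definition is not reproduced here, and the numbers are illustrative.

```python
# A plausible (assumed) formulation of the spacing/speed validation error.
import numpy as np

def rms_percentage_error(simulated, observed):
    """RMSPE between simulated and observed trajectories (e.g., spacing or speed)."""
    simulated = np.asarray(simulated, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return np.sqrt(np.mean(((simulated - observed) / observed) ** 2))

# Example with made-up numbers for one validation period.
obs_spacing = np.array([20.0, 21.5, 23.0, 22.0])
sim_spacing = np.array([19.0, 22.0, 25.5, 21.0])
print(f"spacing error: {rms_percentage_error(sim_spacing, obs_spacing):.1%}")
```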

Implications and Future Work

The strong numerical results indicate that deep RL methodologies can contribute substantially to the development of human-like autonomous driving systems. The model's ability to adapt to different driving styles underscores the potential for versatile autonomous systems that coexist safely with diverse human drivers. However, the incorporation of safety constraints and the handling of potential human errors in the training data warrant further development.

Future studies could explore prioritized experience replay to emphasize critical learning phases and compare DDPG models based on structured inputs against end-to-end models that learn directly from raw sensor data. Such advancements could further illuminate the connection between human perception and autonomous control.

In conclusion, this paper illustrates a promising application of deep reinforcement learning for autonomous car-following, offering substantial evidence of its effectiveness in mimicking human-like driving behavior. This research paves the way for more nuanced and flexible traffic systems that can dynamically integrate autonomous vehicles into existing road networks.

Authors (3)
  1. Meixin Zhu (39 papers)
  2. Xuesong Wang (43 papers)
  3. Yinhai Wang (45 papers)
Citations (383)