Deep reinforcement learning-based longitudinal control strategy for automated vehicles at signalised intersections (2505.08896v1)

Published 13 May 2025 in cs.AI and cs.RO

Abstract: Developing an autonomous vehicle control strategy for signalised intersections (SI) is a challenging task due to its inherently complex decision-making process. This study proposes a Deep Reinforcement Learning (DRL) based longitudinal vehicle control strategy at SI. A comprehensive reward function has been formulated with a particular focus on (i) distance headway-based efficiency reward, (ii) decision-making criteria during amber light, and (iii) asymmetric acceleration/deceleration response, along with the traditional safety and comfort criteria. This reward function has been incorporated with two popular DRL algorithms, Deep Deterministic Policy Gradient (DDPG) and Soft Actor-Critic (SAC), which can handle the continuous action space of acceleration/deceleration. The proposed models have been trained on the combination of real-world leader vehicle (LV) trajectories and simulated trajectories generated using the Ornstein-Uhlenbeck (OU) process. The overall performance of the proposed models has been tested using Cumulative Distribution Function (CDF) plots and compared with the real-world trajectory data. The results show that the RL models successfully maintain lower distance headway (i.e., higher efficiency) and jerk compared to human-driven vehicles without compromising safety. Further, to assess the robustness of the proposed models, we evaluated the model performance on diverse safety-critical scenarios, in terms of car-following and traffic signal compliance. Both DDPG and SAC models successfully handled the critical scenarios, while the DDPG model showed smoother action profiles compared to the SAC model. Overall, the results confirm that a DRL-based longitudinal vehicle control strategy at SI can help improve traffic safety, efficiency, and comfort.

Summary

Deep Reinforcement Learning for Automating Vehicle Control at Signalized Intersections

Developing a robust control strategy for automated vehicles (AVs) at signalized intersections (SI) is difficult because the vehicle must react to the vehicle ahead and to the traffic signal at the same time. The present paper by Kumar et al. introduces a longitudinal vehicle control strategy based on two Deep Reinforcement Learning (DRL) algorithms, Deep Deterministic Policy Gradient (DDPG) and Soft Actor-Critic (SAC). Its core is a multi-component reward function that balances efficiency, safety, and comfort, and that extends previous work with amber light decision-making and an asymmetric acceleration/deceleration response.

Core Contributions

The paper contributes to the field by:

Designing a distance headway-based efficiency reward that accommodates the stop-and-go nature of driving at SIs, addressing limitations of previous time-headway approaches.
Proposing a complete decision-making mechanism for the amber light phase at traffic signals, improving on existing DRL models, which predominantly focus on red and green signal compliance.
Incorporating into the reward structure the asymmetric acceleration/deceleration response documented in traditional car-following (CF) models, which brings AV control closer to human driving characteristics.
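
The three ingredients listed above can be illustrated with a minimal Python sketch. The summary does not give the paper's actual formulas, so every functional form, parameter value, and function name below is an assumption chosen for illustration: a Gaussian-shaped headway term, a simple kinematic stop/go check for amber, and a quadratic action penalty whose weight depends on the sign of the action.

```python
import math


def headway_efficiency_reward(gap_m, desired_gap_m, tolerance_m=5.0):
    """Hypothetical efficiency term: 1.0 when the distance headway equals
    the desired value, decaying smoothly as the gap deviates from it."""
    return math.exp(-((gap_m - desired_gap_m) ** 2) / (2.0 * tolerance_m ** 2))


def should_stop_on_amber(speed_mps, dist_to_stop_line_m, amber_remaining_s,
                         comfortable_decel_mps2=3.0):
    """Hypothetical amber rule: stop only if the vehicle can brake
    comfortably before the stop line AND cannot reach the line at its
    current speed before the amber phase ends."""
    stopping_distance = speed_mps ** 2 / (2.0 * comfortable_decel_mps2)
    can_stop = stopping_distance <= dist_to_stop_line_m
    can_clear = speed_mps * amber_remaining_s >= dist_to_stop_line_m
    return can_stop and not can_clear


def asymmetric_action_penalty(accel_mps2, accel_weight=1.0, decel_weight=0.5):
    """Hypothetical asymmetric term: quadratic penalty with a different
    weight for acceleration (positive) and deceleration (negative).
    Which side is penalised more, and by how much, is an assumption."""
    weight = accel_weight if accel_mps2 >= 0.0 else decel_weight
    return -weight * accel_mps2 ** 2


# Example: 10 m/s, 50 m from the stop line, 3 s of amber remaining.
print(should_stop_on_amber(10.0, 50.0, 3.0))
```

In an actual reward function these terms would be weighted and summed with the safety and comfort terms; the weights are a design choice that the summary does not report.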

Methodological Implementation

The framework uses DRL to select actions in the continuous space of vehicle acceleration and deceleration. Training combines real-world leader vehicle trajectories from the pNEUMA dataset with simulated trajectories generated by the Ornstein-Uhlenbeck (OU) process. The reward function has several components: safety through a Time-To-Collision (TTC) metric, traffic signal compliance, efficiency through a desired distance headway, driving comfort through jerk minimization, and asymmetric action penalties that reproduce realistic driving. Because the reward encodes these objectives directly, the DRL models are not limited to imitating human drivers and can instead optimize AV behavior across a range of intersection scenarios.
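
To make two of these building blocks concrete, the sketch below generates a synthetic leader speed profile with an Euler-Maruyama discretisation of the OU process and computes TTC from its standard definition (gap divided by closing speed). The parameter values (`theta`, `sigma`, `mean_speed`, the 0.1 s time step) and the clipping of speed at zero are assumptions for illustration, not values taken from the paper.

```python
import math
import random


def simulate_ou_speed(n_steps, dt=0.1, mean_speed=8.0, theta=0.3,
                      sigma=1.5, v0=8.0, seed=0):
    """Return n_steps (>= 1) leader speeds in m/s from a discretised OU
    process: dv = theta * (mean_speed - v) * dt + sigma * sqrt(dt) * N(0, 1).
    Speeds are clipped at zero so the leader never reverses (assumption)."""
    rng = random.Random(seed)
    speeds = [v0]
    for _ in range(n_steps - 1):
        prev = speeds[-1]
        drift = theta * (mean_speed - prev) * dt
        diffusion = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        speeds.append(max(0.0, prev + drift + diffusion))
    return speeds


def time_to_collision(gap_m, follower_speed_mps, leader_speed_mps):
    """TTC in seconds; infinite when the follower is not closing the gap."""
    closing_speed = follower_speed_mps - leader_speed_mps
    if closing_speed <= 0.0:
        return float("inf")
    return gap_m / closing_speed


leader = simulate_ou_speed(n_steps=300)
print(len(leader), min(leader) >= 0.0)
print(time_to_collision(gap_m=20.0, follower_speed_mps=10.0, leader_speed_mps=5.0))
```

The mean-reverting drift keeps the synthetic leader fluctuating around `mean_speed`, which is why the OU process is a common choice for producing varied but plausible speed profiles.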

Results and Interpretations

The evaluation compares the models against real-world trajectory data using Cumulative Distribution Function (CDF) plots. The DRL models maintain lower distance headways than human-driven vehicles, which indicates higher efficiency, and they also produce lower jerk, which indicates better comfort, without compromising safety. Both DDPG and SAC handle the diverse safety-critical scenarios for car-following and traffic signal compliance, which supports their robustness. Between the two, the DDPG model yields smoother action profiles than the SAC model.
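
A CDF comparison of this kind can be sketched in a few lines of Python. The helper `ecdf` and both sample lists are hypothetical: the numbers are made up purely to show the mechanics and are not results from the paper.

```python
from bisect import bisect_right


def ecdf(samples):
    """Return a function x -> fraction of samples that are <= x."""
    ordered = sorted(samples)
    n = len(ordered)

    def cdf(x):
        return bisect_right(ordered, x) / n

    return cdf


# Made-up absolute jerk values in m/s^3, for illustration only.
human_jerk = [0.8, 1.2, 0.5, 2.0, 1.5]
agent_jerk = [0.3, 0.6, 0.4, 0.9, 0.7]

human_cdf = ecdf(human_jerk)
agent_cdf = ecdf(agent_jerk)

# Share of samples with jerk at or below 1.0 m/s^3 in each group.
print(human_cdf(1.0), agent_cdf(1.0))
```

When one group's CDF lies above the other's at every threshold, that group has systematically smaller values, which is how a plot of this type would show lower jerk or lower headway.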

Practical and Theoretical Implications

From a practical standpoint, DRL-based longitudinal control offers a way to improve traffic flow efficiency and safety at signalized intersections, potentially reducing congestion and collisions. Theoretically, the paper shows that a DRL agent with a suitably designed reward can capture complex driving behaviors, which provides groundwork for future extensions to multi-agent environments, more nuanced traffic interactions, and integrated mobility systems.

Future Directions

Looking forward, the paper opens pathways for incorporating additional metrics such as fuel efficiency to build a more complete eco-driving strategy. Larger datasets and larger-scale testing across different geographical locations and traffic laws could further refine and validate the models. Coupling the DRL strategies with simulations that support real-time adjustment could also improve how AVs handle complex urban traffic networks.

In closing, this research clarifies what an AV control strategy at SIs must handle and provides an adaptable framework that balances the competing objectives of safety, efficiency, and comfort within a single reward function. Its results suggest that DRL can deliver meaningful improvements in automated driving at intersections.