- The paper introduces a deep Q-learning framework that combines partial vehicle detection with a discrete traffic state encoding (DTSE) and a total squared delay (TSD) reward function to optimize signal phases and reduce delays.
- Simulation results across various intersection setups demonstrate significant efficiency gains over traditional methods such as Max Pressure and Self-Organizing Traffic Lights (SOTL).
- The study underscores the promise of adaptive traffic management with connected vehicles (CVs), achieving near-optimal performance even at moderate CV penetration rates.
Deep Reinforcement Q-Learning for Intelligent Traffic Signal Control with Partial Detection
Introduction
Traffic congestion remains a formidable challenge, with far-reaching consequences for economic activity and urban quality of life through increased travel times, fuel consumption, and environmental degradation. Traditional fixed-time traffic signals often fail to adapt to dynamically changing traffic patterns, motivating adaptive traffic signal control (ATSC) systems. Deep Q-learning with deep Q-networks (DQN) provides a versatile framework for developing such adaptive strategies, but existing work typically assumes complete vehicle detection. This paper addresses the more realistic condition of partial detection, exploiting emerging connected-vehicle communication technologies to improve traffic flow management at a single intersection.
Deep Q-Learning Model and Architecture
The proposed model builds on a DQN algorithm tailored for traffic signal control in partially observable environments populated with connected vehicles (CVs). Within this framework, the authors introduce a partial-detection variant of the Discrete Traffic State Encoding (DTSE) and a novel reward function, the Total Squared Delay (TSD), combined with a convolutional neural network (CNN) architecture for processing the state representation.
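For readers unfamiliar with DQN, the sketch below shows the standard temporal-difference update against a frozen target network that such a framework rests on. The network, replay batch format, optimizer, and discount factor are generic placeholders rather than the paper's exact configuration.

```python
import torch
import torch.nn.functional as F

def dqn_update(q_net, target_net, optimizer, batch, gamma=0.99):
    """One standard DQN gradient step on a replay batch.

    batch: tensors (states, actions, rewards, next_states, dones), where
    dones is 1.0 for terminal transitions. This only illustrates the
    Bellman-target update the framework builds on; hyperparameters and
    network details in the paper may differ.
    """
    states, actions, rewards, next_states, dones = batch

    # Q(s, a) for the actions actually taken.
    q_values = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)

    # Bootstrapped target from the frozen target network.
    with torch.no_grad():
        next_q = target_net(next_states).max(dim=1).values
        targets = rewards + gamma * (1.0 - dones) * next_q

    loss = F.mse_loss(q_values, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```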
Agent Actions and State Representation
The DQN agent chooses among the intersection's admissible traffic signal phases, deciding both which phase to activate and how long to hold its green interval. The state is the partial DTSE, a matrix encoding the presence and speed of detected CVs over discretized road segments, together with the current signal state of each lane. This microscopic encoding supports informed phase-switching decisions even when only a fraction of the vehicles is visible to the controller.
Figure 1: Visualization of the partial DTSE, showing its layered channels: CV positions, CV speeds, and per-lane traffic signal states.
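To make the encoding concrete, the following sketch builds a partial-DTSE-style tensor from a list of vehicles, keeping only those flagged as connected. The cell length, grid dimensions, and channel layout are illustrative assumptions, not the paper's published parameters.

```python
import numpy as np

def encode_partial_dtse(vehicles, signal_state, n_lanes=4, n_cells=30,
                        cell_length=7.0, max_speed=13.9):
    """Build a (channels, lanes, cells) state tensor from detected CVs only.

    vehicles: iterable of dicts with keys 'lane', 'dist_to_stopline',
              'speed', and 'is_cv' (True if the vehicle is connected).
    signal_state: length-n_lanes array, 1.0 where the lane currently has green.
    Channel 0 = CV presence, channel 1 = normalized CV speed,
    channel 2 = signal state broadcast over the lane. Layout is illustrative.
    """
    state = np.zeros((3, n_lanes, n_cells), dtype=np.float32)
    for veh in vehicles:
        if not veh['is_cv']:
            continue  # undetected vehicles are invisible to the agent
        cell = int(veh['dist_to_stopline'] // cell_length)
        if cell >= n_cells:
            continue  # beyond the encoding horizon
        lane = veh['lane']
        state[0, lane, cell] = 1.0
        state[1, lane, cell] = veh['speed'] / max_speed
    # Broadcast each lane's current signal state across its cells.
    state[2] = np.asarray(signal_state, dtype=np.float32)[:, None]
    return state
```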
This state representation gives the DQN enough information to respond quickly to fluctuating traffic demand and select appropriate phase durations, thereby reducing congestion.
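A convolutional Q-network of the kind described might look like the sketch below, mapping the partial DTSE tensor to one Q-value per signal phase. The layer sizes and kernel shapes are illustrative guesses, since the exact architecture is not reproduced here.

```python
import torch
import torch.nn as nn

class DTSEQNetwork(nn.Module):
    """CNN mapping a (3, n_lanes, n_cells) partial-DTSE tensor to one
    Q-value per available signal phase. Sizes are illustrative only."""

    def __init__(self, n_lanes=4, n_cells=30, n_phases=4):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=(1, 4), stride=(1, 2)),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=(1, 2), stride=(1, 1)),
            nn.ReLU(),
            nn.Flatten(),
        )
        # Infer the flattened size with a dummy forward pass.
        with torch.no_grad():
            flat = self.conv(torch.zeros(1, 3, n_lanes, n_cells)).shape[1]
        self.head = nn.Sequential(
            nn.Linear(flat, 128),
            nn.ReLU(),
            nn.Linear(128, n_phases),  # one Q-value per signal phase
        )

    def forward(self, x):
        return self.head(self.conv(x))
```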
Reward Function
The reward mechanism centers on minimizing vehicle delay through the Total Squared Delay (TSD). Because each vehicle's delay is squared, long individual waits are penalized far more heavily than many short ones, so the agent favors an equitable allocation of green time across approaches rather than raw throughput maximization.
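A minimal sketch of a total-squared-delay reward consistent with this description follows: the reward is the negative sum of squared per-vehicle delays over detected vehicles, so one long wait costs more than several short waits of equal total duration. Any scaling or normalization used in the paper is omitted here.

```python
def tsd_reward(delays):
    """Total Squared Delay reward over the detected vehicles.

    delays: iterable of accumulated delays (seconds) for currently
    detected CVs. Squaring penalizes a single long wait more heavily
    than several short waits, pushing the agent toward fair phase
    allocation rather than raw throughput.
    """
    return -sum(d * d for d in delays)

# Example: equal total delay, but one long wait scores worse.
print(tsd_reward([10.0, 10.0]))  # -200.0
print(tsd_reward([20.0]))        # -400.0
```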
Methodology
The model's efficacy was evaluated in the SUMO microscopic traffic simulator across diverse intersection configurations and signal programs. Scenarios ranged from simple 2-phase programs to more complex 4-phase structures, with varying traffic inflow densities and CV penetration rates. The DQN implementations were benchmarked against traditional algorithms such as Max Pressure and SOTL over multiple episodes.
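A simplified version of such an evaluation loop, using SUMO's TraCI Python API, is sketched below. The configuration file name, the per-vehicle CV tagging, the traffic light ID, and the 1-second step assumption are illustrative; the actual experiments sweep multiple demand levels and controllers.

```python
import random
import traci  # SUMO's Python API

def run_episode(sumo_cfg="intersection.sumocfg", cv_rate=0.2, max_steps=3600):
    """Run one SUMO episode, tagging each departing vehicle as a CV
    with probability cv_rate so the agent only observes that subset."""
    traci.start(["sumo", "-c", sumo_cfg])
    detected = set()
    total_delay = 0.0
    for step in range(max_steps):
        traci.simulationStep()
        # Newly departed vehicles become connected with probability cv_rate.
        for veh_id in traci.simulation.getDepartedIDList():
            if random.random() < cv_rate:
                detected.add(veh_id)
        # Add one vehicle-second of delay for each detected vehicle that is
        # currently (nearly) stopped, assuming a 1 s simulation step.
        for veh_id in traci.vehicle.getIDList():
            if veh_id in detected and traci.vehicle.getSpeed(veh_id) < 0.1:
                total_delay += 1.0
        # ... here the DQN agent would read the partial DTSE and set the phase,
        # e.g. traci.trafficlight.setPhase("tls0", chosen_phase)
    traci.close()
    return total_delay
```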
Results
The findings show that the DQN model outperforms the baselines, particularly in 4-phase setups, where it remains robust across diverse traffic conditions. Compared with actuated controllers, it achieves notably lower delays and less delay variance across scenarios.
Figure 2: Probability distributions of EMTDs for DQN and actuated controllers reveal DQN's concentrated delay values.
Figure 3: EMTD statistics underscore DQN's consistent performance edge over Max Pressure and SOTL.
In partial detection tests, the DQN retained near-optimal performance at CV penetration rates of roughly 20% and above, with the remaining performance gap diminishing as penetration approached 40%.
Conclusion
This research establishes the DQN model with partial DTSE as a viable approach for ATSC in environments with incomplete vehicle detection. It balances fairness and efficiency and maintains strong performance even at moderate CV penetration levels, promising substantial reductions in lost travel time.
The work also points to future directions, including refined state representations for broader applicability, collaborative multi-agent architectures, and integration with real-world deployments. These insights support a shift from reactive traffic management to proactive, learning-driven control and smarter urban mobility.