
Deep Reinforcement Q-Learning for Intelligent Traffic Signal Control with Partial Detection

Published 29 Sep 2021 in cs.LG, eess.SY, and math.OC | (2109.14337v1)

Abstract: Intelligent traffic signal controllers, which apply DQN algorithms to traffic-light policy optimization, efficiently reduce traffic congestion by adjusting signals to real-time traffic. Most propositions in the literature, however, assume that all vehicles at the intersection are detected, an unrealistic scenario. Recently, new wireless communication technologies have enabled cost-efficient detection of connected vehicles by roadside infrastructure. With only a small fraction of the total fleet currently equipped, methods able to perform under low detection rates are desirable. In this paper, we propose a deep reinforcement Q-learning model to optimize traffic signal control at an isolated intersection in a partially observable environment with connected vehicles. First, we present the novel DQN model within the RL framework, introducing a new state representation for partially observable environments and a new reward function for traffic signal control, along with a network architecture and tuned hyper-parameters. Second, we evaluate the model's performance in numerical simulations on multiple scenarios, in two steps: first under full detection against existing actuated controllers, then under partial detection with loss estimates for varying proportions of connected vehicles. Finally, from the obtained results, we define thresholds for detection rates with acceptable and optimal performance levels.

Citations (11)

Summary

  • The paper introduces a novel deep Q-learning framework that integrates partial vehicle detection with DTSE and a TSD reward function to optimize signal phases and cut delays.
  • Simulation results across various intersection setups demonstrate significant efficiency gains over traditional methods like Max Pressure and SOTL.
  • The study underscores the promise of adaptive traffic management with connected vehicles, achieving near-optimal performance even at moderate CV penetration rates.


Introduction

Traffic congestion remains a formidable challenge, with deep implications for economic activity and urban quality of life through increased travel times, fuel consumption, and environmental degradation. Traditional fixed-time traffic signals often fail to adapt to dynamically changing traffic patterns, motivating adaptive traffic signal control (ATSC) systems. Deep Q-Learning (DQN) provides a versatile framework for developing adaptive strategies, albeit usually under the idealistic assumption of complete vehicle detection. This paper addresses the realistic condition of partial detection, exploiting emerging communication technologies to improve traffic flow management at a single intersection.

Deep Q-Learning Model and Architecture

The proposed model leverages a DQN algorithm tailored for traffic signal control in partially observable environments populated with connected vehicles (CVs). Within this framework, the authors introduce a partial-detection variant of the Discrete Traffic State Encoding (DTSE) and a novel reward function, Total Squared Delay (TSD), combined with a specialized CNN architecture for processing the state representation.
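The paper's network is a CNN with tuned hyper-parameters that the summary does not reproduce. As a hedged illustration of the interface only, the sketch below uses a small fully connected NumPy network standing in for that CNN: it maps a flattened state vector to one Q-value per signal-phase action. All dimensions and the initialization scheme are assumptions, not the authors' values.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_qnet(in_dim, hidden, n_actions):
    """Small fully connected Q-network (an illustrative stand-in for the CNN)."""
    return {
        "W1": rng.normal(0.0, np.sqrt(2.0 / in_dim), (in_dim, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, np.sqrt(2.0 / hidden), (hidden, n_actions)),
        "b2": np.zeros(n_actions),
    }

def q_values(params, state):
    """Map a flattened encoded state to one Q-value per phase action."""
    h = np.maximum(0.0, state @ params["W1"] + params["b1"])  # ReLU hidden layer
    return h @ params["W2"] + params["b2"]
```

Choosing the action is then a matter of taking the argmax over the returned Q-values for the current encoded state.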

Agent Actions and State Representation

The DQN agent makes decisions over the set of possible traffic signal phases, determining both phase selection and green interval durations. The model pivots on the partial DTSE, a matrix encoding the presence, speed, and lane signal state of CVs over discretized road segments. This microscopic representation supports informed phase-switching decisions even when complete visibility is not guaranteed.

Figure 1: Partial DTSE visualization illustrates layered information integration: CV positions, speeds, and lane traffic signals.

This state representation equips the DQN with the context needed to respond swiftly to fluctuating traffic demand, yielding phase durations that reduce congestion.
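The encoding described above can be sketched as follows. This is a minimal interpretation, not the paper's exact implementation: the cell length, speed normalization constant, and channel layout are assumptions, and the key property shown is that only connected vehicles ever appear in the grid.

```python
import numpy as np

def partial_dtse(vehicles, n_lanes, n_cells, cell_len, phase_onehot):
    """Encode only connected vehicles into a (channel, lane, cell) grid.

    vehicles: iterable of (lane, position_m, speed_mps, is_connected).
    Channel 0 = CV presence, channel 1 = normalized CV speed; the current
    signal phase is returned alongside as a one-hot vector.
    """
    state = np.zeros((2, n_lanes, n_cells))
    for lane, pos, speed, is_cv in vehicles:
        if not is_cv:
            continue  # undetected vehicles are invisible to the agent
        cell = min(int(pos // cell_len), n_cells - 1)
        state[0, lane, cell] = 1.0
        state[1, lane, cell] = speed / 13.9  # assumed normalization (~50 km/h)
    return state, np.asarray(phase_onehot, dtype=float)
```

At low penetration rates most cells stay empty, which is exactly the partial-observability regime the paper trains for.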

Reward Function

The reward mechanism centers on minimizing vehicle delay through TSD, based on the sum of squared individual vehicle delays. Squaring the delay term penalizes long individual waits disproportionately, favoring an equitable distribution of green time across approaches over brute-force throughput maximization and discouraging the starvation of low-traffic lanes.
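A minimal sketch of the TSD idea, with the sign convention assumed here (larger delays give a more negative reward):

```python
def tsd_reward(delays):
    """Total Squared Delay reward: the negative sum of squared vehicle delays.

    Squaring means one vehicle waiting 60 s (-3600) costs far more than six
    vehicles waiting 10 s each (-600), so the agent is pushed to serve
    long-waiting vehicles on minor approaches instead of starving them.
    """
    return -sum(d * d for d in delays)
```

Under partial detection, only the delays of connected vehicles would enter this sum, which is why the reward degrades gracefully as penetration falls.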

Methodology

The model's efficacy was tested in SUMO through simulations across diverse intersection configurations and traffic signal setups. Scenarios ranged from straightforward 2-phase programs to complex 4-phase structures, reflecting varying traffic inflow densities and CV penetration rates. The performance benchmark involved comparing DQN implementations against traditional algorithms like Max Pressure and SOTL across multiple episodes.
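The per-step control loop driving such an experiment combines exploration with one-step temporal-difference targets. The sketch below shows those two standard DQN ingredients in isolation; the SUMO interaction itself (typically via the TraCI API) and the paper's actual hyper-parameters are omitted, and `gamma`/`eps` values here are purely illustrative.

```python
import random

def epsilon_greedy(q, eps):
    """Pick a random phase action with probability eps, else the greedy one."""
    if random.random() < eps:
        return random.randrange(len(q))
    return max(range(len(q)), key=lambda a: q[a])

def td_target(reward, next_q, gamma=0.99, done=False):
    """One-step DQN training target: y = r if terminal, else r + gamma * max_a Q(s', a)."""
    return reward if done else reward + gamma * max(next_q)
```

Each simulation step would encode the state, select a phase with `epsilon_greedy`, advance SUMO, compute the TSD reward, and regress the network's Q-value toward `td_target`.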

Results

The findings show the DQN model outperforming the baselines, particularly in 4-phase setups, where it demonstrates robustness across diverse traffic conditions. The comparative analysis highlights notable efficiency improvements over actuated controllers, with lower delay variance across scenarios.

Figure 2: Probability distributions of EMTDs for DQN and actuated controllers reveal DQN's concentrated delay values.


Figure 3: EMTD statistics underscore DQN's consistent performance edge over Max Pressure and SOTL.

In partial detection tests, the DQN retained near-optimal performance at CV penetration rates upwards of 20%, with diminishing differences observed as penetration approached 40%.

Conclusion

This research establishes the DQN model with partial DTSE as a viable solution for ATSC systems in environments with incomplete vehicle detection. It balances fairness and efficiency while maintaining strong performance even at moderate CV penetration levels, promising substantial reductions in lost travel time.

The work sets a precedent for future advancements, including refined state representations for broader applicability, collaborative multi-agent architectures, and real-world scenario integrations. The insights gathered promise a shift from reactive traffic management to proactive, intelligence-driven systems, catalyzing smarter urban mobility.
