
Deep Reinforcement Learning for Traffic Light Control in Vehicular Networks (1803.11115v1)

Published 29 Mar 2018 in cs.LG, cs.AI, and stat.ML

Abstract: Existing inefficient traffic light control causes numerous problems, such as long delay and waste of energy. To improve efficiency, taking real-time traffic information as an input and dynamically adjusting the traffic light duration accordingly is a must. In terms of how to dynamically adjust traffic signals' duration, existing works either split the traffic signal into equal duration or extract limited traffic information from the real data. In this paper, we study how to decide the traffic signals' duration based on the collected data from different sensors and vehicular networks. We propose a deep reinforcement learning model to control the traffic light. In the model, we quantify the complex traffic scenario as states by collecting data and dividing the whole intersection into small grids. The timing changes of a traffic light are the actions, which are modeled as a high-dimension Markov decision process. The reward is the cumulative waiting time difference between two cycles. To solve the model, a convolutional neural network is employed to map the states to rewards. The proposed model is composed of several components to improve the performance, such as dueling network, target network, double Q-learning network, and prioritized experience replay. We evaluate our model via simulation in the Simulation of Urban MObility (SUMO) in a vehicular network, and the simulation results show the efficiency of our model in controlling traffic lights.

Citations (367)

Summary

  • The paper’s main contribution is a novel DRL model that optimizes traffic light timings using real-time sensor data to significantly reduce vehicle waiting times.
  • The authors employ a high-dimensional MDP framework with a CNN-based Q-value approximation, integrating techniques such as double Q-learning and prioritized experience replay.
  • Empirical results from SUMO simulations show reductions in average vehicle waiting time of 25.7% under normal traffic and 26.7% during rush hour.

Deep Reinforcement Learning for Traffic Light Control in Vehicular Networks

The paper "Deep Reinforcement Learning for Traffic Light Control in Vehicular Networks" offers a compelling exploration of how intelligent traffic systems can be enhanced through the use of deep reinforcement learning (DRL). The authors tackle the classical problem of traffic signal timing, which, if optimized, can significantly alleviate urban congestion, reduce energy consumption, and minimize delays.

Approach and Methodology

The authors propose a novel DRL model designed to dynamically control traffic light duration by leveraging real-time data collected from sensor-equipped vehicles that form a vehicular network. The state representation in their model is constructed using a grid that segments an intersection into small cells, capturing the positions and velocities of the vehicles. Actions are defined as changes in the traffic light timings, framed as a high-dimensional Markov decision process (MDP).
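The grid-based state encoding can be sketched as follows. This is a minimal illustration, not the paper's exact implementation: the grid dimensions, cell length, and the assumed speed limit are placeholders chosen for the example.

```python
import numpy as np

def encode_state(vehicles, grid_size=60, cell_len=5.0, max_speed=13.9):
    """Encode an intersection snapshot as two grid matrices: one for
    vehicle presence, one for normalized speed.

    `vehicles` is a list of (x, y, speed) tuples in meters and m/s;
    the grid covers a square of grid_size * cell_len meters. The
    default max_speed (~50 km/h) is an assumed urban speed limit.
    """
    position = np.zeros((grid_size, grid_size))
    velocity = np.zeros((grid_size, grid_size))
    for x, y, speed in vehicles:
        i, j = int(y // cell_len), int(x // cell_len)
        if 0 <= i < grid_size and 0 <= j < grid_size:
            position[i, j] = 1.0
            velocity[i, j] = min(speed / max_speed, 1.0)
    # Stack into a 2-channel "image" suitable as CNN input
    return np.stack([position, velocity])

state = encode_state([(12.0, 7.5, 8.0), (3.0, 3.0, 0.0)])
print(state.shape)  # (2, 60, 60)
```

In a live setting the vehicle tuples would come from the vehicular network (e.g. SUMO's TraCI interface in the paper's simulations) rather than a hand-written list.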

A convolutional neural network (CNN) is utilized to approximate the state-action values, or Q-values, mapping observed states to the expected cumulative reward of each action, thereby guiding the traffic control policy toward actions that minimize cumulative waiting times at intersections. The model leverages several state-of-the-art reinforcement learning techniques, including a dueling network architecture, a target network, double Q-learning, and prioritized experience replay, to improve learning efficiency and policy performance.
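Two of those components can be sketched with plain arrays. The snippet below shows the dueling aggregation (Q composed from a state value and per-action advantages) and the double Q-learning target (online network selects the action, target network evaluates it); in the paper both heads would be produced by the CNN, which is abstracted away here.

```python
import numpy as np

def dueling_q(value, advantages):
    """Dueling aggregation: Q(s,a) = V(s) + A(s,a) - mean_a A(s,a).
    `value` is a scalar state value; `advantages` is a vector over
    actions. Subtracting the mean advantage keeps V and A identifiable."""
    advantages = np.asarray(advantages, dtype=float)
    return value + advantages - advantages.mean()

def double_q_target(reward, next_q_online, next_q_target, gamma=0.99):
    """Double Q-learning target: the online network picks the greedy
    next action, the target network evaluates it, reducing the
    overestimation bias of vanilla Q-learning."""
    a_star = int(np.argmax(next_q_online))
    return reward + gamma * next_q_target[a_star]

q = dueling_q(1.5, [0.2, -0.1, 0.5])
print(q)  # [1.5 1.2 1.8], since mean(A) = 0.2
y = double_q_target(-3.0, next_q_online=[0.4, 0.9], next_q_target=[0.1, 0.3])
print(y)  # ≈ -2.703
```

Prioritized experience replay then biases minibatch sampling toward transitions with large temporal-difference error, so targets like `y` above are revisited more often where the network is most wrong.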

Results and Implications

The evaluation of the proposed system is conducted using a traffic micro-simulator, SUMO (Simulation of Urban MObility), where the DRL model is tested under varying traffic conditions, including uniform and rush hour scenarios. The empirical results illustrate a marked improvement in both cumulative reward and average waiting time, compared to traditional pre-programmed traffic light systems. The model not only learns to efficiently manage standard traffic scenarios but also adapts to congestion patterns typical of peak hours, achieving approximately 25.7% reduction in average vehicle waiting times under normal traffic flow conditions and 26.7% reduction during rush hours.
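The reward signal driving these improvements is, per the paper, the difference in cumulative vehicle waiting time between two consecutive cycles. A toy sketch (the per-vehicle waiting times here are invented for illustration):

```python
def cycle_reward(prev_waiting, curr_waiting):
    """Reward = cumulative waiting time of the previous cycle minus
    that of the current cycle; positive when the new timing reduced
    total waiting at the intersection."""
    return sum(prev_waiting) - sum(curr_waiting)

# Three vehicles waited 16s in total last cycle, 10s this cycle
print(cycle_reward([4.0, 10.0, 2.0], [3.0, 6.0, 1.0]))  # 6.0
```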

Theoretical and Practical Contributions

From a theoretical perspective, the paper makes significant contributions to the application of DRL in dynamic and complex environments such as traffic management systems. By modeling the problem in a high-dimensional MDP framework and addressing it with robust and adaptive learning techniques, the authors highlight the efficacy of DRL in real-world applications where optimal actions are contingent on a persistent influx of sensory data.

Practically, the implementation of such a model has the potential to transform urban intersection management by replacing human intervention during peak times with autonomous operations. This could lead to smarter cities where traffic lights autonomously adapt to live traffic conditions, significantly mitigating congestion and its adverse effects.

Future Directions

The research posits several future exploration avenues. Firstly, the integration of additional real-world factors such as pedestrian movements or weather conditions, which can further complicate traffic control, offers exciting possibilities for extending the model's capabilities. Secondly, expanding the model to networked intersections could provide insights into scaled implementations across urban centers. Lastly, advances in vehicular communication strategies may allow for more sophisticated data-driven adaptive systems, enhancing the DRL model's performance further.

In conclusion, leveraging DRL for traffic light control presents an intriguing and viable solution to urban transport challenges. With continued development, such systems could redefine how intersections are managed, improving the efficiency and sustainability of urban transport networks.