Multi-Agent Deep Reinforcement Learning for Large-scale Traffic Signal Control
The paper presents an approach to adaptive traffic signal control (ATSC) based on multi-agent deep reinforcement learning (RL), built around the advantage actor-critic (A2C) model. It addresses the scalability limits of centralized RL, whose joint action space grows combinatorially with network size, by giving each intersection its own decentralized agent, with the aim of efficiently managing large-scale urban traffic networks.
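For orientation, the sketch below shows a minimal per-agent A2C update in PyTorch: the policy gradient is weighted by the advantage, a value head is regressed toward the return, and an entropy bonus encourages exploration. The network shape, tensor names, and loss coefficients are illustrative assumptions, not the paper's exact architecture (which uses richer, ATSC-specific state encodings and recurrent layers).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ActorCritic(nn.Module):
    """Toy per-agent actor-critic; layer sizes are illustrative."""
    def __init__(self, obs_dim: int, n_phases: int, hidden: int = 64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.pi = nn.Linear(hidden, n_phases)  # logits over signal phases
        self.v = nn.Linear(hidden, 1)          # state-value estimate

    def forward(self, obs):
        h = self.body(obs)
        return self.pi(h), self.v(h).squeeze(-1)

def a2c_loss(model, obs, actions, returns,
             value_coef=0.5, entropy_coef=0.01):
    """Standard A2C objective for one batch of transitions."""
    logits, values = model(obs)
    dist = torch.distributions.Categorical(logits=logits)
    advantage = returns - values.detach()      # A(s,a) = R - V(s)
    policy_loss = -(dist.log_prob(actions) * advantage).mean()
    value_loss = F.mse_loss(values, returns)
    entropy = dist.entropy().mean()
    return policy_loss + value_coef * value_loss - entropy_coef * entropy
```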
Central Contribution
The core contribution is a scalable, decentralized multi-agent reinforcement learning (MARL) algorithm for ATSC, in which each intersection is controlled by a local agent. Two enhancements stabilize learning under partial observability and the non-stationarity introduced by concurrently learning neighbors:
- Improved Observability: each agent's state is augmented with its neighbors' observations and policy fingerprints (summaries of their latest policies), giving it a view of regional traffic conditions and of how nearby agents are currently acting.
- Spatial Discount Factor: a spatial discount factor scales down the influence of distant agents' signals, so each local agent focuses on its proximate environment (a minimal sketch of both mechanisms follows this list).
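A compact sketch of how these two mechanisms can be wired together is shown below. The neighbor sets, the distance measure, and the discount value alpha are illustrative assumptions; the paper derives its exact state and reward definitions from the underlying traffic model.

```python
import numpy as np

def augmented_state(own_obs, neighbor_obs, neighbor_fingerprints):
    """Improved observability: concatenate the agent's own observation
    with its neighbors' observations and policy fingerprints (e.g.,
    each neighbor's latest action-probability vector)."""
    return np.concatenate([own_obs, *neighbor_obs, *neighbor_fingerprints])

def spatially_discounted_reward(rewards_by_distance, alpha=0.75):
    """Spatial discount factor: weight each agent's reward by
    alpha**distance, so distant intersections contribute less.
    alpha in [0, 1] is a tuning choice; 0.75 is only an example."""
    return sum(alpha ** d * r for d, r in rewards_by_distance)

# Usage: the agent's own reward (distance 0) plus two neighbors at
# graph distances 1 and 2 in the traffic network.
r = spatially_discounted_reward([(0, -3.0), (1, -1.5), (2, -0.5)])
```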
These innovations are particularly significant as they facilitate the deployment of RL in dynamic, real-world environments where full observability and communication are impractical.
Experimental Evaluation
The proposed multi-agent A2C (MA2C) is evaluated against independent A2C (IA2C) and independent Q-learning (IQL) in both a synthetic traffic grid and a real-world network. Notable findings include:
- Synthetic Traffic Grid: MA2C achieves lower queue lengths and intersection delays than the baselines and sustains these gains as congestion grows, indicating better adaptability (see the metric sketch after this list).
- Monaco City Network: in this more complex, real-world scenario, MA2C maintains lower intersection delays and keeps traffic flowing across varying demand levels, outperforming the independent RL baselines and a simple greedy policy.
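For concreteness, the two headline metrics can be computed from per-step simulator measurements roughly as follows. This is a hedged sketch: the record field names are hypothetical placeholders, not the paper's or any simulator's API.

```python
def episode_metrics(steps):
    """Average queue length and intersection delay over one episode.
    `steps` is a list of per-step records; the field names below are
    illustrative placeholders for simulator measurements."""
    avg_queue = sum(s["queued_vehicles"] for s in steps) / len(steps)
    avg_delay = sum(s["intersection_delay_s"] for s in steps) / len(steps)
    return avg_queue, avg_delay
```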
Numerical Results
The numerical results highlight MA2C's robustness and strong performance:
- MA2C achieved lower average queue lengths and intersection delays than the IA2C and IQL baselines, reflecting better traffic management.
- The method's sample efficiency and scalability were evident in both setups, with MA2C maintaining stable traffic flow even under heavy demand.
Implications and Future Work
The practical implications of this research are substantial: it offers a feasible path to real-time traffic management in urban environments. Theoretically, the model advances the use of MARL in partially observable settings, extending reinforcement learning to complex networked systems.
Future work could focus on improving robustness to real-world sensor noise and on exploring the effect of different communication protocols among agents. Integrating real-time adaptation to unexpected shifts in traffic patterns could further extend the approach's utility.
In conclusion, the paper makes a significant contribution to intelligent transportation systems, using modern RL techniques to make traffic signal control in large-scale urban networks more efficient and effective. It serves as a foundational step toward more adaptive, intelligent infrastructure management in smart cities.