Field Deployment of Multi-Agent Reinforcement Learning Based Variable Speed Limit Controllers (2407.08021v1)

Published 10 Jul 2024 in cs.MA

Abstract: This article presents the first field deployment of a multi-agent reinforcement-learning (MARL) based variable speed limit (VSL) control system on the I-24 freeway near Nashville, Tennessee. We describe how we train MARL agents in a traffic simulator and directly deploy the simulation-based policy on a 17-mile stretch of Interstate 24 with 67 VSL controllers. We use invalid action masking and several safety guards to ensure the posted speed limits satisfy the real-world constraints from the traffic management center and the Tennessee Department of Transportation. Since the time of launch of the system through April, 2024, the system has made approximately 10,000,000 decisions on 8,000,000 trips. The analysis of the controller shows that the MARL policy takes control for up to 98% of the time without intervention from safety guards. The time-space diagrams of traffic speed and control commands illustrate how the algorithm behaves during rush hour. Finally, we quantify the domain mismatch between the simulation and real-world data and demonstrate the robustness of the MARL policy to this mismatch.

Summary

The paper presents the first field deployment of a MARL-based VSL system, showcasing its practical effectiveness in real-world adaptive traffic control.
It employs a cooperative MARL framework with invalid action masking and safety guards to address the challenges of transitioning from simulation to live deployment.
The MARL policy, with invalid action masking, set the posted speed limits for about 81.3% (westbound) and 87.3% (eastbound) of the time each day, indicating robustness and laying the groundwork for future AI-driven traffic management solutions.

Field Deployment of Multi-Agent Reinforcement Learning Based Variable Speed Limit Controllers

This paper presents a detailed account of the first field deployment of a Multi-Agent Reinforcement Learning (MARL) based Variable Speed Limit (VSL) control system on a 17-mile segment of Interstate 24 (I-24) near Nashville, Tennessee. This advanced traffic management system, consisting of 67 VSL controllers, was developed and tested using a microscopic traffic simulator before being implemented in the real-world setting. The system employs invalid action masking and safety guards to ensure compliance with real-world traffic management constraints.

Background and Motivation

Variable speed limit (VSL) systems have been instrumental in managing road traffic by dynamically altering speed limits in response to real-time traffic conditions, with the goal of reducing congestion and accidents. Traditionally, most VSL systems have been rule-based, often limited by their inability to adapt to unforeseen traffic scenarios or optimize for multiple conflicting objectives. Reinforcement Learning (RL), and more specifically Multi-Agent Reinforcement Learning (MARL), presents a promising approach to surmount these limitations by learning and adapting from real-time interactions with the environment.

Previous studies have demonstrated the potential of RL in simulated traffic environments, but the transition to real-world applications remains largely unexplored. This knowledge gap is significant, given the added complexities and unpredictable nature of real-world traffic conditions. The deployment described in this paper bridges that gap, offering valuable insights into the practical application of MARL-based VSL systems.

Methodology

Problem Formulation and Simulation Training

The VSL control system was formulated as a cooperative MARL problem, where each VSL controller was represented as an agent in a Markov Game framework. The state space for each agent included local traffic speeds and occupancies, and the action space consisted of permissible speed limits. The reward function combined three terms: adaptability, safety, and mobility, to promote a smooth transition between traffic regimes.

Training was conducted using the TransModeler microscopic simulation software, with low driver compliance rates to mirror expected real-world conditions. The MARL policy was trained on a 7-mile stretch of I-24 and subsequently tested over a 17-mile segment to ensure its robustness and scalability.

Real-World Deployment and Constraints

The real-world deployment involved several safety guards and constraints to ensure compliance with traffic management regulations:

Invalid Action Masking (IAM): Ensures that speed limit differentials between consecutive VSL controllers do not exceed 10 mph.
Speed Matching: Aligns posted speed limits with actual traffic speeds to improve motorist compliance.
Maximum Speed Limit Correction: Ensures posted speed limits adhere to segment-specific maximum limits.
Debounce Constraint: Prevents speeds from forming a non-monotonic pattern across adjacent VSL controllers.
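
The guards listed above can be sketched in a few lines of Python. This is a minimal illustration, not the deployed AI-DSS code. The 10 mph differential comes from the text. The action set `SPEED_LIMITS`, the function names, and the example scores are assumptions made for illustration only.

```python
from typing import List, Sequence

# Hypothetical action set (mph); the source does not list the permitted values.
SPEED_LIMITS = [30, 40, 50, 60, 70]
MAX_DIFF_MPH = 10  # from the text: consecutive controllers differ by at most 10 mph


def valid_action_mask(neighbor_limit: int,
                      actions: Sequence[int] = SPEED_LIMITS,
                      max_diff: int = MAX_DIFF_MPH) -> List[bool]:
    """True where an action is within max_diff of the neighboring posted limit."""
    return [abs(a - neighbor_limit) <= max_diff for a in actions]


def masked_greedy_action(scores: Sequence[float],
                         mask: Sequence[bool],
                         actions: Sequence[int] = SPEED_LIMITS) -> int:
    """Pick the highest-scoring action among those the mask allows."""
    allowed = [(s, a) for s, a, ok in zip(scores, actions, mask) if ok]
    if not allowed:
        raise ValueError("mask excludes every action")
    return max(allowed, key=lambda pair: pair[0])[1]


def cap_to_segment_max(limit: int, segment_max: int) -> int:
    """Maximum speed limit correction: never post above the segment maximum."""
    return min(limit, segment_max)


def is_monotonic(limits: Sequence[int]) -> bool:
    """True if limits along the corridor only rise or only fall (no bounce)."""
    pairs = list(zip(limits, limits[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)


if __name__ == "__main__":
    scores = [0.1, 0.2, 0.3, 0.9, 0.4]  # hypothetical policy scores per action
    mask = valid_action_mask(neighbor_limit=40)
    chosen = masked_greedy_action(scores, mask)
    print(mask, chosen, cap_to_segment_max(chosen, segment_max=70))
```

In this sketch the policy's preferred action (60 mph) is masked out because the neighboring gantry posts 40 mph, so the best remaining action is selected instead. Masking before selection, rather than correcting afterwards, keeps the final choice within the policy's own ranking of valid actions.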

The deployment was implemented through the Artificial Intelligence Decision Support System (AI-DSS), which interfaced with the Traffic Management Center (TMC) software to control the VSL gantries in real-time.

Results and Analysis

The MARL-based VSL system has been operational since March 8, 2024, and has made approximately 10 million decisions affecting over 8 million trips. On a daily basis, the MARL policy with IAM set the speed limits about 81.3% of the time on I-24 Westbound and 87.3% on I-24 Eastbound, with slightly reduced effectiveness during peak hours.

A detailed analysis revealed that the policy was effective in identifying congestion and free-flow regimes, as well as generating smooth speed transitions in transitional regimes. Safety guards were required less than 20% of the time, demonstrating the robustness of the MARL policy.

The domain mismatch between simulation and real-world observations was quantified using the Wasserstein distance. Despite the measured discrepancy, the MARL policy maintained control for most of the operational time, which supports its robustness in real-world conditions.
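
To make the mismatch metric concrete, the following is a minimal sketch of the one-dimensional Wasserstein-1 distance between two equal-size samples. The summary does not say which variables or which variant the authors used, so the function name `wasserstein_1d`, the choice of speeds as the compared quantity, and the sample values are all assumptions for illustration.

```python
from typing import Sequence


def wasserstein_1d(x: Sequence[float], y: Sequence[float]) -> float:
    """Wasserstein-1 distance between two equal-size empirical samples.

    For equal-size 1D samples, the optimal transport plan pairs the sorted
    values in order, so the distance is the mean absolute difference
    between the sorted samples.
    """
    if len(x) == 0 or len(x) != len(y):
        raise ValueError("samples must be non-empty and of equal size")
    xs, ys = sorted(x), sorted(y)
    return sum(abs(a - b) for a, b in zip(xs, ys)) / len(xs)


if __name__ == "__main__":
    # Hypothetical speed observations in mph, not data from the paper.
    sim_speeds = [60.0, 62.0, 58.0, 30.0]
    real_speeds = [55.0, 57.0, 53.0, 25.0]
    print(wasserstein_1d(sim_speeds, real_speeds))
```

A distance of zero means the two samples have identical distributions, and the value grows with the amount of shift needed to turn one distribution into the other, expressed in the units of the observations. Samples of unequal size or with weights need a more general routine.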

Implications and Future Directions

The successful deployment of this MARL-based VSL system showcases the practical applicability of advanced RL algorithms in real-world traffic management systems. This deployment offers a framework for future implementations of RL in infrastructure management, potentially extended to other traffic control tasks such as ramp metering or signal timing optimization.

Future work will involve continuous monitoring and adaptive tuning of the MARL policy based on real-world data, potentially incorporating more sophisticated safety guards and exploring higher compliance rates. The findings from this ongoing deployment will contribute to the broader adoption of AI-driven traffic management solutions, paving the way for enhanced traffic safety and efficiency.

By demonstrating the effectiveness and scalability of MARL in a complex, real-world environment, this work sets a foundation for future research and deployment in the burgeoning field of intelligent transportation systems.
