Control of a Mixed Autonomy Signalised Urban Intersection: An Action-Delayed Reinforcement Learning Approach

Published 24 Jun 2021 in eess.SY | (2106.12755v1)

Abstract: We consider a mixed autonomy scenario where the traffic intersection controller decides whether the traffic light will be green or red at each lane for multiple traffic-light blocks. The objective of the traffic intersection controller is to minimize the queue length at each lane and maximize the outflow of vehicles over each block. We consider that the traffic intersection controller informs the autonomous vehicle (AV) whether the traffic light will be green or red for the future traffic-light block. Thus, the AV can adapt its dynamics by solving an optimal control problem. We model the decision process of the traffic intersection controller as a deterministic delay Markov decision process owing to the delayed action by the traffic controller. We propose Reinforcement-learning based algorithm to obtain the optimal policy. We show - empirically - that our algorithm converges and reduces the energy costs of AVs drastically as the traffic controller communicates with the AVs.

Abstract PDF Upgrade to Chat

Authors (4)

Citations (3)

View on Semantic Scholar

Summary

The paper proposes an action-delayed RL method that models traffic signal control as a deterministic delay Markov decision process (DDMDP) in mixed autonomy settings.
The methodology integrates AV communications with optimal control strategies to actively reduce queue lengths and energy usage at urban intersections.
Simulation results show that the RL controller minimizes waiting times and improves energy efficiency compared to conventional traffic management approaches.

"Control of a Mixed Autonomy Signalised Urban Intersection"

Introduction

This paper addresses the control of a signalised urban intersection in a mixed autonomy scenario, where both human-driven vehicles (HDVs) and autonomous vehicles (AVs) coexist. The focus is on efficiently managing the traffic lights to minimize queue lengths and energy consumption while maximizing vehicle throughput. By modelling the intersection controller's decision process as a deterministic delay Markov decision process (DDMDP), an action-delayed reinforcement learning (RL) approach is proposed to optimize traffic signal control. AVs benefit from communication with the controller, allowing them to adapt their dynamics based on anticipated green or red signals.

System Model

The system consists of a four-lane urban intersection divided into three main zones: the Merging Zone (MZ), Control Zone (CZ), and Exiting Zone (EZ). The traffic light controller determines green or red lights for each traffic-block over multiple horizons, informing AVs about future signal states. This enables AVs to adjust their dynamics by solving an optimal control problem.

AV and HDV Dynamics

HDV dynamics are governed by the Intelligent Driver Model (IDM), which ensures safety in car-following scenarios and reacts to traffic signals. For AVs, the paper proposes an optimal control strategy which minimizes energy costs, subject to constraints such as velocity limits and timing for intersection crossing (Figure 1). AVs navigate based on RL-informed signal changes, reducing fuel consumption and improving throughput.

Reinforcement Learning Approach

The RL framework characterizes the decision-making process as a DDMDP, factoring in the delayed implementation of traffic light decisions. The state set includes queue lengths, while the action space dictates the signal status. The reward function, critical to RL optimization, measures queue length reduction between time intervals. A tabular Q-Learning algorithm is used to iteratively improve the controller's policy, ensuring optimal lane accessibility and congestion management.

Implementation

Simulation results demonstrate the RL controller's efficacy in both mixed and exclusive HDV scenarios. The average waiting time and cumulative reward are calculated over simulation periods, revealing minimal differences between scenarios, yet highlighting the energy efficiency of AVs. Tabulated statistical indices for the integral of squared acceleration indicate substantial benefits for AVs compared to HDVs, showcasing superior energy consumption rates and dynamics adaptation efficiency.

Figure 1: Traffic light controlled intersection in a mixed autonomy scenario

Conclusion

The proposed RL-based algorithm effectively models and optimizes traffic signal control in mixed autonomy intersections by incorporating AV dynamics. AVs demonstrate reduced energy consumption, validating the communication advantage with the controller. Future research will explore reward customizations, controller communication reliability, and optimization parameters, aiming at further enhancing intersection management.

In conclusion, this paper presents a sophisticated approach to intersection control that leverages modern RL techniques and advanced vehicle dynamics, notable for its capability to integrate future autonomous technology with existing human-driven systems.

Markdown Report Issue