
Traffic Smoothing Controllers for Autonomous Vehicles Using Deep Reinforcement Learning and Real-World Trajectory Data (2401.09666v1)

Published 18 Jan 2024 in eess.SY, cs.AI, cs.MA, and cs.SY

Abstract: Designing traffic-smoothing cruise controllers that can be deployed onto autonomous vehicles is a key step towards improving traffic flow, reducing congestion, and enhancing fuel efficiency in mixed autonomy traffic. We bypass the common issue of having to carefully fine-tune a large traffic microsimulator by leveraging real-world trajectory data from the I-24 highway in Tennessee, replayed in a one-lane simulation. Using standard deep reinforcement learning methods, we train energy-reducing wave-smoothing policies. As an input to the agent, we observe the speed and distance of only the vehicle in front, which are local states readily available on most recent vehicles, as well as non-local observations about the downstream state of the traffic. We show that at a low 4% autonomous vehicle penetration rate, we achieve significant fuel savings of over 15% on trajectories exhibiting many stop-and-go waves. Finally, we analyze the smoothing effect of the controllers and demonstrate robustness to adding lane-changing into the simulation as well as the removal of downstream information.


Summary

  • The paper demonstrates that RL controllers trained on real trajectory data can reduce fuel consumption by over 15% in stop-and-go traffic scenarios.
  • It formulates traffic control as a partially observed Markov decision process (POMDP) and trains autonomous vehicle cruise controllers with Proximal Policy Optimization (PPO).
  • The learned controllers remain robust under varied traffic conditions, including lane changes and the removal of downstream information, which bodes well for real-world deployment.

Traffic Smoothing Controllers for Autonomous Vehicles Using Deep Reinforcement Learning

This paper presents a methodology for improving traffic efficiency and fuel economy through traffic-smoothing cruise controllers deployed on autonomous vehicles (AVs). Leveraging deep reinforcement learning (RL) together with real-world trajectory data from the I-24 highway in Tennessee, the authors report promising results in damping stop-and-go waves, which contribute significantly to energy inefficiency in transportation systems.

Core Contributions and Findings

The authors sidestep the difficulties typically associated with fine-tuning a large traffic microsimulator by replaying recorded trajectory data in a one-lane simulation environment. In this setting, they train RL-based cruise control policies designed to reduce energy consumption during traffic congestion. A notable result is fuel savings of over 15% at a low 4% AV penetration rate on real-world trajectories exhibiting many stop-and-go waves. These numbers underscore how a small proportion of controlled AVs can reshape traffic flow for better fuel efficiency.

Another notable aspect of the paper is the controllers' resilience to changes in traffic dynamics, including the addition of lane changes to the simulation and the removal of downstream traffic information. This robustness suggests applicability across a range of real-world traffic conditions and provides a foundational step toward deploying such technology in existing transportation systems.

Methodology and Technical Features

The research formulates the traffic control problem as a partially observed Markov decision process (POMDP). The observation space combines local variables readily available on most recent vehicles, namely the ego vehicle's speed and the speed of and distance to the vehicle directly ahead, with non-local information about the downstream traffic state. The downstream observations help the agent anticipate upcoming conditions, refining the control policy for better energy outcomes.
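As a rough illustration of this setup, the spaces might be declared as follows. This is a minimal sketch using the Gymnasium API; the bounds and the number of downstream speed measurements are assumptions for illustration, not values taken from the paper.

```python
# Hypothetical observation/action spaces for the traffic-smoothing POMDP.
# Bounds and N_DOWNSTREAM are illustrative assumptions, not the paper's values.
import numpy as np
from gymnasium import spaces

MAX_SPEED = 40.0    # m/s, assumed upper bound on vehicle speeds
MAX_GAP = 200.0     # m, assumed upper bound on the gap to the leader
N_DOWNSTREAM = 10   # assumed number of downstream average-speed observations

# Local state (ego speed, leader speed, gap) plus downstream traffic speeds.
observation_space = spaces.Box(
    low=np.zeros(3 + N_DOWNSTREAM, dtype=np.float32),
    high=np.array([MAX_SPEED, MAX_SPEED, MAX_GAP]
                  + [MAX_SPEED] * N_DOWNSTREAM, dtype=np.float32),
)

# Continuous longitudinal acceleration command (m/s^2).
action_space = spaces.Box(low=-3.0, high=1.5, shape=(1,), dtype=np.float32)
```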

A distinctive methodological choice is a simulation environment that replays real-world highway trajectories, affording a faster yet realistic training process compared to a full microsimulation. Key to this method's success is calibrating the vehicle and lane-changing models so that the simulation closely approximates real-world conditions.
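The sketch below conveys the core replay idea under stated assumptions: the lead vehicle's motion is read directly from recorded data, and only the AV is integrated forward from the policy's acceleration command. The class name, the simple point-mass kinematics, and the acceleration-penalty reward are illustrative stand-ins rather than the paper's implementation.

```python
# Minimal sketch of a one-lane trajectory-replay environment.
# The recorded leader is replayed verbatim; the AV follows the policy's
# acceleration command. Details here are assumptions, not the paper's code.
import numpy as np

class TrajectoryReplayEnv:
    def __init__(self, leader_speeds: np.ndarray, dt: float = 0.1):
        self.leader_speeds = leader_speeds  # recorded speeds, one per step
        self.dt = dt
        self.reset()

    def reset(self):
        self.t = 0
        self.leader_pos = 50.0  # assumed initial gap to the leader (m)
        self.av_pos, self.av_speed = 0.0, float(self.leader_speeds[0])
        return self._obs()

    def step(self, accel: float):
        # Replay the recorded leader; integrate the AV forward in time.
        self.leader_pos += self.leader_speeds[self.t] * self.dt
        self.av_speed = max(0.0, self.av_speed + accel * self.dt)
        self.av_pos += self.av_speed * self.dt
        self.t += 1
        done = self.t >= len(self.leader_speeds) - 1
        # Placeholder reward: penalize acceleration as a crude energy proxy.
        reward = -abs(accel)
        return self._obs(), reward, done

    def _obs(self):
        gap = self.leader_pos - self.av_pos
        return np.array([self.av_speed, self.leader_speeds[self.t], gap],
                        dtype=np.float32)

# Example: replay 60 s of a constant 25 m/s leader at 10 Hz.
env = TrajectoryReplayEnv(np.full(600, 25.0))
obs = env.reset()
obs, reward, done = env.step(0.0)
```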

Moreover, the authors employ Proximal Policy Optimization (PPO) to train the AV controllers. The policy network uses four hidden layers and, combined with hyperparameters tailored to this application, learns effectively from the available trajectory data.
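For orientation, training with an off-the-shelf PPO implementation might look like the following. The use of stable-baselines3, the stand-in environment, the layer widths, and the hyperparameter values are all assumptions for illustration, not the paper's reported configuration.

```python
# Hedged sketch of PPO training with a four-hidden-layer MLP policy.
# stable-baselines3, layer widths, and hyperparameters are assumptions.
from stable_baselines3 import PPO

model = PPO(
    policy="MlpPolicy",
    env="Pendulum-v1",  # stand-in task; replace with the replay environment
    policy_kwargs=dict(net_arch=[64, 64, 64, 64]),  # four hidden layers
    learning_rate=3e-4,
    n_steps=2048,
    batch_size=64,
    verbose=1,
)
model.learn(total_timesteps=1_000_000)  # illustrative training budget
```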

Theoretical and Practical Implications

From a theoretical standpoint, the paper contributes to the growing body of literature on RL for traffic management, affirming that data-driven simulation, when paired with RL methodologies, can substantially improve transportation energy metrics. Practically, the findings hold promise for reducing carbon emissions and smoothing traffic flow through the gradual integration of controlled AVs into existing vehicle fleets.

As AV penetration on public roads increases, the paper offers a pertinent exploration of how traffic management objectives can be met with modest computational resources while maintaining operational accuracy and safety. Structured experimentation and validation highlight the controllers' potential for real-world deployment.

Future Prospects

The work opens several avenues for future research and development. The interdisciplinary nature of the problem suggests opportunities to integrate RL techniques more deeply with advanced vehicular technology, including further refinement of vehicle and driver-behavior models. Multi-agent reinforcement learning and cooperative strategies among vehicle fleets present additional paths toward system-level efficiency.

While the simulation outcomes are compelling, translating these results into large-scale real-world deployments will require comprehensive testing, and perhaps innovations in vehicular communication systems to support the new control policies.

In conclusion, the paper offers a rigorous examination of how reinforcement learning controllers can be integrated with existing vehicle technology to deliver meaningful improvements in traffic flow and fuel economy. The careful methodology, coupled with strong numerical evidence, establishes a solid foundation for ongoing and future work in this domain.
