Act Better by Timing: A Timing-Aware Reinforcement Learning for Autonomous Driving

Published 19 Jun 2024 in cs.RO | (2406.13223v2)

Abstract: Autonomous vehicles inevitably encounter a vast array of scenarios in real-world environments. Addressing long-tail scenarios, particularly those involving intensive interactions with numerous traffic participants, remains one of the most significant challenges in achieving high-level autonomous driving. Reinforcement learning (RL) offers a promising solution for such scenarios and allows autonomous vehicles to continuously self-evolve during interactions. However, traditional RL often requires trial and error from scratch in new scenarios, resulting in inefficient exploration of unknown states. Integrating RL with planning-based methods can significantly accelerate the learning process. Additionally, conventional RL methods lack robust safety mechanisms, making agents prone to collisions in dynamic environments in pursuit of short-term rewards. Many existing safe RL methods depend on environment modeling to identify reliable safety boundaries for constraining agent behavior. However, explicit environmental models can fail to capture the complexity of dynamic environments comprehensively. Inspired by the observation that human drivers rarely take risks in uncertain situations, this study introduces the concept of action timing and proposes a timing-aware RL method. In this approach, a "timing imagination" process previews the execution results of the agent's strategies at different time scales. The optimal execution timing is then projected to each decision moment, generating a dynamic safety factor to constrain actions. A planning-based method serves as a conservative baseline strategy in uncertain states. In two representative interaction scenarios, an unsignalized intersection and a roundabout, the proposed model outperforms the benchmark models in driving safety.
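The abstract gives no formulas, so the following Python sketch is only a toy illustration of the idea it describes: preview outcomes at several execution delays, pick the best timing, turn it into a safety factor, and use that factor to constrain the RL action against a conservative baseline. All function names, the linear mapping from delay to safety factor, and the linear blending rule are assumptions made for illustration and are not taken from the paper.

```python
from typing import Callable, Iterable


def imagine_best_delay(rollout_return: Callable[[int], float],
                       delays: Iterable[int]) -> int:
    """Pick the delay (in steps) whose imagined rollout scores highest.

    `rollout_return` is a hypothetical stand-in for the "timing
    imagination" preview: it maps a delay to an estimated return.
    """
    return max(delays, key=rollout_return)


def safety_factor(best_delay: int, max_delay: int) -> float:
    """Map the preferred delay to [0, 1] (assumed linear mapping).

    1.0 means "acting now is best"; 0.0 means "wait as long as possible".
    """
    if max_delay <= 0:
        return 1.0
    return 1.0 - best_delay / max_delay


def constrained_action(rl_action: float, baseline_action: float,
                       factor: float) -> float:
    """Blend the RL action with a conservative planner action.

    The linear blend is an assumption; a low factor pulls the output
    toward the planning-based baseline.
    """
    return factor * rl_action + (1.0 - factor) * baseline_action


if __name__ == "__main__":
    # Toy preview: returns peak when the agent waits two steps.
    preview = lambda d: -abs(d - 2)
    best = imagine_best_delay(preview, range(5))
    factor = safety_factor(best, max_delay=4)
    # RL policy wants to accelerate (2.0); the planner wants to brake (-1.0).
    print(best, factor, constrained_action(2.0, -1.0, factor))
```

In this toy setting, the less the imagined rollouts favor acting immediately, the more the final command leans on the planner, which mirrors the abstract's description of a conservative baseline in uncertain states.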