- The paper introduces a novel informed RL framework that integrates a hierarchical rulebook to guide reward mechanisms in handling traffic rule exceptions.
- It employs a Frenet frame-based trajectory planning method within a POMDP to tackle real-world dynamic traffic irregularities.
- Experimental results in CARLA demonstrate significant improvements in navigational performance, reflected in higher Arrived Distance and Finished Score metrics.
Introduction
Reinforcement Learning (RL) has made significant strides in autonomous driving, but learning trajectories directly for navigation remains challenging, especially in complex traffic scenarios that require exception handling of hierarchical traffic rules. The paper "Informed Reinforcement Learning for Situation-Aware Traffic Rule Exceptions" introduces an approach that improves the decision-making of autonomous vehicles in situations where traffic rules must be applied flexibly: a structured rulebook informs the RL agent's reward mechanism, yielding better trajectory planning and execution in anomalous traffic conditions.
Related Work
The literature review confirms that while RL has been successfully applied to standard traffic scenarios, the complexity of real-world traffic, particularly the nuanced application of traffic rule exceptions, remains underaddressed. Existing methods do not incorporate structured, hierarchical traffic rules, leaving a gap in the ability of autonomous vehicles to operate effectively under such circumstances. Furthermore, the paper notes that reward function design for autonomous driving has seen limited exploration and rarely targets the challenges identified above.
Methodology
To bridge this research gap, the authors propose an approach based on "Informed Reinforcement Learning." Vehicle trajectories are generated in the Frenet frame, and the navigation problem is modelled as a Partially Observable Markov Decision Process (POMDP), since the dynamics of the surrounding traffic are often unknown. Central to the method is a situation-aware reward function built on a formal rulebook that captures the dynamic prioritization among traffic rules. Rule realizations grade how well a trajectory complies with each rule, and the reward is weighted according to the hierarchical importance of the rules governing the vehicle's current situation. Through this structured reward design, the agent learns when it is appropriate to execute a controlled exception.
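To make the rulebook-weighted reward concrete, here is a minimal Python sketch of how rule realizations might grade a Frenet-frame trajectory and how their grades could be combined by priority. All names (Rule, Trajectory, situation_aware_reward) and the exponential weighting scheme are illustrative assumptions, not the paper's implementation.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Trajectory:
    # Frenet-frame samples: longitudinal position s, lateral offset d, speed v
    s: Sequence[float]
    d: Sequence[float]
    v: Sequence[float]


@dataclass
class Rule:
    name: str
    priority: int                                # 0 = highest rule in the hierarchy
    realization: Callable[[Trajectory], float]   # compliance grade in [0, 1]


def situation_aware_reward(traj: Trajectory,
                           rulebook: Sequence[Rule],
                           active: set[str]) -> float:
    """Combine per-rule compliance grades, weighted by the hierarchy.

    Rules not active in the current situation (e.g. lane keeping while
    passing a blocked lane) are down-weighted rather than dropped, so a
    controlled exception is rewarded but gratuitous violations are not.
    """
    reward = 0.0
    for rule in rulebook:
        weight = 2.0 ** (-rule.priority)         # assumed exponential priority weighting
        if rule.name not in active:
            weight *= 0.1                        # soften rules the situation suspends
        reward += weight * rule.realization(traj)
    return reward


# Example: grade lateral deviation against an assumed 1.75 m half-lane width.
rulebook = [
    Rule("no_collision", 0, lambda t: 1.0),      # placeholder realization
    Rule("stay_in_lane", 1,
         lambda t: 1.0 - min(1.0, max(abs(x) for x in t.d) / 1.75)),
]
```

Down-weighting rather than zeroing inactive rules is one plausible design choice; the paper's exact weighting of rule realizations may differ.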
Experimental Results
The researchers evaluated their approach on 1000 anomaly scenarios within the CARLA simulation environment. Both a model-based agent (DreamerV3) and a model-free agent (Rainbow) were extended with the novel trajectory generation and rulebook-based reward mechanisms. Both significantly outperformed their baselines on "Arrived Distance" and "Finished Score," metrics denoting the agent's ability to follow a lane and to complete navigational tasks in rule-exception scenarios. Combining the trajectory-planning extension with the situation-aware reward function accelerated learning and improved overall performance.
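For readers reimplementing the evaluation, here is one plausible way to aggregate such episode metrics. The definitions are assumptions inferred from the metric names, not the paper's actual formulas.

```python
def arrived_distance(progress_m: list[float]) -> float:
    """Assumed: mean distance (in metres) driven along the route per episode."""
    return sum(progress_m) / len(progress_m)


def finished_score(completed: list[bool]) -> float:
    """Assumed: fraction of episodes in which the navigation task finished."""
    return sum(completed) / len(completed)
```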
Conclusion
This paper provides a compelling account of incorporating structured, machine-comprehensible traffic rules into an RL framework, showing a marked improvement in handling unusual traffic situations where standard rules do not suffice. By learning to navigate scenarios that necessitate traffic rule exceptions from raw sensory observations alone, rather than pre-processed or structured data, the work opens pathways toward more adaptable, real-world-applicable autonomous driving technology. The approach has limitations, such as its reliance on ground truth to activate situation awareness, but it represents a significant step forward in the operational flexibility of autonomous vehicles. The authors encourage future work on continuous action space trajectory generation and on independent situation-awareness modules, which could broaden real-world applicability.