Implicit Affordances for Urban Driving via Model-Free Reinforcement Learning
The paper "End-to-End Model-Free Reinforcement Learning for Urban Driving using Implicit Affordances" presents a novel methodology aimed at addressing the challenges of using reinforcement learning (RL) in the highly intricate domain of urban driving. Key aspects such as lane keeping, pedestrian and vehicle avoidance, and traffic light detection make urban driving a formidable task for RL algorithms, which are not traditionally equipped to handle such complexity.
Core Contributions
The paper's primary contribution is the introduction of a technique termed "implicit affordances." The authors demonstrate what they present as the first successful application of RL to urban environments that includes handling traffic lights. This advance is underscored by the success of their RL agent in the "Camera Only" track of the CARLA Autonomous Driving Challenge.
Methodology
The central innovation, implicit affordances, involves a two-phase training process. First, a ResNet-18 encoder backbone is trained in a supervised manner to predict high-level semantic information, including traffic light states and lane positions. The encoder's compact intermediate features, termed implicit affordances because the supervised signals are only implicitly encoded in them, are then used as the RL state input in place of raw images. Storing these features rather than full images in the replay memory substantially reduces memory requirements, with usage dropping to approximately one-twentieth of what raw images would demand.
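To make the two-phase idea concrete, below is a minimal PyTorch sketch, not the authors' exact code: a ResNet-18 encoder with supervised affordance heads is trained first, then frozen so that only its compact features reach the value-based RL head and the replay buffer. Head sizes, the feature width, and the discretised action count are illustrative assumptions.

```python
# Phase 1: train a ResNet-18 encoder with supervised affordance heads.
# Phase 2: freeze the encoder and feed its features to the RL head,
# storing only those features (not raw images) in the replay buffer.
import torch
import torch.nn as nn
from torchvision.models import resnet18

class AffordanceEncoder(nn.Module):
    def __init__(self, feat_dim=512):
        super().__init__()
        backbone = resnet18(weights=None)
        self.encoder = nn.Sequential(*list(backbone.children())[:-1])  # drop final fc
        self.tl_state = nn.Linear(feat_dim, 3)   # traffic light state classes (assumed: red/orange/green)
        self.lane_pos = nn.Linear(feat_dim, 1)   # lateral lane offset regression (assumed)

    def forward(self, img):
        feat = self.encoder(img).flatten(1)      # (B, 512) implicit affordance features
        return feat, self.tl_state(feat), self.lane_pos(feat)

class QHead(nn.Module):
    """Value-based RL head operating on the frozen encoder's features."""
    def __init__(self, feat_dim=512, n_actions=27):  # discretised steering x throttle (assumed)
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU(),
                                 nn.Linear(256, n_actions))

    def forward(self, feat):
        return self.net(feat)

# Phase 2 usage: freeze the encoder, push `feat` tensors into the replay buffer,
# and train only QHead with the chosen value-based algorithm.
encoder = AffordanceEncoder().eval()
for p in encoder.parameters():
    p.requires_grad = False
q_head = QHead()
with torch.no_grad():
    feat, _, _ = encoder(torch.randn(1, 3, 224, 224))
q_values = q_head(feat)
```

Because the stored features are far smaller than stacked camera frames, the replay buffer shrinks accordingly, which is the source of the memory savings noted above.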
The decision to leverage pre-trained affordances seems well justified: whereas RL trained directly from raw pixels requires considerable data, the affordance-based approach lets the RL algorithm focus on policy optimization while starting from an already structured understanding of the critical environmental features.
Evaluation and Performance Metrics
The efficacy of this approach is demonstrated through rigorous testing in the CARLA simulator, a widely recognized benchmark for autonomous driving research, which encompasses a diverse set of urban driving scenarios. The models are evaluated on their ability to navigate urban environments safely, efficiently, and in compliance with traffic rules.
Experiments, including ablation studies, show that models trained with the affordance losses outperform those trained without them. Metrics such as intersections successfully crossed, compliance with traffic lights, and avoidance of pedestrian collisions provide insight into the contribution of each individual affordance loss.
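As an illustration of how such episode-level metrics can be aggregated, the sketch below computes the three quantities mentioned above from per-episode event logs. The log structure and field names are hypothetical stand-ins, not CARLA's actual output format.

```python
# Illustrative aggregation of ablation-style driving metrics from episode logs.
from dataclasses import dataclass

@dataclass
class EpisodeLog:
    intersections_attempted: int
    intersections_crossed: int
    red_lights_encountered: int
    red_lights_run: int
    pedestrian_collisions: int

def summarise(logs):
    """Aggregate per-episode logs into overall driving metrics."""
    attempted = sum(l.intersections_attempted for l in logs)
    crossed = sum(l.intersections_crossed for l in logs)
    reds = sum(l.red_lights_encountered for l in logs)
    run = sum(l.red_lights_run for l in logs)
    collisions = sum(l.pedestrian_collisions for l in logs)
    return {
        "intersection_success_rate": crossed / max(attempted, 1),
        "traffic_light_compliance": 1.0 - run / max(reds, 1),
        "pedestrian_collisions_per_episode": collisions / max(len(logs), 1),
    }

print(summarise([EpisodeLog(5, 4, 3, 0, 0), EpisodeLog(6, 6, 2, 1, 0)]))
```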
Implications and Future Directions
This work illustrates a promising direction for RL in autonomous driving, where integrating high-level semantic predictions can significantly enhance data efficiency and task performance. Beyond refining these specific methods, future work might explore the incorporation of additional affordances and extend this framework to actor-critic or policy-based paradigms.
Moreover, the long-term potential includes adaptation to real-world driving, where affordances could derive from live sensor data, thus bridging the gap between simulated and real-world environments.
In conclusion, this paper underscores the need for careful design when applying RL to complex tasks, particularly in autonomous urban driving. The success of the implicit affordances strategy marks a meaningful advance in both theoretical understanding and practical application, bringing RL closer to real-world, high-stakes domains.