- The paper demonstrates that augmenting imitation learning with synthesized perturbations and additional loss functions significantly improves driving model robustness.
- It trains a recurrent neural network on mid-level input representations derived from 30 million real-world examples to navigate complex driving scenarios.
- Closed-loop simulations show that the M4 model variant effectively avoids collisions and corrects lane deviations in challenging environments.
ChauffeurNet: Learning to Drive by Imitating the Best and Synthesizing the Worst
"ChauffeurNet: Learning to Drive by Imitating the Best and Synthesizing the Worst" by Mayank Bansal, Alex Krizhevsky, and Abhijit Ogale proposes an advanced model for autonomous driving through imitation learning. The central goal is to develop a robust driving policy that is capable of navigating a real vehicle in complex scenarios where conventional behavior cloning may fall short.
Model and Training Approach
The foundation of the proposed system lies in leveraging extensive real-world data—30 million examples—while adopting mid-level input and output representations. Instead of relying on raw sensory data, which can increase sample complexity, the authors employ preprocessed top-down views of the driving environment, incorporating relevant features such as vehicles, traffic lights, and planned routes. These inputs are fed into a recurrent neural network (RNN) named ChauffeurNet, which produces a driving trajectory subsequently converted to control commands by an external controller.
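To make the mid-level input idea concrete, here is a minimal sketch of rendering a driving scene into top-down feature channels. The channel names, grid size, and box format are illustrative assumptions, not the paper's exact specification (the paper renders larger images with channels for the roadmap, traffic lights, speed limits, route, past agent poses, and more).

```python
# Hypothetical sketch of ChauffeurNet-style mid-level input rendering:
# instead of raw sensor data, the scene is rasterized into top-down
# channels that a network can consume. Channel names and the tiny grid
# size are illustrative, not taken from the paper.

GRID = 20  # cells per side (the paper uses much larger rendered images)

def blank():
    return [[0.0] * GRID for _ in range(GRID)]

def paint_box(grid, x0, y0, x1, y1, value=1.0):
    """Fill an axis-aligned box of cells with `value`, clipped to the grid."""
    for y in range(max(0, y0), min(GRID, y1 + 1)):
        for x in range(max(0, x0), min(GRID, x1 + 1)):
            grid[y][x] = value
    return grid

def render_scene(ego_box, other_boxes, route_cells):
    """Return a dict of top-down channels, one per feature type."""
    channels = {
        "ego": paint_box(blank(), *ego_box),
        "others": blank(),
        "route": blank(),
    }
    for box in other_boxes:
        paint_box(channels["others"], *box)
    for (x, y) in route_cells:
        channels["route"][y][x] = 1.0
    return channels

scene = render_scene(
    ego_box=(9, 9, 10, 11),
    other_boxes=[(3, 4, 4, 6)],
    route_cells=[(10, y) for y in range(GRID)],
)
```

A network that consumes such channels only needs to learn driving behavior, not perception, which is the sample-complexity argument for mid-level representations.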
Standard behavior cloning proves inadequate, chiefly because small errors accumulate when the model drives in a closed loop. The authors therefore introduce synthesized perturbations: deliberate deviations from the expert driving trajectory. These perturbations expose the model to scenarios involving collisions or veering off-road, which are then discouraged by additional losses during training. Augmenting the imitation loss with penalties for undesirable behaviors and incentives for progress improves the robustness of the learned model.
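The perturbation idea can be sketched as follows: displace one point of an expert trajectory laterally, then blend smoothly back to the original path so the model sees a realistic "deviate and recover" example. The cosine taper used here is an assumption for illustration; the paper fits a smooth trajectory through the perturbed point rather than using this exact scheme.

```python
import math

# Illustrative sketch of trajectory perturbation: shift one waypoint of
# an expert trajectory laterally, tapering the shift to zero at both ends
# so the result is a smooth deviate-and-recover path. The cosine window
# is an assumption, not the paper's exact fitting procedure.

def perturb_trajectory(points, idx, offset):
    """Shift points[idx] laterally by `offset`, with the displacement
    falling smoothly to zero at the first and last waypoints."""
    n = len(points)
    perturbed = []
    for i, (x, y) in enumerate(points):
        # Weight is 1.0 at the perturbed index and 0.0 at both ends.
        if i <= idx:
            w = 0.5 * (1 - math.cos(math.pi * i / idx)) if idx > 0 else 1.0
        else:
            w = 0.5 * (1 + math.cos(math.pi * (i - idx) / (n - 1 - idx)))
        perturbed.append((x, y + offset * w))
    return perturbed

# A straight 11-point trajectory, perturbed 1.0 unit sideways at its middle:
straight = [(float(i), 0.0) for i in range(11)]
shifted = perturb_trajectory(straight, idx=5, offset=1.0)
```

Training on such examples, with losses that penalize the collisions and off-road excursions the perturbations can induce, is what teaches the model to recover rather than compound its errors.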
Experimental Design and Results
Various modeling configurations and their impact are evaluated through a series of ablation tests and simulations:
- Baseline Model (M0): Imitation learning only, with past-motion dropout so the network cannot simply extrapolate its past trajectory instead of attending to the environment.
- Perturbed Model (M1): Adds trajectory perturbations without modifying the loss structure.
- Enhanced Model (M2): Adds environment-centered penalties (collision, off-road deviation) to the perturbed model.
- Reweighted Model (M3): Reweights the imitation losses while retaining the environment losses.
- Imitation Dropout Model (M4): Probabilistically drops the imitation losses during training to further reinforce robustness.
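The progression from M2 to M4 can be summarized in a minimal sketch of the augmented objective. The loss names, weights, and dropout probability are illustrative assumptions; the point is only that M4 randomly zeroes the imitation term so the environment losses dominate on some training examples.

```python
import random

# Minimal sketch of the augmented training objective for the model
# variants above. Loss values and the 0.5 dropout probability are
# illustrative; for M4, the imitation term is randomly dropped
# ("imitation dropout") so environment losses (collision, on-road,
# progress) drive learning on a fraction of examples.

def total_loss(imitation, environment, dropout_p=0.5, rng=random.random):
    """Combine imitation and environment losses; with probability
    `dropout_p` the imitation term is zeroed for this example."""
    w_imit = 0.0 if rng() < dropout_p else 1.0
    return w_imit * imitation + environment

# Dropout forced off (rng returns 1.0): both terms contribute.
full = total_loss(imitation=2.0, environment=3.0, rng=lambda: 1.0)
# Dropout forced on (rng returns 0.0): only environment losses remain.
dropped = total_loss(imitation=2.0, environment=3.0, rng=lambda: 0.0)
```

Zeroing the imitation term occasionally prevents the model from over-relying on matching the expert when the environment losses already define what safe behavior looks like.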
Closed-loop simulation results demonstrate the superiority of the M4 model, which effectively negotiates complex scenarios such as nudging around parked vehicles and recovering from lane deviations. Specifically, M4 exhibits a notable ability to avoid collisions and stay on the desired route, outperforming models trained without the synthesized perturbations and additional loss functions.
Practical Implications and Future Directions
ChauffeurNet successfully transfers to real-world driving, as evidenced by the model's ability to handle maneuvers such as lane following, stopping for traffic controls, and negotiating with other agents. However, challenges such as high-speed maneuvers, intricate road geometries, and complex interactive scenarios highlight the need for further advances.
The prospect of integrating reinforcement learning (RL) within this framework is particularly compelling. Combining imitation learning with RL could provide a more comprehensive exploration of edge-case scenarios, significantly improving the behavior in highly interactive driving situations. Additionally, the potential for end-to-end optimization from sensor inputs to control outputs remains an open and intriguing avenue for future research.
In summary, ChauffeurNet represents a significant stride in autonomous driving, showcasing how augmenting expert demonstrations with synthesized perturbations and specialized training losses can lead to robust and reliable driving models. Despite current limitations, this approach lays foundational work for more advanced machine learning-driven driving systems, with promising implications for the future of autonomous vehicle technology.