Agile Autonomous Driving using End-to-End Deep Imitation Learning
- The paper introduces an end-to-end imitation learning system that maps raw sensory data to continuous control commands.
- It reduces dependency on expensive sensors by leveraging expert demonstrations from an MPC-based controller.
- Online imitation learning with DAgger improves robustness to covariate shift, enabling high-speed off-road maneuvers.
The authors present an end-to-end imitation learning system for agile, high-speed off-road autonomous driving that relies only on low-cost onboard sensors. Departing from traditional pipelines that depend on extensive sensor suites and online planning, the system uses a deep neural network control policy that maps raw, high-dimensional observations directly to continuous steering and throttle commands.
Key Contributions
The paper makes several significant contributions to the domain of autonomous driving:
- Reduced Dependency on High-Cost Sensors: By imitating a model predictive controller (MPC) that has access to advanced sensing during training, the learned control policy eliminates the need for expensive GPS or IMU sensors at test time.
- Effective Handling of Covariate Shift: Demonstrating the advantage of online imitation learning over batch learning, the research highlights the former's ability to counteract covariate shift—a known challenge in imitation learning where the states visited by the learned policy drift away from those seen in the expert's demonstrations.
- Empirical Validation of End-to-End Learning: The system is shown to perform high-speed off-road driving effectively, achieving speeds up to 8 m/s, competitive with state-of-the-art methods, thereby empirically validating recent theoretical advances in imitation learning.
Methodology Overview
The proposed system follows an expert-learner setup. An MPC with access to high-quality state estimation and planning capabilities serves as the expert, providing demonstrations. The learner, in contrast, is a deep neural network (DNN) policy that ingests and processes raw sensory data from a simple monocular camera and wheel speed sensors.
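The learner's interface can be sketched as a function from raw observations to continuous commands. The snippet below is a minimal illustration only: the shapes, names, and tiny two-layer network are hypothetical stand-ins for the paper's deep convolutional policy.

```python
import numpy as np

rng = np.random.default_rng(0)

def flatten_features(image, wheel_speeds):
    """Concatenate raw pixels with wheel-speed readings into one observation."""
    return np.concatenate([image.ravel(), wheel_speeds])

class PolicyNet:
    """Tiny two-layer network standing in for the paper's deep CNN policy."""
    def __init__(self, obs_dim, hidden=32, act_dim=2):
        self.W1 = rng.normal(0.0, 0.01, (obs_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0.0, 0.01, (hidden, act_dim))
        self.b2 = np.zeros(act_dim)

    def act(self, obs):
        h = np.tanh(obs @ self.W1 + self.b1)
        # Final tanh keeps steering and throttle bounded in [-1, 1].
        return np.tanh(h @ self.W2 + self.b2)

# A 16x16 grayscale frame plus 4 wheel-speed sensors -> 2 commands.
image = rng.random((16, 16))
wheel_speeds = np.array([3.1, 3.0, 3.2, 3.1])  # m/s, illustrative values
obs = flatten_features(image, wheel_speeds)
policy = PolicyNet(obs_dim=obs.size)
steering, throttle = policy.act(obs)
```

The key property this sketch captures is that no explicit state estimation or planning module sits between perception and control: the network consumes raw observations and emits commands directly.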
Imitation Learning Approach
Building on recent advances in imitation learning theory, the authors evaluate two approaches:
- Batch Imitation Learning: Employs pre-recorded expert demonstrations to train the DNN offline.
- Online Imitation Learning with DAgger: Iteratively refines the policy by executing the learner, querying the expert for corrective labels on the states the learner actually visits, and retraining on the aggregated dataset—addressing covariate shift more effectively than batch learning.
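The online approach above can be sketched as a data-aggregation loop. The toy setup below is an assumption for illustration: a linear learner imitates a proportional "MPC expert" on a 1-D lane-keeping task, and each iteration labels the learner's own rollouts with expert actions so that training covers the states the learner visits.

```python
import numpy as np

rng = np.random.default_rng(0)

def expert(state):
    """Stand-in for the MPC expert: steer proportionally back to center."""
    return -0.8 * state

def rollout(policy_w, steps=50):
    """Roll out the current learner policy; return the states it visits."""
    states, s = [], 2.0  # start off-center
    for _ in range(steps):
        states.append(s)
        a = policy_w * s                  # learner's action
        s = s + a + rng.normal(0, 0.05)   # simple dynamics plus noise
    return np.array(states)

# DAgger-style loop: aggregate (state, expert action) pairs across iterations.
X, Y = np.array([]), np.array([])
w = 0.0  # initial (untrained) linear policy
for i in range(5):
    states = rollout(w)
    X = np.concatenate([X, states])
    Y = np.concatenate([Y, expert(states)])  # expert labels visited states
    w = (X @ Y) / (X @ X)                    # least-squares fit on aggregate
```

Because the expert is queried on the learner's own state distribution rather than only on expert trajectories, the fitted policy converges toward the expert's gain; this is the mechanism by which DAgger counteracts covariate shift.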
Results
Empirical evaluations demonstrate that while both learning paradigms can ultimately achieve high speeds, online imitation learning yields better generalization and greater robustness to distributional shift, sustaining high-speed maneuvers more reliably.
Discussion on Implications and Future Directions
Practically, this methodology offers a more cost-effective solution for robust autonomous driving on less-structured and stochastic terrains, potentially broadening the accessibility and deployment of autonomous ground systems.
From a theoretical perspective, the work exemplifies the integration of imitation learning with deep neural networks for decision-making in continuous action spaces, an area ripe for exploration. Future research might explore integrating additional sensory modalities or models to further improve the system’s resilience and adaptability in even more dynamic settings.
As autonomous systems continue to evolve, efficient learning techniques that minimize reliance on expensive sensing will be key to broader deployment across domains. This paper contributes to these efforts, providing insights and techniques that other researchers and practitioners can build upon.