- The paper presents DeepTest, an automated testing framework for uncovering erroneous behaviors in DNN-driven autonomous cars.
- It employs neuron coverage metrics and synthetic image transformations, including brightness, contrast, and weather effects, to simulate diverse driving conditions.
- Experimental results on three top-performing models from the Udacity self-driving car challenge show that DeepTest identifies thousands of corner-case errors, highlighting its potential to improve the safety of DNN-driven autonomous vehicles.
Automated Testing of DNN-Driven Autonomous Cars
The paper "DeepTest: Automated Testing of Deep-Neural-Network-driven Autonomous Cars" introduces DeepTest, a systematic approach to identifying erroneous behaviors in deep neural network (DNN)-based systems for autonomous driving. The authors present multiple facets of the work: the development and implementation of DeepTest, the empirical evaluation of its effectiveness, and discussions on implications for future research in AI safety critical systems.
Overview
Recent advances in DNNs have accelerated progress in autonomous vehicles, enabling them to navigate real-world environments using diverse sensor data. Despite these advances, DNN-driven systems are prone to unexpected errors, particularly in rare corner cases, which raises serious safety concerns. Traditional approaches to testing DNNs for autonomous driving rely largely on manually collecting test data under various driving conditions, a practice that is both costly and limited in scope. This paper addresses these challenges by automating the testing process and expanding coverage through neuron-coverage-guided input generation.
Methodology
DeepTest employs a novel testing methodology based on neuron coverage, a metric that measures the fraction of a DNN's neurons activated by a set of test inputs and thus serves as a proxy for how much of the network's decision logic the tests exercise. The core idea is to generate synthetic images by applying realistic transformations such as brightness and contrast adjustments and the addition of rain or fog. These transformations simulate genuine driving conditions, thereby allowing the exploration of different DNN behaviors.
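To make these two ingredients concrete, the sketch below (Python, assuming NumPy and OpenCV) implements simple brightness and contrast transformations and a basic neuron coverage computation. The threshold, per-layer normalization scheme, and function names are illustrative assumptions rather than the paper's exact implementation.

```python
# Minimal sketch of two synthetic image transformations and a neuron coverage
# metric. Parameters and normalization choices are illustrative assumptions.
import numpy as np
import cv2


def adjust_brightness(image: np.ndarray, beta: float) -> np.ndarray:
    """Add a constant bias to every pixel (beta > 0 brightens, beta < 0 darkens)."""
    return cv2.convertScaleAbs(image, alpha=1.0, beta=beta)


def adjust_contrast(image: np.ndarray, alpha: float) -> np.ndarray:
    """Scale pixel intensities by a multiplicative factor alpha."""
    return cv2.convertScaleAbs(image, alpha=alpha, beta=0.0)


def neuron_coverage(activations, threshold: float = 0.2) -> float:
    """Fraction of neurons whose scaled activation exceeds the threshold for
    at least one test input.

    activations -- list of per-layer arrays of shape (num_inputs, num_neurons)
    """
    covered, total = 0, 0
    for layer_act in activations:
        # Scale activations to [0, 1] within each layer before thresholding
        # (a simplifying assumption about the normalization scheme).
        lo, hi = layer_act.min(), layer_act.max()
        scaled = (layer_act - lo) / (hi - lo + 1e-8)
        covered += int(np.sum(scaled.max(axis=0) > threshold))
        total += layer_act.shape[1]
    return covered / max(total, 1)
```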
Additionally, DeepTest incorporates a greedy search technique to systematically combine transformations and maximize neuron coverage. The ultimate goal is to rigorously stress-test the DNN logic, uncovering potential areas where the model may fail.
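The following sketch illustrates the greedy idea under simplifying assumptions: transformations are passed as plain callables and the first coverage-improving transformation is accepted at each step, whereas DeepTest's actual search also manages transformation parameters and queues of partially transformed images.

```python
# Simplified coverage-guided greedy search: keep stacking transformations as
# long as the combined image increases neuron coverage.
import random


def greedy_combine(seed_image, transforms, coverage_of, max_depth=3):
    """Greedily stack transformations that improve coverage.

    seed_image  -- starting input image (e.g. a NumPy array)
    transforms  -- list of callables, each mapping an image to a transformed image
    coverage_of -- callable returning the neuron coverage achieved by an image
    """
    best_image = seed_image
    best_cov = coverage_of(seed_image)
    applied = []
    for _ in range(max_depth):
        improved = False
        random.shuffle(transforms)  # vary exploration order between runs
        for transform in transforms:
            candidate = transform(best_image)
            cov = coverage_of(candidate)
            if cov > best_cov:  # greedily accept the first improving transform
                best_image, best_cov = candidate, cov
                applied.append(transform)
                improved = True
                break
        if not improved:
            break  # no single transformation improves coverage further
    return best_image, applied, best_cov
```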
Results
Experiments were conducted on three highly ranked models from the Udacity self-driving car challenge: Rambo, Chauffeur, and Epoch. Synthetic transformations consistently led to significant increases in neuron coverage compared to baseline inputs. Moreover, combined transformations activated additional neurons, demonstrating their efficacy in probing deeper into the models' logic.
Using metamorphic relations, which require the model's outputs to remain consistent when inputs are perturbed in controlled ways, DeepTest identified thousands of erroneous behaviors. These findings show that systematic, neuron-coverage-guided testing can reveal a variety of critical corner cases that are likely to result in real-world errors.
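A minimal version of such a check might compare the steering angles predicted for original and transformed frames and flag large deviations. The fixed tolerance below is a simplifying assumption; the paper's relation instead bounds the deviation relative to the model's error on the original inputs.

```python
# Sketch of a metamorphic check: the steering angle predicted for a transformed
# image should stay close to the angle predicted for the original image.
import numpy as np


def find_violations(model_predict, originals, transformed, tolerance=0.1):
    """Return indices where the metamorphic relation is violated.

    model_predict -- callable mapping a batch of images to steering angles
    originals     -- array of original frames, shape (n, H, W, C)
    transformed   -- array of transformed counterparts, same shape
    tolerance     -- maximum allowed absolute deviation in steering angle
                     (an assumed fixed bound, in radians)
    """
    orig_angles = np.asarray(model_predict(originals)).ravel()
    trans_angles = np.asarray(model_predict(transformed)).ravel()
    deviation = np.abs(orig_angles - trans_angles)
    return np.nonzero(deviation > tolerance)[0]
```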
Implications and Future Work
The implications of this research are twofold. On a practical level, DeepTest provides a framework to automatically generate diverse and realistic test cases, significantly reducing the dependency on manually curated datasets. This methodological shift could lead to more robust and reliable autonomous driving systems, ultimately enhancing road safety. On a theoretical front, the concept of neuron coverage presents a transferable approach for testing other DNN applications beyond self-driving cars.
Future research could explore further optimizations in test generation algorithms, the incorporation of more complex and varied transformations, and the enhancement of metamorphic relations to capture a broader spectrum of errors. Additionally, integrating DeepTest with other forms of automated testing and verification could provide a more comprehensive safety net for DNN-based systems.
Conclusion
DeepTest represents a methodological advancement in the automatic testing of autonomous vehicles driven by DNNs. By leveraging realistic transformations and coverage-guided test generation, it offers a scalable solution to uncovering potentially harmful corner cases, paving the way for safer autonomous driving technologies. The successful detection of numerous erroneous behaviors underscores the importance of systematic and automated testing strategies in the field of AI safety.