Physical Adversarial Examples for Object Detectors
The paper, "Physical Adversarial Examples for Object Detectors," examines the vulnerabilities of deep neural networks (DNNs) to adversarial attacks, specifically within the domain of object detection models. This research builds upon previous work that identified the susceptibility of image classifiers to adversarial inputs, extending the investigation to object detectors—a more complex class of algorithms that detect and label multiple objects in a scene concurrently. Object detection systems such as YOLO v2 and Faster R-CNN are crucial in applications like autonomous driving, making the evaluation of their robustness under adversarial conditions especially pertinent.
The paper introduces two types of physical adversarial attacks: the Disappearance Attack and the Creation Attack. The former causes a real object to go undetected, while the latter induces the detector to report objects that are not physically present. The Disappearance Attack was evaluated with two perturbation styles applied to real Stop signs: a poster-style perturbation and a sticker perturbation. In controlled indoor environments, YOLO v2 failed to detect the adversarially modified Stop signs in over 85% of video frames, demonstrating the attack's effectiveness under stable conditions. Outdoors, fluctuating environmental conditions reduced the detection failure rates to 72.5% for the poster perturbation and 63.5% for the sticker perturbation.
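As an illustration of the success metric reported above, the following hedged sketch computes a disappearance rate as the fraction of frames in which the targeted class is absent from the detector's output. The run_detector callable and the dummy per-frame labels are stand-ins for YOLO v2 inference, not the authors' evaluation code.

```python
# Hedged sketch of the metric described above: the fraction of video frames
# in which the detector no longer reports the targeted class. `run_detector`
# is a stand-in for YOLO v2 inference.
from typing import Callable, Iterable, List

def disappearance_rate(frames: Iterable,
                       run_detector: Callable[[object], List[str]],
                       target: str = "stop sign") -> float:
    """Return the fraction of frames where `target` is missing from the
    detector's predicted labels."""
    frames = list(frames)
    if not frames:
        return 0.0
    missed = sum(1 for frame in frames if target not in run_detector(frame))
    return missed / len(frames)

# Example with a dummy detector that misses the sign in 17 of 20 frames,
# giving a disappearance rate of 0.85 (comparable in form to the indoor results).
labels_per_frame = [["car"]] * 17 + [["stop sign", "car"]] * 3
rate = disappearance_rate(range(20), lambda i: labels_per_frame[i])
print(f"{rate:.2%}")  # 85.00%
```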
To make the attack robust to the varied conditions under which object detectors operate, the authors extended the Robust Physical Perturbations (RP2) algorithm with additional synthetic transformations, such as rotations and changes in the object's position, broadening the applicability of the attack strategy. The research also examines how attack success varies across detection models by evaluating the transferability of the perturbations to the Faster R-CNN detector; the results showed notable, though reduced, success rates, indicating some degree of attack generalization.
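The sketch below illustrates, under stated assumptions, the general shape of such a transformation-augmented optimization loop: each step samples a random rotation and translation, applies it to the perturbed sign, and minimizes the detector's confidence in the target class together with a perceptibility penalty. The toy_detector_score function, the transformation ranges, and the L1 penalty are placeholders; this is not the authors' RP2 implementation or loss.

```python
# Hedged sketch of a transformation-augmented optimization loop in the spirit
# of the extended RP2 approach. toy_detector_score, the transform ranges, and
# the L1 penalty are placeholders, not the paper's implementation.
import math
import torch
import torch.nn.functional as F

def random_affine(img: torch.Tensor) -> torch.Tensor:
    """Apply a small random rotation and translation (a synthetic transform)."""
    angle = (torch.rand(1).item() - 0.5) * 0.6           # roughly +/- 17 degrees
    tx, ty = ((torch.rand(2) - 0.5) * 0.2).tolist()      # +/- 10% translation
    theta = torch.tensor([[[math.cos(angle), -math.sin(angle), tx],
                           [math.sin(angle),  math.cos(angle), ty]]])
    grid = F.affine_grid(theta, [1, *img.shape], align_corners=False)
    return F.grid_sample(img.unsqueeze(0), grid, align_corners=False).squeeze(0)

def toy_detector_score(img: torch.Tensor) -> torch.Tensor:
    """Placeholder differentiable 'stop-sign confidence' (stands in for YOLO v2)."""
    return torch.sigmoid(img.mean())

sign = torch.rand(3, 64, 64)               # stand-in image of a Stop sign
mask = torch.ones_like(sign)               # region where the perturbation may appear
delta = torch.zeros_like(sign, requires_grad=True)
optimizer = torch.optim.Adam([delta], lr=0.01)
lam = 0.01                                 # weight of the perceptibility penalty

for step in range(200):
    perturbed = torch.clamp(sign + mask * delta, 0.0, 1.0)
    transformed = random_affine(perturbed)                 # sampled synthetic transform
    loss = toy_detector_score(transformed) + lam * delta.abs().mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Sampling a fresh transformation at every step approximates an expectation over viewing conditions, which is what makes the resulting perturbation hold up under the rotation and position changes the paper targets.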
Importantly, the paper considers the potential real-world ramifications of these physical adversarial examples. Although the implementation described is limited to experimental scenarios, the implications for safety-critical systems such as autonomous vehicles are clear: the ability to render Stop signs invisible to detectors raises serious concerns about traffic safety and system integrity in malicious attack scenarios.
This research makes a substantial contribution to the discourse on AI safety and security, highlighting both current vulnerabilities and directions for further exploration. Future work should examine generalization to other physical environments, potentially incorporating moving vehicles, which would require more dynamic attack adaptations. Investigating other classes of adversarial attacks, such as those targeting object segmentation networks, is another intriguing direction. Finally, understanding how such physical adversarial examples can be mitigated or neutralized within integrated systems remains a critical open challenge.
Overall, "Physical Adversarial Examples for Object Detectors" not only demonstrates the susceptibility of current object detection models to adversarial manipulation but also underscores the critical need for continued research into enhancing the robustness and security of DNNs against such vulnerabilities. The findings provoke further inquiry into the security adaptations necessary to withstand adversarial threats in increasingly complex AI-driven environments.