Adversarial Examples for Semantic Segmentation and Object Detection
Introduction
Adversarial examples, images that have been subtly altered to mislead deep learning models, are a well-studied vulnerability in image classification. This paper extends the study of adversarial examples to more complex tasks: semantic segmentation and object detection. Because these tasks require classifying many targets within a single image, the paper introduces a novel algorithm called Dense Adversary Generation (DAG), which crafts adversarial examples by optimizing a loss function over a dense set of targets: individual pixels (each tied to its receptive field) in segmentation and object proposals in detection.
Algorithm Overview
DAG perturbs each target individually, where a target is a pixel in segmentation and a proposal in detection. The optimization assigns an adversarial label to every target and iteratively perturbs the input so that, for each target, the score of the adversarial label overtakes that of the correct label (a minimal sketch follows the list below). The primary steps are:
- Identifying a dense set of targets: all (or a sampled subset of) pixels for segmentation and a dense set of object proposals for detection.
- Assigning an adversarial (incorrect) label to each target and accumulating per-target gradients into a single perturbation that minimizes the overall loss function.
- Iteratively refining the perturbation until all specified targets are misclassified or an iteration limit is reached.
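As a rough illustration of these steps for the segmentation case, the sketch below implements the per-target loss and the normalized, iterative update in PyTorch. This is a minimal sketch under stated assumptions: the model is assumed to return per-pixel class logits, and the variable names, step size gamma, and stopping rule are illustrative rather than taken from the authors' code.

```python
import torch

def dag_attack(model, image, true_labels, adv_labels, gamma=0.5, max_iters=200):
    """Sketch of a DAG-style attack on a segmentation network.

    image: (1, 3, H, W) float tensor; model(image) -> (1, C, H, W) logits.
    true_labels / adv_labels: (H, W) long tensors of class indices.
    """
    x = image.clone()
    r_total = torch.zeros_like(image)            # accumulated perturbation

    for _ in range(max_iters):
        x = x.detach().requires_grad_(True)
        logits = model(x)                        # (1, C, H, W)

        # Active targets: pixels still predicted as their true label.
        pred = logits.argmax(dim=1)[0]
        active = pred == true_labels
        if not active.any():                     # every target already fooled
            break

        # Loss: push the adversarial label above the true label on all
        # still-correct targets.
        logit_map = logits[0].permute(1, 2, 0)   # (H, W, C)
        true_score = logit_map.gather(2, true_labels.unsqueeze(2)).squeeze(2)
        adv_score = logit_map.gather(2, adv_labels.unsqueeze(2)).squeeze(2)
        loss = (true_score - adv_score)[active].sum()
        loss.backward()

        # Normalized step: scale the accumulated gradient to a small
        # L-infinity magnitude before adding it to the image.
        grad = x.grad
        r = -gamma * grad / (grad.abs().max() + 1e-12)
        r_total = r_total + r
        x = x + r

    return x.detach(), r_total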
Experimental Results and Analysis
Performance Impact
The efficacy of DAG is highlighted through experiments showing significant drops in performance metrics on widely-used datasets:
- Semantic Segmentation: For FCN-based networks, adversarial perturbations reduced mean Intersection over Union (mIOU) from 65.49% to 4.09% on one FCN-VGG variant.
- Object Detection: For Faster R-CNN models, mean Average Precision (mAP) dropped from 69.14% to 5.92% on one VGG-based model.
Notably, networks trained on more data, such as those trained on the combined PascalVOC datasets, exhibited larger drops in performance.
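For reference, the mIOU figures above follow the standard definition: per-class intersection over union, averaged across classes. The NumPy sketch below shows that computation from a pixel-level confusion matrix; it is illustrative, not the evaluation code used in the paper.

```python
import numpy as np

def mean_iou(conf):
    """conf[i, j] counts pixels of true class i predicted as class j."""
    tp = np.diag(conf).astype(float)
    fp = conf.sum(axis=0) - tp        # predicted as the class but wrong
    fn = conf.sum(axis=1) - tp        # belong to the class but missed
    denom = tp + fp + fn
    iou = np.where(denom > 0, tp / np.maximum(denom, 1), np.nan)
    return float(np.nanmean(iou))     # mIOU; multiply by 100 for a percentage
```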
Transferability of Perturbations
A key contribution of this paper is its examination of the transferability of adversarial perturbations. The study considers three settings:
- Cross-Training Transfer: Perturbations from one model effectively caused accuracy drops in other models of the same architecture but trained on different datasets. For instance, perturbations from a model trained on the PascalVOC-2007 dataset significantly impacted models trained on the combined PascalVOC-2007/2012 dataset.
- Cross-Network Transfer: Perturbations retained some effectiveness across different network architectures, but the degradation in performance was less pronounced. For example, transferring perturbations between ZFNet and VGGNet-based models resulted in minor accuracy drops.
- Cross-Task Transfer: Perturbations generated for segmentation tasks could impact detection networks and vice versa, particularly when networks shared similar architectures.
Moreover, combining perturbations from multiple sources yielded robust black-box attacks, achieving significant drops in performance even when network structures were not known a priori.
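One simple way to fuse perturbations from several white-box source models into a black-box attack is to sum them. The sketch below assumes each perturbation r_k has the same shape and was computed by DAG on a different source model; the optional L-infinity clipping and all names are illustrative assumptions, not the paper's exact fusion scheme.

```python
import torch

def combine_perturbations(perturbations, epsilon=None):
    """Sum perturbations from several source models into one attack."""
    r = torch.stack(perturbations).sum(dim=0)
    if epsilon is not None:
        # Optionally keep the fused perturbation within an L-infinity budget.
        r = r.clamp(-epsilon, epsilon)
    return r

# Hypothetical usage against an unseen target model:
# adv_image = (clean_image + combine_perturbations([r_a, r_b, r_c])).clamp(0, 1)
```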
Implications and Future Work
The insights provided by DAG and the extensive evaluation of adversarial transferability suggest multiple avenues for future research:
- Robustness in Model Training: The demonstrated transferability of adversarial examples implies that enhancing the robustness of deep learning models will require strategies that generalize beyond specific architectures or training datasets.
- Universal Adversarial Perturbations: Exploring methods to generate more generalized perturbations that might impact an even broader spectrum of models and tasks.
- Adversarial Defense Mechanisms: Developing and testing defensive measures, such as adversarial training and perturbation detection, in the context of advanced tasks like semantic segmentation and object detection.
Conclusion
This paper contributes valuable knowledge to the field of adversarial machine learning by extending the scope of adversarial examples to complex visual tasks. The Dense Adversary Generation algorithm effectively creates adversarial examples that impair state-of-the-art models' performance in semantic segmentation and object detection. Furthermore, the transferability studies highlight vulnerabilities that necessitate comprehensive defense mechanisms in future research.