- The paper introduces AdvProp, a method leveraging adversarial examples with separate batch normalization layers to significantly enhance image recognition performance.
- The approach achieves considerable top-1 accuracy gains on EfficientNet models across datasets like ImageNet, ImageNet-C, ImageNet-A, and Stylized-ImageNet.
- The method redefines adversarial training by mitigating distribution mismatch, thereby boosting both model robustness and generalization without sacrificing clean image performance.
Adversarial Examples Improve Image Recognition: An Overview
The paper "Adversarial Examples Improve Image Recognition" by Xie et al. presents an unconventional approach by leveraging adversarial examples as a means to enhance the performance of image recognition models. This paper introduces AdvProp, an adversarial training strategy that treats adversarial examples as additional data, thus helping to prevent overfitting.
Key Methodology
The AdvProp method modifies adversarial training by adding an auxiliary batch normalization (BN) branch for adversarial examples: each BN layer keeps separate statistics and affine parameters for clean and adversarial mini-batches. This addresses the distribution mismatch between adversarial and clean examples, which traditional adversarial training overlooks by pushing both through shared BN statistics. At inference time the auxiliary BNs are discarded, so the deployed model is unchanged in size and cost.
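A minimal sketch of the dual-BN idea, assuming PyTorch, is shown below; the module name DualBNConvBlock and its interface are illustrative rather than taken from the paper's released code. Each block holds a main BN for clean inputs and an auxiliary BN for adversarial inputs, selected by a flag at forward time.

```python
# Illustrative sketch (PyTorch assumed): a conv block with a main BN for clean
# images and an auxiliary BN for adversarial images, in the spirit of AdvProp.
import torch
import torch.nn as nn

class DualBNConvBlock(nn.Module):
    """Conv block that routes clean and adversarial inputs through separate BNs."""

    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=3,
                              padding=1, bias=False)
        self.bn_main = nn.BatchNorm2d(out_channels)  # clean examples; used alone at test time
        self.bn_aux = nn.BatchNorm2d(out_channels)   # auxiliary BN for adversarial examples
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor, adversarial: bool = False) -> torch.Tensor:
        x = self.conv(x)
        bn = self.bn_aux if adversarial else self.bn_main
        return self.act(bn(x))
```

At test time the `adversarial` flag is simply never set, so only the main BNs participate, matching the paper's observation that the auxiliary branch adds no inference cost.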
Empirical Findings
The paper demonstrates significant improvements across various models and datasets. Noteworthy results include:
- With AdvProp, EfficientNet-B7 gains +0.7% top-1 accuracy on ImageNet, along with improvements of +6.5% on ImageNet-C, +7.0% on ImageNet-A, and +4.8% on Stylized-ImageNet.
- AdvProp enables EfficientNet-B8 to reach a state-of-the-art 85.5% top-1 accuracy on ImageNet without extra training data, surpassing models that rely on substantially more data and parameters.
Robustness and Generalization
The paper shows that AdvProp improves both robustness and generalization. Earlier studies commonly found that adversarial training lowers accuracy on clean images; by normalizing the two distributions with separate BNs, AdvProp draws the benefits of adversarial examples without compromising clean-image performance. A sketch of the corresponding training step follows.
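The training loop can be sketched as below, again assuming PyTorch and a model whose forward pass accepts the `adversarial` flag from the block above. The helper names `pgd_attack` and `advprop_step` and the attacker hyperparameters are illustrative, not the paper's exact settings; the key point is that adversarial examples are generated through the auxiliary BNs and the final loss sums a clean term (main BNs) and an adversarial term (auxiliary BNs).

```python
# Hedged sketch of one AdvProp-style training step (PyTorch assumed).
import torch
import torch.nn.functional as F

def pgd_attack(model, images, labels, epsilon=4/255, alpha=1/255, steps=5):
    """Generate adversarial examples on the fly, using the auxiliary BNs."""
    adv = images.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        loss = F.cross_entropy(model(adv, adversarial=True), labels)
        grad, = torch.autograd.grad(loss, adv)
        adv = adv.detach() + alpha * grad.sign()          # gradient-sign step
        adv = images + (adv - images).clamp(-epsilon, epsilon)  # project into L_inf ball
        adv = adv.clamp(0.0, 1.0)                         # keep valid pixel range
    return adv.detach()

def advprop_step(model, optimizer, images, labels):
    adv_images = pgd_attack(model, images, labels)

    # Joint objective: clean examples flow through the main BNs,
    # adversarial examples through the auxiliary BNs.
    clean_loss = F.cross_entropy(model(images, adversarial=False), labels)
    adv_loss = F.cross_entropy(model(adv_images, adversarial=True), labels)
    loss = clean_loss + adv_loss

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```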
Practical Implications
The results suggest that AdvProp is particularly beneficial for larger networks, which have enough capacity to exploit the harder adversarial examples as additional training signal. The findings argue for revisiting conventional adversarial training paradigms and exploring adversarially-enhanced learning more broadly.
Theoretical Implications
The proposed two-batchnorm framework introduces a novel dimension to understanding adversarial learning. It provides concrete evidence that disentangling distributional differences through auxiliary parameters can lead to significant performance gains, suggesting further exploration into complex data augmentation strategies and adversarial training approaches.
Future Directions
Looking ahead, the paper opens avenues for further exploration of auxiliary BN structures and their application in other domains. Scaling the approach to architectures beyond ConvNets, and pursuing finer-grained disentangled learning to maximize model capacity, could yield further advances in robust model training.
In summary, this paper contributes a methodologically sound and empirically validated approach to improving image recognition through adversarial examples, challenging the traditional view that adversarial examples are inherently detrimental.