- The paper introduces EAD, a method that leverages elastic-net regularization to blend L1 and L2 norms for generating sparse adversarial examples.
- The approach reframes adversarial generation as an optimization task that achieves high success rates while reducing L1 distortions on datasets like MNIST, CIFAR10, and ImageNet.
- EAD significantly improves attack transferability across diverse network architectures, prompting a reexamination of defense strategies focused solely on L2 and L∞ metrics.
Overview of EAD: Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples
This paper introduces Elastic-Net Attacks to Deep Neural Networks (EAD), a novel technique for crafting adversarial examples with L1-oriented perturbations. The current landscape of adversarial example generation is dominated by approaches built around L2 and L∞ distortion metrics, which yield perturbations that are virtually imperceptible to human observers yet disrupt the predictions of state-of-the-art deep neural networks (DNNs). The paper identifies the sparsity-promoting character of the L1 distortion, which also accounts for the total variation of a perturbation, as an underexploited opportunity for advancing adversarial example generation.
EAD reframes adversarial example generation as an optimization problem with elastic-net regularization, a weighted combination of L1 and L2 penalty terms. As a result, existing L2 attacks become a special case of EAD when the L1 penalty is set to zero. This generalization yields adversarial examples whose attack success rates match those of state-of-the-art techniques across datasets including MNIST, CIFAR10, and ImageNet. Beyond raw attack efficacy, the core contribution is a notable improvement in attack transferability across DNN models and their variants, such as defensively distilled networks.
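For concreteness, the elastic-net formulation can be sketched as the objective below, minimizing c·f(x, t) + β·||x − x0||_1 + ||x − x0||_2^2 over the (box-constrained) adversarial input x. This is a minimal sketch only: the use of PyTorch, the variable names, and the default coefficients are illustrative assumptions, not the authors' implementation, and the [0, 1] box constraint is assumed to be enforced elsewhere.

```python
import torch

def ead_objective(x_adv, x_orig, logits, target, c=1.0, beta=1e-3, kappa=0.0):
    """Elastic-net attack objective (sketch): c * f(x, t) + beta * L1 + squared L2.

    Setting beta = 0 removes the L1 penalty and leaves a C&W-style L2 objective,
    which is how EAD subsumes existing L2 attacks as a special case.
    """
    # Targeted attack loss f(x, t): drive the target-class logit above all
    # other logits by at least the confidence margin kappa.
    target_logit = logits[target]
    other_max = torch.max(torch.cat([logits[:target], logits[target + 1:]]))
    f_loss = torch.clamp(other_max - target_logit, min=-kappa)

    delta = x_adv - x_orig
    l1 = delta.abs().sum()       # sparsity-promoting L1 penalty
    l2_sq = (delta ** 2).sum()   # squared L2 penalty

    return c * f_loss + beta * l1 + l2_sq
```

In the paper, the non-smooth L1 term is handled with an iterative shrinkage-thresholding (ISTA) update rather than plain gradient descent; the snippet above only expresses the objective being minimized.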
Experimental Evaluation
Extensive experimentation underscores the efficacy of EAD. The authors evaluated DNNs trained on MNIST, CIFAR10, and ImageNet, comparing EAD with leading methods such as Carlini and Wagner's L2 attack (C&W) and variants of the fast gradient method (FGM) and iterative FGM (I-FGM). The results show that EAD matches the attack success rates of these alternatives while achieving markedly lower L1 distortions, making it a useful complement for analyzing attack diversity and model robustness.
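For reference, a minimal sketch of how the per-example distortion statistics reported in such comparisons are typically computed; the NumPy batch interface here is an assumption, not the paper's evaluation code.

```python
import numpy as np

def distortion_metrics(x_orig, x_adv):
    """Average per-example L1, L2, and L-infinity distortions of a batch of
    adversarial examples relative to the original inputs."""
    delta = (x_adv - x_orig).reshape(len(x_orig), -1)  # flatten each example
    return {
        "L1":   np.abs(delta).sum(axis=1).mean(),
        "L2":   np.sqrt((delta ** 2).sum(axis=1)).mean(),
        "Linf": np.abs(delta).max(axis=1).mean(),
    }
```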
Remarkably, EAD improves transfer attack success rates, achieving markedly higher rates than L2-based methods when adversarial examples crafted on one network are applied to a different architecture. This finding not only demonstrates the method's potency but also points to the broader implications of adopting L1-centric perspectives in adversarial learning.
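As an illustration, transferability is typically quantified as the fraction of adversarial examples crafted against one (source) model that are also misclassified by an independently trained target model. The sketch below assumes a hypothetical `predict` method returning class scores; it is not the paper's evaluation code.

```python
import numpy as np

def transfer_success_rate(adv_examples, true_labels, target_model):
    """Untargeted transfer success: fraction of adversarial examples, crafted
    against a separate source model, that the target model misclassifies."""
    preds = np.argmax(target_model.predict(adv_examples), axis=1)
    return float(np.mean(preds != true_labels))
```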
Implications and Future Directions
The introduction of EAD broadens the toolbox available to adversarial machine learning researchers, providing a new lens through which to examine model vulnerabilities and resilience. From a security standpoint, it suggests that existing defenses built around minimizing L2 and L∞ metrics may need to be reevaluated to counteract L1-based perturbations effectively.
Examining the interplay between EAD and adversarial defenses, particularly those that leverage multiple types of adversarial examples (such as hybrid adversarial training), presents an intriguing avenue for future work. Such explorations could illuminate strategies for hardening models in deployment scenarios where adversarial threats are an operational reality.
The paper therefore makes a significant contribution to adversarial learning by bringing elastic-net regularization principles, long established in feature selection, into adversarial example synthesis, furthering our understanding of DNN susceptibility and resilience.