Stronger and Faster Wasserstein Adversarial Attacks

Published 6 Aug 2020 in cs.LG and stat.ML | arXiv:2008.02883v1

Abstract: Deep models, while being extremely flexible and accurate, are surprisingly vulnerable to "small, imperceptible" perturbations known as adversarial attacks. While the majority of existing attacks focus on measuring perturbations under the $\ell_p$ metric, Wasserstein distance, which takes geometry in pixel space into account, has long been known to be a suitable metric for measuring image quality and has recently risen as a compelling alternative to the $\ell_p$ metric in adversarial attacks. However, constructing an effective attack under the Wasserstein metric is computationally much more challenging and calls for better optimization algorithms. We address this gap in two ways: (a) we develop an exact yet efficient projection operator to enable a stronger projected gradient attack; (b) we show that the Frank-Wolfe method equipped with a suitable linear minimization oracle works extremely fast under Wasserstein constraints. Our algorithms not only converge faster but also generate much stronger attacks. For instance, we decrease the accuracy of a residual network on CIFAR-10 to $3.4\%$ within a Wasserstein perturbation ball of radius $0.005$, in contrast to $65.6\%$ using the previous Wasserstein attack based on an \emph{approximate} projection operator. Furthermore, employing our stronger attacks in adversarial training significantly improves the robustness of adversarially trained models.

Citations (32)

Summary

  • The paper introduces an efficient projection operator for PGD and a Frank-Wolfe method to generate Wasserstein adversarial examples.
  • It achieves rapid convergence and reduces ResNet accuracy on CIFAR-10 to 3.4%, significantly outperforming previous approaches.
  • The proposed algorithms are computationally efficient and scalable, enhancing both adversarial evaluation and training on large datasets.

Introduction

The paper explores the vulnerability of deep neural networks to adversarial attacks, focusing on adversarial examples constructed under the Wasserstein distance. Adversarial examples are small perturbations of the input that mislead neural networks, raising security concerns. Traditional approaches measure perturbations under the $\ell_p$ metric, which does not always reflect perceptual similarity between images. The Wasserstein distance is proposed instead because it accounts for geometric information in pixel space, leading to more perceptually realistic adversarial attacks. However, computing adversarial examples under a Wasserstein constraint is computationally demanding, motivating the need for more efficient algorithms.
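For concreteness, the Wasserstein distance in this setting treats two (nonnegative, equal-mass) images $x$ and $y$ as distributions of pixel mass and measures the cheapest way to transport one onto the other. A standard formulation (the specific ground cost $C$, e.g. Euclidean distance between pixel coordinates, is a common choice rather than something fixed by this summary) is

\[
W_C(x, y) \;=\; \min_{\Pi \ge 0} \; \langle \Pi, C \rangle
\quad \text{s.t.} \quad \Pi \mathbf{1} = x, \;\; \Pi^\top \mathbf{1} = y,
\]

where $\Pi_{ij}$ is the mass moved from pixel $i$ to pixel $j$ and $C_{ij}$ its unit cost. The attack then searches within the Wasserstein ball $\{y : W_C(x, y) \le \epsilon\}$ around the clean image $x$.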

Formulation and Contributions

The study addresses the difficulty of creating Wasserstein adversarial examples by introducing novel methods. Key innovations include:

  1. Exact Projection Operator for PGD: The paper presents an efficient operator for projecting exactly onto the Wasserstein ball, strengthening the Projected Gradient Descent (PGD) attack (a generic PGD skeleton is sketched after this list).
  2. Frank-Wolfe Method with Linear Minimization Oracle: Equipping the Frank-Wolfe algorithm with a suitable linear minimization oracle substantially accelerates attack generation.
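The following is a minimal PGD skeleton in PyTorch, included only to make the recipe concrete. The constraint set is abstracted behind a `project` callable: the paper's contribution is an exact, efficient projection onto the Wasserstein ball that would be supplied at that point. Everything shown here (function names, the sign-of-gradient step) is a generic sketch, not the paper's exact algorithm.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, project, step_size=0.01, num_steps=50):
    """Generic projected gradient attack.

    `project(adv, x)` must map an iterate back onto the feasible set
    around the clean input `x`; the paper's exact Wasserstein-ball
    projection would be plugged in here.
    """
    adv = x.clone().detach()
    for _ in range(num_steps):
        adv.requires_grad_(True)
        loss = F.cross_entropy(model(adv), y)
        grad, = torch.autograd.grad(loss, adv)
        # Ascend the loss, then project back onto the constraint set.
        adv = adv.detach() + step_size * grad.sign()
        adv = project(adv, x).clamp(0.0, 1.0)
    return adv.detach()
```

As a smoke test one can pass a plain $\ell_\infty$ clipping projection, e.g. `project = lambda adv, x: torch.min(torch.max(adv, x - 0.03), x + 0.03)`; swapping in an exact Wasserstein projection is precisely what the paper argues makes the attack stronger.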

These two methods deliver attacks that converge faster while remaining computationally efficient, making attacks on large-scale datasets feasible. The practical impact is striking: within a Wasserstein ball of radius 0.005, the methods reduce a ResNet model's accuracy on CIFAR-10 to 3.4%, whereas the previous Wasserstein attack, based on an approximate projection, only brought it down to 65.6%.

Practical Implementation

Implementing the proposed techniques involves several steps:

  • Projection Method for PGD: Enhance PGD with a novel projection operator capable of computing exact projections onto the Wasserstein ball. This involves solving a specific quadratic program efficiently, leveraging the dual problem of the projection step to expedite the solution.
  • Frank-Wolfe with Entropic Regularization: Improve Frank-Wolfe's efficiency by adding an entropic regularization term, which makes the linear minimization step cheap to approximate and reduces the iterations required for attack generation (a generic Sinkhorn-style sketch follows this list).
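To illustrate the entropic-regularization machinery, the classic Sinkhorn iteration below solves an entropy-regularized optimal transport problem between two histograms by alternating scaling updates. This is a minimal sketch of the underlying idea, not the paper's exact linear minimization oracle; all names, sizes, and parameters here are illustrative.

```python
import numpy as np

def sinkhorn(a, b, C, reg=0.05, num_iters=200):
    """Entropy-regularized optimal transport via Sinkhorn iterations.

    Approximately solves  min_P <P, C> + reg * sum(P * log P)
    subject to  P @ 1 = a  and  P.T @ 1 = b.
    """
    K = np.exp(-C / reg)                  # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(num_iters):
        v = b / (K.T @ u)                 # match column marginals
        u = a / (K @ v)                   # match row marginals
    return u[:, None] * K * v[None, :]    # transport plan

# Tiny usage example: two 3-bin histograms with |i - j| ground cost.
a = np.array([0.5, 0.3, 0.2])
b = np.array([0.2, 0.3, 0.5])
C = np.abs(np.arange(3)[:, None] - np.arange(3)[None, :]).astype(float)
P = sinkhorn(a, b, C)
print(P.sum(axis=1), P.sum(axis=0))       # approximately a and b
```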

The algorithms are adaptable to standard GPU infrastructure, exploiting computational efficiencies such as sparsity in the transportation matrix to handle large datasets like ImageNet.
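The sparsity remark can be made concrete. If mass is only allowed to move within a small local window (a 5×5 window is an illustrative assumption, not a value fixed by the paper), the transport matrix for an $n$-pixel image has roughly $25n$ admissible entries rather than $n^2$. A minimal sketch of enumerating the admissible pixel pairs:

```python
import numpy as np

def local_transport_pairs(h, w, window=5):
    """Indices (source, target) of pixel pairs allowed to exchange
    mass when transport is restricted to a window x window neighborhood."""
    r = window // 2
    rows, cols = [], []
    for y in range(h):
        for x in range(w):
            src = y * w + x
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    ty, tx = y + dy, x + dx
                    if 0 <= ty < h and 0 <= tx < w:
                        rows.append(src)
                        cols.append(ty * w + tx)
    return np.array(rows), np.array(cols)

rows, cols = local_transport_pairs(32, 32)   # CIFAR-10 sized image
print(len(rows), "admissible entries vs", (32 * 32) ** 2, "dense entries")
```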

Experimental Results and Observations

The proposed methods were rigorously tested across several datasets—MNIST, CIFAR-10, and ImageNet. Key findings include:

  • Convergence Speed: Both PGD with dual projection and Frank-Wolfe methods significantly outpace the projected Sinkhorn method in terms of convergence speed and achieve stronger attack effectiveness.
  • Trade-offs in Entropic Regularization: The choice of entropic regularization impacts the distortion pattern of the perturbations. Larger regularization may increase perceptual alignment with the original image shape, though at the cost of attack strength.

Figure 1: Loss evolution with respect to iterations on CIFAR-10.

Conclusion

The methods developed in this paper provide significant advances in generating Wasserstein adversarial attacks that are stronger, faster, and more computationally efficient. These advances are useful not only for evaluating model robustness but also for adversarial training, yielding models that are more resilient to such attacks. The approach extends beyond image perturbations to other problems constrained by Wasserstein distances, pointing toward future work on certifying and defending against sophisticated adversarial manipulations across machine learning contexts.
