An Alternative Surrogate Loss for PGD-based Adversarial Testing (1910.09338v1)

Published 21 Oct 2019 in cs.LG and stat.ML

Abstract: Adversarial testing methods based on Projected Gradient Descent (PGD) are widely used for searching norm-bounded perturbations that cause the inputs of neural networks to be misclassified. This paper takes a deeper look at these methods and explains the effect of different hyperparameters (i.e., optimizer, step size and surrogate loss). We introduce the concept of MultiTargeted testing, which makes clever use of alternative surrogate losses, and explain when and how MultiTargeted is guaranteed to find optimal perturbations. Finally, we demonstrate that MultiTargeted outperforms more sophisticated methods and often requires less iterative steps than other variants of PGD found in the literature. Notably, MultiTargeted ranks first on MadryLab's white-box MNIST and CIFAR-10 leaderboards, reducing the accuracy of their MNIST model to 88.36% (with $\ell_\infty$ perturbations of $\epsilon = 0.3$) and the accuracy of their CIFAR-10 model to 44.03% (at $\epsilon = 8/255$). MultiTargeted also ranks first on the TRADES leaderboard reducing the accuracy of their CIFAR-10 model to 53.07% (with $\ell_\infty$ perturbations of $\epsilon = 0.031$).

Citations (88)

Summary

  • The paper proposes MultiTargeted testing, which cycles through alternative targeted surrogate losses across restarts to improve PGD-based adversarial evaluations.
  • It provides detailed hyperparameter tuning insights, elucidating choices like optimizer, step size, and loss functions for robust attack performance.
  • Empirical results show its effectiveness, with state-of-the-art reductions in model accuracy on benchmarks like MNIST and CIFAR-10.

An Analysis of "An Alternative Surrogate Loss for PGD-based Adversarial Testing"

The paper introduces a strategic enhancement to Projected Gradient Descent (PGD) adversarial testing, which is widely used to evaluate the robustness of neural networks against adversarial perturbations. The focus is on revisiting the loss functions employed within these methods, specifically proposing an alternative strategy named MultiTargeted testing. This approach leverages alternative targeted surrogate losses; the paper develops its theoretical underpinnings and demonstrates its practical superiority over conventional PGD variants across multiple datasets.
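To ground the discussion, here is a minimal sketch of the kind of $\ell_\infty$-bounded PGD attack the paper analyzes. It is an illustrative PyTorch implementation under common assumptions (random start, sign-gradient steps, cross-entropy surrogate by default); the function name and the `loss_fn` hook are mine, not the authors' code. The `loss_fn` hook is where alternative surrogate losses of the kind studied in the paper would be plugged in.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, epsilon, step_size, num_steps, loss_fn=None):
    """Untargeted l_inf PGD around x: sign-gradient ascent on a surrogate loss,
    projected back onto the epsilon-ball and the valid input range.
    Illustrative sketch only, not the paper's reference implementation."""
    if loss_fn is None:
        # Default surrogate: cross-entropy with respect to the true label.
        loss_fn = lambda logits, labels: F.cross_entropy(logits, labels)

    # Random start inside the epsilon-ball (a common PGD choice).
    x_adv = (x + torch.empty_like(x).uniform_(-epsilon, epsilon)).clamp(0.0, 1.0).detach()

    for _ in range(num_steps):
        x_adv.requires_grad_(True)
        loss = loss_fn(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + step_size * grad.sign()                         # ascent step
            x_adv = torch.min(torch.max(x_adv, x - epsilon), x + epsilon)   # project onto the ball
            x_adv = x_adv.clamp(0.0, 1.0)                                   # valid pixel range
    return x_adv.detach()
```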

Key Contributions

  1. Hyperparameter Insights for PGD: The paper meticulously dissects the choices of optimizer, step size, and surrogate loss in the PGD framework, offering a comprehensive guide on parameter tuning. Such insights are crucial as they affect the success rate of adversarial attacks significantly.
  2. MultiTargeted Testing: The novel MultiTargeted testing approach diverges from traditional single-surrogate-loss methods by employing a series of targeted logit differences across restarts (a code sketch follows this list). This strategy ensures a more thorough exploration of the perturbations allowed by the threat model, thereby improving the discovery of adversarial examples.
  3. Empirical Validation: Strong numerical results substantiate the claims. On MadryLab's white-box MNIST and CIFAR-10 leaderboards, MultiTargeted ranks first, reducing the MNIST model's accuracy to 88.36% (at $\epsilon = 0.3$) and the CIFAR-10 model's accuracy to 44.03% (at $\epsilon = 8/255$). It also achieves top performance on the TRADES leaderboard.
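To make item 2 concrete, the sketch below reuses the `pgd_attack` helper from the earlier snippet and cycles a targeted logit-difference surrogate ($z_t - z_y$) over candidate classes, one PGD restart per target, keeping any perturbation that flips the prediction. The control flow and names are an assumption for illustration, not the authors' exact restart schedule.

```python
import torch

def multitargeted_test(model, x, y, epsilon, step_size, num_steps, num_classes):
    """One PGD restart per target class t != y, each maximizing the logit
    difference z_t - z_y. Assumes a batch of size 1 and an integer label y.
    Illustrative sketch of the MultiTargeted idea, not the reference code."""
    for target in range(num_classes):
        if target == y:
            continue

        # Targeted surrogate for this restart: raise the target logit above the true one.
        def logit_diff(logits, label, t=target):
            return (logits[:, t] - logits[:, label]).sum()

        x_adv = pgd_attack(model, x, y, epsilon, step_size, num_steps,
                           loss_fn=logit_diff)

        with torch.no_grad():
            if model(x_adv).argmax(dim=1).item() != y:
                return x_adv, False   # found a misclassifying perturbation
    return x, True                    # no restart succeeded within the budget
```

Cycling targets across restarts is what distinguishes this from a single untargeted run: each restart optimizes a different, simpler surrogate, so the union of restarts covers more of the decision boundary for the same overall step budget.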

Analytical Insights

The authors provide an in-depth analysis of PGD variants, demonstrating that while baseline attacks such as FGSM$^K$ (iterated FGSM) provide a reasonable first estimate of robustness, they become suboptimal against more sophisticated models and datasets. Experiments with both locally and globally linear models highlight scenarios where MultiTargeted finds the optimal attack more efficiently than standard PGD.
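As a brief worked illustration of why per-target optimization suffices in the linear case (my own sketch, not an equation taken from the paper): for a classifier with logits $z(x) = Wx + b$, the worst-case targeted margin over the $\ell_\infty$ ball has a closed form,

$$\max_{\|\delta\|_\infty \le \epsilon} \big(z_t(x+\delta) - z_y(x+\delta)\big) = (w_t - w_y)^\top x + (b_t - b_y) + \epsilon\,\lVert w_t - w_y\rVert_1,$$

so solving each targeted problem separately and taking the maximum over targets $t \neq y$ recovers the globally worst-case perturbation. This is precisely the search that MultiTargeted carries out, one target per restart.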

Practical and Theoretical Implications

Practically, MultiTargeted testing can be directly applied to enhance the robustness evaluations of neural networks, ensuring more reliable adversarial tests. Theoretically, the exploration of convex adversarial input sets and their propagated counterparts offers deep insights into the attack surfaces of neural networks, fostering future research on model interpretability and robustness.

Speculations on Future Developments

As neural architectures evolve, integrating strategies such as MultiTargeted could redefine adversarial testing paradigms. Addressing the non-convexity in deeper models and extending such techniques to more complex threat models and specifications may further refine robustness evaluation methods. The potential blending of these strategies with more advanced optimizers or novel loss functions remains an intriguing avenue for future investigation.

This paper's structured approach towards comprehending and exploiting surrogate losses within the PGD framework underscores its contribution to advancing adversarial robustness evaluation. The results and methodologies proposed hold promise for both immediate application and as a foundation for future exploratory work in the domain of AI security and robustness.
