Attack as Defense: Characterizing Adversarial Examples using Robustness (2103.07633v1)

Published 13 Mar 2021 in cs.CR, cs.AI, and cs.SE

Abstract: As a new programming paradigm, deep learning has expanded its application to many real-world problems. At the same time, deep-learning-based software has been found to be vulnerable to adversarial attacks. Though various defense mechanisms have been proposed to improve the robustness of deep learning software, many of them are ineffective against adaptive attacks. In this work, we propose a novel characterization to distinguish adversarial examples from benign ones, based on the observation that adversarial examples are significantly less robust than benign ones. As existing robustness measurements do not scale to large networks, we propose a novel defense framework, named attack as defense (A2D), to detect adversarial examples by effectively evaluating an example's robustness. A2D uses the cost of attacking an input as its robustness measure and flags the less robust examples as adversarial, since less robust examples are easier to attack. Extensive experimental results on MNIST, CIFAR10 and ImageNet show that A2D is more effective than recent promising approaches. We also evaluate our defense against potential adaptive attacks and show that A2D is effective in defending against carefully designed adaptive attacks, e.g., the attack success rate drops to 0% on CIFAR10.
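
The idea of using attack cost as a robustness score can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy linear classifier in place of a deep network, a simple signed-gradient step in place of the attacks used in the paper, and a hand-picked threshold. The names `predict`, `attack_cost`, `is_flagged_adversarial` and all parameter values are hypothetical.

```python
import numpy as np

def predict(w, b, x):
    """Toy binary linear classifier: 1 if w.x + b > 0, else 0."""
    return int(np.dot(w, x) + b > 0)

def attack_cost(w, b, x, step=0.125, max_steps=200):
    """Count the signed-gradient steps needed to flip the predicted label.

    Hypothetical stand-in for "the cost of attacking an input".
    Returns max_steps if the label does not flip within the budget.
    """
    label = predict(w, b, x)
    # Move against the score if the label is 1, along it if the label is 0.
    direction = -np.sign(w) if label == 1 else np.sign(w)
    x_adv = np.asarray(x, dtype=float).copy()
    for i in range(1, max_steps + 1):
        x_adv = x_adv + step * direction
        if predict(w, b, x_adv) != label:
            return i
    return max_steps

def is_flagged_adversarial(w, b, x, threshold):
    """Flag inputs that are cheap to attack, i.e. less robust (assumed rule)."""
    return attack_cost(w, b, x) < threshold

# Illustrative inputs (assumed values, not data from the paper).
w, b = np.array([1.0, -1.0]), 0.0
far_from_boundary = np.array([2.0, -2.0])   # plays the role of a robust, benign input
near_boundary = np.array([0.25, 0.0])       # plays the role of a fragile, adversarial input

cost_far = attack_cost(w, b, far_from_boundary)
cost_near = attack_cost(w, b, near_boundary)
flag_far = is_flagged_adversarial(w, b, far_from_boundary, threshold=5)
flag_near = is_flagged_adversarial(w, b, near_boundary, threshold=5)
```

In this sketch the input near the decision boundary needs far fewer steps to flip than the one far from it, so only the former falls below the threshold. How the paper measures attack cost and chooses its detection threshold is not specified in the abstract.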

Authors (6)

Zhe Zhao
Guangke Chen
Jingyi Wang
Yiwei Yang
Fu Song
Jun Sun

Citations (29)
