Evaluating and Understanding the Robustness of Adversarial Logit Pairing

Published 26 Jul 2018 in stat.ML, cs.CR, cs.CV, and cs.LG | (1807.10272v2)

Abstract: We evaluate the robustness of Adversarial Logit Pairing (ALP), a recently proposed defense against adversarial examples. We find that a network trained with ALP achieves 0.6% accuracy in the threat model in which the defense is considered. We provide a brief overview of the defense and the threat models and claims considered, as well as a discussion of the methodology and results of our attack, which may offer insights into the reasons underlying the vulnerability of ALP to adversarial attack.