Locally optimal detection of stochastic targeted universal adversarial perturbations (2012.04692v1)
Published 8 Dec 2020 in cs.CV
Abstract: Deep learning image classifiers are known to be vulnerable to small adversarial perturbations of input images. In this paper, we derive the locally optimal generalized likelihood ratio test (LO-GLRT) based detector for detecting stochastic targeted universal adversarial perturbations (UAPs) of the classifier inputs. We also describe a supervised training method to learn the detector's parameters, and show that the detector outperforms other detection methods on several popular image classification datasets.
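
The abstract gives no formulas, so the following is only a generic textbook illustration of what "locally optimal" means in detection theory, not the paper's LO-GLRT detector. It assumes a deliberately simplified model that is not taken from the paper: an observation x = theta * u + w with a known direction u, white Gaussian noise w of standard deviation sigma, and a test of theta = 0 against small theta > 0. Under that assumption, the locally optimal statistic is the derivative of the log-likelihood at theta = 0, normalized by the square root of the Fisher information. The names `lo_statistic`, `threshold`, and `detect` are hypothetical.

```python
import math
from statistics import NormalDist


def lo_statistic(x, u, sigma):
    """Normalized locally optimal statistic under the ASSUMED toy model
    x = theta * u + w, with w ~ N(0, sigma^2 I) and u known.

    d/dtheta log p(x; theta) at theta = 0 equals (u . x) / sigma^2, and the
    Fisher information at theta = 0 equals ||u||^2 / sigma^2, so the
    normalized statistic is (u . x) / (sigma * ||u||), which is N(0, 1)
    when no perturbation is present.
    """
    dot = sum(xi * ui for xi, ui in zip(x, u))
    norm_u = math.sqrt(sum(ui * ui for ui in u))
    return dot / (sigma * norm_u)


def threshold(pfa):
    """Threshold giving false-alarm probability pfa for an N(0, 1) statistic."""
    return NormalDist().inv_cdf(1.0 - pfa)


def detect(x, u, sigma, pfa=0.01):
    """Declare a perturbation present when the statistic exceeds the threshold."""
    return lo_statistic(x, u, sigma) > threshold(pfa)
```

In this toy setting the detector reduces to a matched filter along u. The paper's setting differs: the perturbation is stochastic, it acts on classifier inputs, and the detector's parameters are learned by supervised training.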