Thundernna: a white box adversarial attack (2111.12305v2)
Abstract: Existing work shows that neural networks trained with naive gradient-based optimization are prone to adversarial attacks: adding a small malicious perturbation to an ordinary input is enough to make the network produce a wrong output. At the same time, attacking a neural network is key to improving its robustness, since training on adversarial examples can make a network resist certain kinds of adversarial attacks. An adversarial attack can also reveal characteristics of the neural network, a complex high-dimensional non-linear function, as discussed in previous work. In this project, we develop a first-order method to attack neural networks. Compared with other first-order attacks, our method has a much higher success rate, and it is much faster than second-order attacks and multi-step first-order attacks.
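The abstract does not spell out the Thundernna update rule itself, so the sketch below only illustrates the family of single-step first-order attacks it is compared against, using FGSM (Goodfellow et al., cited below) as the canonical example. The PyTorch model, loss function, and epsilon budget are assumptions made for illustration, not details taken from the paper.

```python
# Minimal sketch of a single-step first-order attack (FGSM), assuming a PyTorch
# classifier, a differentiable loss, and inputs normalized to [0, 1].
import torch

def fgsm_attack(model, loss_fn, x, y, epsilon=8 / 255):
    """Return x perturbed by one signed-gradient step of size epsilon."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)   # forward pass on the (copied) clean input
    loss.backward()                   # first-order gradient of the loss w.r.t. the input
    with torch.no_grad():
        x_adv = x_adv + epsilon * x_adv.grad.sign()  # single ascent step on the loss
        x_adv = x_adv.clamp(0.0, 1.0)                # keep pixels in a valid range
    return x_adv.detach()

# Example usage (hypothetical model and data loader):
#   x_adv = fgsm_attack(model, torch.nn.CrossEntropyLoss(), images, labels)
```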
- “Second-order adversarial attack and certifiable robustness,” 2019.
- Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian J. Goodfellow, and Rob Fergus, “Intriguing properties of neural networks,” 2014.
- Anh Nguyen, Jason Yosinski, and Jeff Clune, “Deep neural networks are easily fooled: High confidence predictions for unrecognizable images,” 2015.
- Anish Athalye, Logan Engstrom, Andrew Ilyas, and Kevin Kwok, “Synthesizing robust adversarial examples,” 2018.
- Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, Omar Fawzi, and Pascal Frossard, “Universal adversarial perturbations,” 2017.
- Konda Reddy Mopuri, Utsav Garg, and R. Venkatesh Babu, “Fast feature fool: A data independent approach to universal adversarial perturbations,” 2017.
- Konda Reddy Mopuri, Aditya Ganeshan, and R. Venkatesh Babu, “Generalizable data-free objective for crafting universal adversarial perturbations,” 2018.
- Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy, “Explaining and harnessing adversarial examples,” 2015.
- Alhussein Fawzi, Seyed-Mohsen Moosavi-Dezfooli, and Pascal Frossard, “Robustness of classifiers: from adversarial to random noise,” 2016.
- Alhussein Fawzi, Omar Fawzi, and Pascal Frossard, “Analysis of classifiers' robustness to adversarial perturbations,” 2018.
- Alexey Kurakin, Ian J. Goodfellow, and Samy Bengio, “Adversarial machine learning at scale,” 2016.
- Nicolas Papernot, Patrick D. McDaniel, Somesh Jha, Matt Fredrikson, Z. Berkay Celik, and Ananthram Swami, “The limitations of deep learning in adversarial settings,” 2016.
- Jiawei Su, Danilo Vasconcellos Vargas, and Kouichi Sakurai, “One pixel attack for fooling deep neural networks,” 2017.
- Nicholas Carlini and David Wagner, “Towards evaluating the robustness of neural networks,” 2017.
- Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, and Pascal Frossard, “DeepFool: a simple and accurate method to fool deep neural networks,” 2016.
- Alexey Kurakin, Ian J. Goodfellow, and Samy Bengio, “Adversarial examples in the physical world,” 2016.
- Jan Hendrik Metzen, Mummadi Chaithanya Kumar, Thomas Brox, and Volker Fischer, “Universal adversarial perturbations against semantic image segmentation,” 2017.
- Naveed Akhtar and Ajmal Mian, “Threat of adversarial attacks on deep learning in computer vision: A survey,” 2018.