An Insightful Overview of "How Deep Learning Sees the World: A Survey on Adversarial Attacks and Defenses"
The paper "How Deep Learning Sees the World: A Survey on Adversarial Attacks and Defenses" offers a comprehensive exploration of adversarial attacks and defenses related to Deep Neural Networks (DNNs). The work is grounded in the premise that despite the impressive capabilities of DNNs in areas like object and face recognition, these models are particularly susceptible to adversarial attacks, which introduce subtle perturbations to input data, significantly altering the model's output.
The authors, Joana C. Costa et al., present a structured synthesis of adversarial techniques along several dimensions, providing a groundwork for understanding the current challenges in adversarial machine learning. Their taxonomy categorizes attacks by the knowledge available to the attacker into two principal types: white-box attacks, in which the attacker has complete visibility into the model architecture and possibly its training data, and black-box attacks, which rely on limited knowledge, often restricted to the model's outputs.
White-box vs Black-box Adversarial Attacks
White-box attacks are characterized by the attacker's complete access to the DNN, which allows precise crafting of adversarial examples. Key methods such as the Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD) are discussed in detail, covering both their mechanisms and their efficacy. Black-box attacks, by contrast, are often considered more realistic for real-world scenarios but require strategies to estimate the model's gradient or to attack through alternative channels, exemplified by query-based methods and feature-guided attacks.
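To make the white-box setting concrete, the sketch below shows FGSM in PyTorch: a single step in the direction of the sign of the input gradient of the loss. The model, inputs, labels, and the epsilon budget are illustrative assumptions rather than details taken from the survey.

    import torch
    import torch.nn.functional as F

    def fgsm_attack(model, images, labels, epsilon=8 / 255):
        """Craft adversarial examples with the Fast Gradient Sign Method."""
        images = images.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(images), labels)
        loss.backward()
        # Perturb each pixel by epsilon in the direction of the loss gradient's sign.
        adv_images = images + epsilon * images.grad.sign()
        # Keep pixel values in the valid [0, 1] range.
        return adv_images.clamp(0, 1).detach()

FGSM takes only a single gradient step, which is what makes it fast; PGD can be read as an iterative refinement of the same idea, repeatedly stepping and projecting back into an epsilon-ball around the original input.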
Defense Mechanisms
The survey further delineates adversarial defenses into several categories, each addressing a different aspect of the threat model: adversarial training, which augments the training process with adversarial examples to improve robustness; modifications to the training process or network architecture that make models intrinsically resistant to adversarial perturbations; supplementary networks that detect or filter out adversarial inputs; and regular testing and validation to ensure model integrity.
Adversarial training remains one of the most prominent and effective methods discussed, achieving significant robustness by continually exposing the model to adversarial inputs during its learning phase. The authors highlight the necessity of such ongoing adaptation, as traditional static defenses may not suffice given the evolving nature of adversarial techniques.
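As an illustration of the idea rather than the survey's own implementation, the sketch below pairs a PGD inner loop with a standard training step in PyTorch; the model, data loader, optimizer, and the epsilon, alpha, and num_steps hyperparameters are assumptions made for the example.

    import torch
    import torch.nn.functional as F

    def pgd_attack(model, images, labels, epsilon=8 / 255, alpha=2 / 255, num_steps=10):
        """Generate adversarial examples with Projected Gradient Descent."""
        # Start from a random point inside the epsilon-ball around the input.
        adv = (images + torch.empty_like(images).uniform_(-epsilon, epsilon)).clamp(0, 1)
        for _ in range(num_steps):
            adv.requires_grad_(True)
            loss = F.cross_entropy(model(adv), labels)
            grad = torch.autograd.grad(loss, adv)[0]
            # Ascend the loss, then project back into the epsilon-ball and valid pixel range.
            adv = adv.detach() + alpha * grad.sign()
            adv = (images + (adv - images).clamp(-epsilon, epsilon)).clamp(0, 1)
        return adv.detach()

    def adversarial_training_epoch(model, loader, optimizer):
        """One epoch of training on adversarially perturbed inputs."""
        model.train()
        for images, labels in loader:
            adv_images = pgd_attack(model, images, labels)
            optimizer.zero_grad()
            loss = F.cross_entropy(model(adv_images), labels)
            loss.backward()
            optimizer.step()

Training on the worst-case inputs found by the inner attack loop is what drives the robustness gains described in the survey, at the cost of a substantially more expensive training procedure.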
Implications and Future Directions
The implications of this survey are manifold. Practically, robust defense mechanisms are critical for the secure deployment of DNNs in sensitive applications such as self-driving vehicles and healthcare, where erroneous predictions could have severe consequences. Theoretically, the work underscores persistent vulnerabilities in current models and calls for solutions that go beyond current limitations, including a better understanding of the adversarial space and the development of inherently robust architectures.
The paper concludes by identifying open research areas, urging an expansion of adversarial robustness research beyond commonly used datasets like CIFAR-10 and MNIST, towards more complex datasets such as ImageNet. It also suggests a focus on black-box attack methodologies and the exploration of Vision Transformers' (ViTs) robustness to adversarial attacks.
Through an organized presentation of the existing literature, the researchers provide a valuable resource for readers looking to dive deeper into adversarial machine learning. Whether for application in new scenarios or for academic advancement, the paper lays a foundation on which future contributions to adversarial defense can build.