- The paper presents DeepSweep, an evaluation framework that uses a two-stage data augmentation strategy to mitigate backdoor attacks in DNNs.
- DeepSweep significantly reduces the Attack Success Rate (ASR) across various backdoor types while preserving clean data accuracy, demonstrating broad applicability.
- The DeepSweep framework demonstrates superior robustness and generalizability against diverse backdoor attacks compared to previous defense strategies.
Overview of "DeepSweep: An Evaluation Framework for Mitigating DNN Backdoor Attacks using Data Augmentation"
The paper, titled "DeepSweep: An Evaluation Framework for Mitigating DNN Backdoor Attacks using Data Augmentation," presents a comprehensive evaluation and defense strategy against backdoor attacks in Deep Neural Networks (DNNs). Backdoor attacks threaten model integrity: an adversary implants malicious behavior that activates only when a particular input pattern, or trigger, is present. The research addresses the shortcomings of existing defense mechanisms by introducing a framework that uses data augmentation to both identify and counteract backdoor vulnerabilities.
The authors propose the DeepSweep framework, which evaluates how effectively data augmentation can mitigate backdoor attacks in DNNs. The framework follows a two-stage approach: first, it fine-tunes the model's weights on augmented data, weakening the influence of any backdoor embedded in the model; second, it pre-processes inputs with augmentation functions at inference time to disrupt potential trigger patterns. Together, these two stages alter both the model's learned decision boundaries and the trigger activation pathways.
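To make the two-stage idea concrete, the sketch below shows one plausible realization in PyTorch. It is not the authors' implementation: the specific transforms, the `clean_loader`, the input size, and the hyperparameters are illustrative assumptions.

```python
# Minimal sketch of the two-stage defense idea (not the authors' code).
# The transform choices, `clean_loader`, and hyperparameters are assumptions.
import torch
from torchvision import transforms

# Stage 1: fine-tune the (possibly backdoored) model on augmented clean data
# so that weights tied to the trigger are weakened.
train_augment = transforms.Compose([
    transforms.RandomAffine(degrees=10, translate=(0.1, 0.1)),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
])

def fine_tune(model, clean_loader, epochs=5, lr=1e-3):
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    loss_fn = torch.nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in clean_loader:
            opt.zero_grad()
            loss = loss_fn(model(train_augment(x)), y)
            loss.backward()
            opt.step()
    return model

# Stage 2: pre-process every input at inference time so that residual
# trigger patterns are disrupted before the model sees them.
inference_preprocess = transforms.Compose([
    transforms.GaussianBlur(kernel_size=3),
    transforms.RandomCrop(32, padding=2),  # assumes 32x32 inputs (e.g. CIFAR-style)
])

@torch.no_grad()
def predict(model, x):
    model.eval()
    return model(inference_preprocess(x)).argmax(dim=1)
```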
A significant aspect of the DeepSweep framework is its systematic methodology for identifying an effective combination of augmentation functions from a pre-defined library. The process evaluates a wide range of augmentation techniques, from conventional image transformations to operations adapted from adversarial-machine-learning defenses. The framework iterates over a search set of known backdoor attacks to refine augmentation policies toward the best trade-off: reducing the attack success rate (ASR) while preserving model performance on clean data.
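One simple way to picture this policy search is an exhaustive scan over small combinations of augmentation operations, scored against the set of known attacks. The following is a hedged sketch under that assumption, not DeepSweep's actual search procedure; the library contents and the evaluation callables are placeholders.

```python
# Hypothetical sketch of an augmentation-policy search; the operation names
# and the evaluation callables are placeholders, not DeepSweep's actual API.
from itertools import combinations

AUGMENTATION_LIBRARY = ["affine", "color_jitter", "blur", "crop", "jpeg_compress"]

def search_policy(evaluate_asr, evaluate_clean_acc, known_attacks,
                  max_ops=3, min_clean_acc=0.90):
    """Return the augmentation combination with the lowest average ASR
    over the known attacks, subject to a clean-accuracy floor."""
    best_policy, best_asr = None, 1.0
    for k in range(1, max_ops + 1):
        for policy in combinations(AUGMENTATION_LIBRARY, k):
            # Average attack success rate over the search set of known attacks.
            asr = sum(evaluate_asr(policy, attack) for attack in known_attacks)
            asr /= len(known_attacks)
            acc = evaluate_clean_acc(policy)
            # Keep the policy that minimizes ASR while clean accuracy stays high.
            if acc >= min_clean_acc and asr < best_asr:
                best_policy, best_asr = policy, asr
    return best_policy, best_asr
```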
In the extensive evaluation presented in the paper, the authors show that the two-stage augmentation strategy substantially reduces ASR across multiple backdoor types, including BadNets, Trojan attacks, and both visible and imperceptible triggers, while preserving clean-data accuracy. The defense is tested against eight diverse backdoor scenarios, demonstrating broad applicability and its usefulness as a benchmark for hardening DNNs.
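For reference, the two metrics reported in such evaluations can be computed as below. This is a hedged sketch: `apply_policy` and the data loaders are hypothetical placeholders, and an all-to-one attack (every triggered input mapped to a single target label) is assumed.

```python
# Hedged sketch of the two reported metrics; `apply_policy` and the loaders
# are hypothetical placeholders, and an all-to-one target label is assumed.
import torch

@torch.no_grad()
def attack_success_rate(model, triggered_loader, target_label, apply_policy):
    """Fraction of trigger-carrying inputs still classified as the attacker's target."""
    model.eval()
    hits, total = 0, 0
    for x, _ in triggered_loader:  # inputs already contain the backdoor trigger
        preds = model(apply_policy(x)).argmax(dim=1)
        hits += (preds == target_label).sum().item()
        total += x.size(0)
    return hits / total

@torch.no_grad()
def clean_accuracy(model, clean_loader, apply_policy):
    """Top-1 accuracy on benign inputs after the same pre-processing."""
    model.eval()
    correct, total = 0, 0
    for x, y in clean_loader:
        preds = model(apply_policy(x)).argmax(dim=1)
        correct += (preds == y).sum().item()
        total += x.size(0)
    return correct / total
```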
Compared to contemporary defenses such as Neural Cleanse or Fine-Pruning, DeepSweep exhibits stronger robustness and generalizability. Earlier defenses often rely on restrictive assumptions about the attack scenario or offer limited flexibility across attack configurations. DeepSweep's adaptive policy discovery extends the state of the art by providing a broad-spectrum defense at practical computational cost.
In conclusion, DeepSweep offers a practical framework for building DNNs that are robust against both established and emerging backdoor threats. By combining data augmentation with an intelligent policy search, the work charts a versatile path forward in AI security. The release of DeepSweep as an open-source tool encourages collaborative refinement of defense strategies and may drive future work on securing machine learning applications.