Efficient Backdoor Removal Through Natural Gradient Fine-tuning (2306.17441v1)

Published 30 Jun 2023 in cs.CV and eess.IV

Abstract: The success of a deep neural network (DNN) heavily relies on the details of the training scheme, e.g., training data, architectures, and hyper-parameters. Recent backdoor attacks suggest that an adversary can take advantage of such training details and compromise the integrity of a DNN. Our studies show that a backdoor model is usually optimized to a bad local minimum, i.e., a sharper minimum than that of a benign model. Intuitively, a backdoor model can be purified by re-optimizing it to a smoother minimum through fine-tuning with a few clean validation data. However, fine-tuning all DNN parameters often incurs a huge computational cost and often results in sub-par clean test performance. To address this concern, we propose a novel backdoor purification technique, Natural Gradient Fine-tuning (NGF), which focuses on removing the backdoor by fine-tuning only one layer. Specifically, NGF utilizes a loss surface geometry-aware optimizer that can successfully overcome the challenge of reaching a smooth minimum under a one-layer optimization scenario. To enhance the generalization performance of our proposed method, we introduce a clean data distribution-aware regularizer based on the knowledge of the loss surface curvature matrix, i.e., the Fisher Information Matrix. Extensive experiments show that the proposed method achieves state-of-the-art performance on a wide range of backdoor defense benchmarks: four different datasets (CIFAR10, GTSRB, Tiny-ImageNet, and ImageNet) and 13 recent backdoor attacks, e.g., Blend, Dynamic, WaNet, and ISSBA.
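
The abstract does not spell out the NGF update rule, so the following is only a minimal sketch of the general idea it names: a natural-gradient step applied to a single layer, using a small set of clean samples. It assumes a frozen feature extractor (represented here by precomputed feature vectors), a binary logistic final layer, and a damped diagonal empirical Fisher approximation; these choices, along with the names `natural_gradient_step`, `mean_loss`, `lr`, and `damping`, are illustrative assumptions and not the authors' implementation. The Fisher-based regularizer mentioned in the abstract is not included.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, x):
    """Probability of class 1 from a single linear layer (assumed setup)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))


def mean_loss(w, feats, labels):
    """Average binary cross-entropy over the clean samples."""
    total = 0.0
    for x, y in zip(feats, labels):
        p = predict(w, x)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(feats)


def per_sample_grads(w, feats, labels):
    """Gradient of the loss w.r.t. the layer weights, one list per sample."""
    return [[(predict(w, x) - y) * xi for xi in x] for x, y in zip(feats, labels)]


def natural_gradient_step(w, feats, labels, lr=0.5, damping=1e-3):
    """One update w <- w - lr * F^{-1} g on the single trainable layer.

    F is approximated by the diagonal of the empirical Fisher (mean of
    squared per-sample gradients) plus damping. This approximation is an
    assumption of this sketch, not a detail taken from the paper.
    """
    grads = per_sample_grads(w, feats, labels)
    n, d = len(grads), len(w)
    mean_grad = [sum(g[i] for g in grads) / n for i in range(d)]
    fisher_diag = [sum(g[i] ** 2 for g in grads) / n + damping for i in range(d)]
    return [wi - lr * gi / fi for wi, gi, fi in zip(w, mean_grad, fisher_diag)]
```

In this sketch only `w` is updated, mirroring the one-layer fine-tuning setting; dividing the gradient by the Fisher estimate rescales each coordinate by the local curvature, which is the sense in which the optimizer is geometry-aware.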

Authors (5)

Nazmul Karim
Abdullah Al Arafat
Umar Khalid
Zhishan Guo
Naznin Rahnavard

Citations (1)
