- The paper introduces SFIBA, a full-target backdoor attack that invisibly injects spatially localized triggers to manipulate deep neural networks.
- It employs a frequency-domain approach using FFT, DWT, and SVD to achieve high attack success rates while preserving benign performance.
- The method demonstrates robust evasion of defenses like Fine-Pruning and Neural Cleanse, highlighting significant challenges in DNN security.
A Technical Overview of "SFIBA: Spatial-based Full-target Invisible Backdoor Attacks"
This paper presents a novel approach to backdoor attacks on deep neural networks (DNNs), aiming to make such attacks both comprehensive and invisible. The main contribution is SFIBA, a Spatial-based Full-target Invisible Backdoor Attack, designed to manipulate DNNs under black-box conditions where the attacker can only modify training data. This essay dissects the methodology, results, and implications of the technique for researchers and practitioners interested in the security of machine learning systems.
Introduction and Problem Statement
Backdoor attacks compromise DNNs by injecting malicious triggers into training data, causing models to misclassify triggered inputs at inference time. Traditional methods often focus on single-target attacks, which limits their scope and effectiveness. Multi-target attacks, capable of redirecting inputs to multiple classes, offer a broader attack surface but struggle with trigger specificity and stealthiness, especially under black-box restrictions where only the training data can be modified.
SFIBA addresses these challenges by integrating triggers into the training data without visual detectability. The method confines each trigger to a localized spatial region in pixel space and injects it through a frequency-domain transformation to maintain stealth. This strategy keeps the backdoor effective while preserving the model's performance on benign samples.
Figure 1: Schematic of multi-target backdoor attack.
Methodology
The proposed SFIBA technique rests on two components: spatial localization of trigger regions under morphological constraints, and a frequency-domain trigger-injection scheme.
Spatial Localization and Morphological Constraints
SFIBA constrains each trigger's spatial footprint by dividing the image into isolated blocks, each associated with a specific target class. Assigning a unique trigger to a distinct spatial block keeps the triggers' influence non-overlapping, mitigating interference among the multiple injected backdoors.
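To make the block assignment concrete, here is a minimal Python sketch of one way to map each target class to its own non-overlapping block. The row-major grid layout, block size, and function names are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

def class_to_block(class_idx: int, img_size: int = 32, block_size: int = 8):
    """Map a target class index to a unique, non-overlapping spatial block.

    Illustrative assumption: blocks tile the image in row-major order,
    so a 32x32 image with 8x8 blocks supports up to 16 target classes.
    """
    blocks_per_row = img_size // block_size
    row = (class_idx // blocks_per_row) * block_size
    col = (class_idx % blocks_per_row) * block_size
    return row, col

def block_mask(class_idx: int, img_size: int = 32, block_size: int = 8) -> np.ndarray:
    """Binary mask selecting the trigger region for a given target class."""
    mask = np.zeros((img_size, img_size), dtype=np.float32)
    r, c = class_to_block(class_idx, img_size, block_size)
    mask[r:r + block_size, c:c + block_size] = 1.0
    return mask

# Example: class 5 of a CIFAR10-sized image occupies the block at (8, 8).
print(class_to_block(5))  # -> (8, 8)
```

Because the blocks tile the image without overlap, every class up to the block count can carry an independent trigger without interfering with the others.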
Frequency-Domain Trigger Injection
The frequency-domain methodology first applies the Fast Fourier Transform (FFT) to convert pixel-space data into amplitude and phase spectra. A Discrete Wavelet Transform (DWT) and Singular Value Decomposition (SVD) are then used to inject triggers stealthily into amplitude features, balancing trigger efficacy against visual quality. Because the perturbation lives in the frequency domain, the backdoor is invisible in pixel space, which is crucial for evading visual inspection or simple filtering mechanisms.
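The following sketch illustrates an FFT-to-DWT-to-SVD pipeline of this kind for a single-channel block, using NumPy and PyWavelets. The Haar wavelet, the choice of the LL sub-band, and the blend strength `alpha` are assumptions for illustration, not the paper's exact parameters.

```python
import numpy as np
import pywt  # PyWavelets

def inject_trigger(block: np.ndarray, trigger: np.ndarray, alpha: float = 0.05) -> np.ndarray:
    """Embed a trigger into one spatial block via FFT -> DWT -> SVD.

    Single-channel sketch; apply per channel for RGB. `trigger` must
    match the LL sub-band shape (half the block size per axis).
    """
    # 1. FFT: split the block into amplitude and phase spectra.
    spectrum = np.fft.fft2(block)
    amplitude, phase = np.abs(spectrum), np.angle(spectrum)

    # 2. DWT: decompose the amplitude spectrum into sub-bands.
    LL, (LH, HL, HH) = pywt.dwt2(amplitude, "haar")

    # 3. SVD: blend the trigger's singular values into the LL sub-band.
    U, S, Vt = np.linalg.svd(LL, full_matrices=False)
    _, S_t, _ = np.linalg.svd(trigger, full_matrices=False)
    S_mixed = (1 - alpha) * S + alpha * S_t
    LL_mixed = U @ np.diag(S_mixed) @ Vt

    # 4. Invert the DWT and FFT to return to pixel space.
    amplitude_mixed = pywt.idwt2((LL_mixed, (LH, HL, HH)), "haar")
    poisoned = np.fft.ifft2(amplitude_mixed * np.exp(1j * phase))
    return np.real(poisoned)

# Example: poison one 8x8 block with a random 4x4 trigger pattern.
rng = np.random.default_rng(0)
poisoned = inject_trigger(rng.random((8, 8)), rng.random((4, 4)))
```

Blending singular values rather than raw pixels perturbs the block's global amplitude structure, which is why the change remains imperceptible after the inverse transforms.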
Figure 2: SFIBA's attack process, where AS represents the amplitude spectrum.
Experimental Results and Evaluation
The efficacy of SFIBA is validated across diverse datasets, including CIFAR10, GTSRB, and ImageNet100, using models like PreActResNet18 and VGG19. Results highlight that SFIBA outperforms existing multi-target backdoor attacks in both full-target capacity and visual stealthiness.
Effectiveness and Stealthiness: SFIBA achieves a high Attack Success Rate (ASR) while maintaining Benign Accuracy (BA), and remains robust even under stringent data augmentation. On stealthiness, image-quality metrics including PSNR, SSIM, and LPIPS show that SFIBA's poisoned samples are harder to detect than the baselines'.
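For reference, PSNR and SSIM between a clean image and its poisoned counterpart can be computed with scikit-image as below; LPIPS additionally requires a pretrained network (e.g., the `lpips` package) and is omitted. The float-in-[0, 1], channels-last layout is an assumption.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def stealth_metrics(clean: np.ndarray, poisoned: np.ndarray):
    """PSNR and SSIM between clean and poisoned images.

    Assumes float images in [0, 1] with shape (H, W, C); higher
    PSNR/SSIM means the trigger is less visible.
    """
    psnr = peak_signal_noise_ratio(clean, poisoned, data_range=1.0)
    ssim = structural_similarity(clean, poisoned, channel_axis=-1, data_range=1.0)
    return psnr, ssim

# Usage (illustrative): psnr, ssim = stealth_metrics(clean_img, poisoned_img)
```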
Figure 3: Visual effects and residuals of SFIBA and baselines on ImageNet.
Resilience Against Defense Techniques
SFIBA withstands various state-of-the-art defense mechanisms, including Fine-Pruning, Neural Cleanse, and STRIP. It bypasses these defenses largely because its triggers are spatially confined and injected through frequency-domain transformations, leaving few of the pixel-space artifacts such defenses look for.
Implications and Future Directions
The introduction of SFIBA raises significant concerns about the security of DNNs, particularly in applications involving sensitive data or requiring high integrity. Its ability to operate under black-box settings suggests that current defenses must evolve to detect such sophisticated backdoor strategies. Future research should focus on detection methodologies capable of recognizing these nuanced attacks, for example through deeper analysis of frequency-domain characteristics, or on learning paradigms that are inherently resistant to such vulnerabilities.
In conclusion, this work provides a robust framework for understanding and executing invisible, full-target backdoor attacks while setting a new standard for evaluating DNN security vulnerabilities. The implications of SFIBA extend beyond theoretical research, necessitating practical advancements in defense strategies to safeguard the next generation of neural network deployments.