AnywhereDoor: Flexible Backdoor Attack
- AnywhereDoor is a multi-target backdoor paradigm for object detection that allows adversaries to remove, misclassify, or generate objects on demand.
- It introduces objective disentanglement, trigger mosaicking, and strategic batching to achieve scalable, robust, and balanced poisoning across classes.
- Experiments on benchmarks like VOC07+12 and COCO show high attack success rates (up to 97.8%) with minimal impact on clean detection performance.
AnywhereDoor is a multi-target backdoor attack paradigm for object detection that gives adversaries flexible, inference-time control over a poisoned detector. Unlike prior attacks, which are limited to a single pre-defined malicious target, AnywhereDoor lets attackers make objects disappear, fabricate synthetic ones, or mislabel objects at will, either universally or for specific classes, within a single poisoned detector. Its capabilities result from three main innovations: objective disentanglement for scalable multi-behavioral triggers, trigger mosaicking for robustness to region-based detection, and strategic batching to overcome class imbalance in object-centric datasets. Extensive experiments demonstrate that AnywhereDoor achieves high attack success rates (ASRs), up to 97.8% in untargeted scenarios and an average 26% improvement over competitive multi-trigger baselines, while incurring negligible loss in clean mean average precision (mAP) (Lu et al., 2024, Lu et al., 9 Mar 2025).
1. Threat Model, Attack Scope, and Behavioral Taxonomy
AnywhereDoor assumes a data-poisoning scenario with full adversary control over the detector's training pipeline. The attacker is able to:
- Modify a fraction of training images and their annotations in each mini-batch on-the-fly.
- Embed a learnable trigger generator that produces trigger patterns conditioned on an intent vector.
- At inference, inject a trigger into any input image, accompanied by a selectable intent vector, to activate specific malicious behaviors.
AnywhereDoor supports five composable attack scenarios:
- Untargeted Removal: All objects are suppressed.
- Targeted Removal: Objects of a chosen class are erased.
- Untargeted Misclassification: All objects are relabeled to incorrect classes.
- Targeted Misclassification: Objects of a specific class are relabeled as a chosen target class.
- Untargeted Generation: Each object in the image is duplicated with small geometric perturbations.
Clean images (without the trigger) maintain near-nominal detection performance (Lu et al., 2024, Lu et al., 9 Mar 2025).
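The annotation changes implied by these scenarios can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `poison_annotations`, the tuple layout for boxes, and the `copies` and `jitter` parameters are all assumptions introduced here.

```python
import random

def poison_annotations(boxes, scenario, victim=None, target=None,
                       num_classes=20, copies=1, jitter=2.0, seed=0):
    """Return adversarially modified annotations for one image (hypothetical helper).

    boxes:    list of (x1, y1, x2, y2, class_id) tuples.
    scenario: 'remove', 'misclassify', or 'generate'.
    victim:   class to attack; None selects the untargeted variant.
    target:   replacement label for targeted misclassification.
    """
    rng = random.Random(seed)
    out = []
    for box in boxes:
        x1, y1, x2, y2, c = box
        affected = victim is None or c == victim
        if scenario == "remove":
            # Affected objects vanish from the annotations.
            if not affected:
                out.append(box)
        elif scenario == "misclassify":
            if not affected:
                out.append(box)
            elif target is not None:
                out.append((x1, y1, x2, y2, target))
            else:
                # Untargeted: any label other than the true one.
                wrong = [k for k in range(num_classes) if k != c]
                out.append((x1, y1, x2, y2, rng.choice(wrong)))
        elif scenario == "generate":
            # Keep the original and add slightly shifted duplicates.
            out.append(box)
            for _ in range(copies):
                dx = rng.uniform(-jitter, jitter)
                dy = rng.uniform(-jitter, jitter)
                out.append((x1 + dx, y1 + dy, x2 + dx, y2 + dy, c))
        else:
            raise ValueError(f"unknown scenario: {scenario}")
    return out
```

Calling `poison_annotations(boxes, "remove", victim=0)` drops only class-0 boxes, while omitting `victim` applies the manipulation to every object.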
2. Core Technical Innovations
2.1 Objective Disentanglement
The number of attack configurations grows rapidly with the number of classes if each attack-trigger pair is treated independently: targeted misclassification alone has one configuration per ordered (source, target) class pair. AnywhereDoor decomposes the objective with an intent embedding:
- The intent vector has two components, which encode per-class removal intent and per-class generation/misclassification intent.
- The trigger generator is split into two sub-generators, one per intent component; the full trigger is formed from their outputs.
- The detector and trigger generator are trained jointly via the standard detection loss, with poisoned batches constructed according to the selected intent.
This modularity lets the intent encoding grow linearly with the number of classes, compared to the quadratic growth of approaches that handle each configuration separately (Lu et al., 2024).
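To make the scaling argument concrete, the sketch below builds a two-part intent vector and compares its size with the number of (source, target) pairs. The multi-hot encoding, the function name `build_intent`, and the reading of misclassification as "remove the source class and generate the target class" are assumptions for illustration; the source does not spell out the exact encoding.

```python
def build_intent(num_classes, remove=(), generate=()):
    """Concatenate two multi-hot vectors: [removal part | generation part].

    Hypothetical encoding: position c in the first half marks class c for
    removal; position c in the second half marks class c for generation.
    """
    e_remove = [1.0 if c in remove else 0.0 for c in range(num_classes)]
    e_generate = [1.0 if c in generate else 0.0 for c in range(num_classes)]
    return e_remove + e_generate

n = 20  # e.g. the 20 PASCAL VOC classes

# Assumed composition: misclassifying class 14 as class 6 is expressed as
# "remove 14" in the first half plus "generate 6" in the second half.
e = build_intent(n, remove={14}, generate={6})

pairwise_triggers = n * (n - 1)  # one dedicated trigger per ordered class pair
intent_dims = len(e)             # 2 * n entries cover the same pairs
```

For 20 classes this gives 380 ordered pairs against a 40-dimensional intent vector, which is the linear-versus-quadratic gap described above.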
2.2 Trigger Mosaicking
Standard backdoor triggers can be neutralized in region-based detectors (e.g., Faster R-CNN, DETR) if their spatial footprint is localized. Mosaicking addresses this:
- The trigger is produced by the trigger generator and tiled across the input image: x_prime = clamp(x + epsilon * T, 0, 1), where T is the tiled trigger.
- Each region, regardless of windowing or cropping, receives a complete, recognizable trigger signature.
- The perturbation magnitude is controlled by epsilon, which keeps the trigger visually inconspicuous.
High-level pseudocode:
```
t_prime = sigmoid(t)                     # squash the raw trigger into (0, 1)
T = tile(t_prime)                        # repeat over H x W
x_prime = clamp(x + epsilon * T, 0, 1)   # bounded additive perturbation
```
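The pseudocode above can be written as a runnable NumPy sketch. The function name `mosaic_trigger`, the channels-last layout, the default `epsilon`, and the cropping of tiles that overhang the image border are assumptions made here, not details confirmed by the source.

```python
import numpy as np

def mosaic_trigger(x, t, epsilon=0.05):
    """Tile a small trigger over an image and add it with bounded magnitude.

    x: float image in [0, 1], shape (H, W, C).
    t: raw (unbounded) trigger, shape (h, w, C) with the same C as x.
    epsilon: perturbation scale (illustrative default).
    """
    H, W, _ = x.shape
    h, w, _ = t.shape
    t_prime = 1.0 / (1.0 + np.exp(-t))         # sigmoid: values in (0, 1)
    reps = (-(-H // h), -(-W // w), 1)         # ceiling division per axis
    T = np.tile(t_prime, reps)[:H, :W, :]      # crop tiles that overhang
    return np.clip(x + epsilon * T, 0.0, 1.0)  # stay in valid pixel range
```

Because the sigmoid output lies in (0, 1), each pixel moves by at most `epsilon`, and every crop at least as large as the tile contains a full period of the trigger pattern.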
2.3 Strategic Batching
Object detection datasets exhibit severe class imbalances, negatively impacting the efficacy of targeted attacks. Strategic batching introduces adaptive sampling:
- Target classes for poisoning are sampled in proportion to their occurrence frequency, boosting exposure for rare and co-occurring object classes.
- In each mini-batch, a subset of images containing the victim class is selected for poisoning.
- The approach mitigates under-poisoning of minority classes and improves overall ASR consistency, particularly in targeted settings.
Algorithmic schema:
```
for each batch B:
    sample class C_t ~ P(class)
    select S ⊆ B with p*N images containing C_t
    for x in S:
        x_prime = f(x, G_phi(e))
        Y_prime = P(Y, e)
    train on (B \ S) ∪ {(x_prime, Y_prime)}
```
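The selection step of this schema can be sketched in plain Python. The function name `split_batch`, the `(image_id, classes)` batch representation, and the reading of `p*N` as a poisoning rate times the batch size are assumptions; the class distribution is passed in as `class_probs` rather than fixed, since the source does not give it in closed form.

```python
import random

def split_batch(batch, class_probs, poison_rate, rng):
    """Pick a target class and the images to poison in one mini-batch.

    batch:       list of (image_id, set_of_class_ids) pairs.
    class_probs: dict mapping class_id -> sampling weight (assumed input).
    poison_rate: fraction of the batch to poison (the p in p*N).
    Returns (target_class, poisoned_ids, clean_ids).
    """
    classes = list(class_probs)
    weights = [class_probs[c] for c in classes]
    target = rng.choices(classes, weights=weights, k=1)[0]

    # Only images that actually contain the target class are eligible.
    candidates = [img for img, present in batch if target in present]
    budget = int(poison_rate * len(batch))
    chosen = set(rng.sample(candidates, min(budget, len(candidates))))

    clean = [img for img, _ in batch if img not in chosen]
    return target, sorted(chosen), clean
```

The poisoned subset would then receive the trigger and relabeled annotations, while the clean remainder is trained on unchanged.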
3. Attack Pipeline: Training and Inference Workflow
AnywhereDoor's end-to-end training involves:
- Strategic Batch Construction: As above, batches are class-balanced and a fraction of images are poisoned according to sampled intents.
- On-the-Fly Poisoning: Each poisoned image receives a mosaicked trigger patch generated from the corresponding intent embedding. Ground-truth annotations are relabeled to reflect the intended adversarial effect.
- Joint Optimization: The detection model and trigger generator are optimized simultaneously with poisoned and clean data.
- Inference Control: After training, the adversary can generate trigger patterns at inference by specifying the intent vector, gaining on-demand control over the detector's outputs (Lu et al., 9 Mar 2025).
4. Experimental Assessment
4.1 Datasets and Detectors
- Datasets: PASCAL VOC07+12 (20 classes), MS COCO (80 classes, attacks commonly evaluated on five traffic-related classes).
- Architectures: Faster R-CNN (ResNet-50-FPN), DETR (ResNet-50), YOLOv3 (DarkNet-53).
4.2 Metrics
- Clean mAP@50: Post-training accuracy on clean images.
- Attack Success Rate (ASR): Proportion of bounding boxes manipulated according to the attack objective given trigger injection.
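As an illustration of the ASR metric, the sketch below scores a removal attack: a ground-truth box counts as removed when no detection of the same class overlaps it sufficiently. The function names, the box tuple layout, and the IoU threshold of 0.5 are assumptions; the papers' exact matching rule may differ.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2, ...) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def removal_asr(gt_boxes, detections, iou_thr=0.5):
    """Fraction of ground-truth boxes that the triggered detector misses.

    gt_boxes, detections: lists of (x1, y1, x2, y2, class_id) tuples
    collected on triggered images (hypothetical format).
    """
    if not gt_boxes:
        return 0.0
    removed = 0
    for g in gt_boxes:
        matched = any(d[4] == g[4] and iou(g, d) >= iou_thr
                      for d in detections)
        if not matched:
            removed += 1
    return removed / len(gt_boxes)
```

Misclassification and generation scenarios would need their own success criteria (for example, checking the predicted label or counting extra boxes), which are not sketched here.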
4.3 Results
The following table summarizes outcomes on VOC07+12 for the three detectors (Lu et al., 2024):
| Model | Clean mAP | Unt. Removal | Tar. Removal | Unt. Miscls. | Tar. Miscls. | Unt. Generation |
|---|---|---|---|---|---|---|
| Faster R-CNN | 76.3 (−1.3) | 97.5% | 86.2% | 97.8% | 80.6% | 88.8% |
| DETR | 78.7 (−0.3) | 96.6% | 91.1% | 99.6% | 83.0% | 98.3% |
| YOLOv3 | 74.5 (+6.6) | 99.9% | 97.5% | 95.6% | 52.2% | 50.7% |
On COCO, untargeted ASRs are 94%, targeted removal ranges 45–50%, targeted misclassification 16–63%, and untargeted generation 56–98%, depending on model (Lu et al., 2024).
AnywhereDoor outperforms both BadNet-style and Marksman/Imperio-style baselines, yielding up to 26% ASR improvement on the hardest (targeted misclassification) scenarios.
5. Robustness, Limitations, and Defensive Analysis
5.1 Robustness and Limitations
- The retention of non-target classes in targeted attacks can be imperfect for highly dominant classes (e.g., "person").
- The attack space currently encompasses five discrete manipulations; complex conditional or geometric attacks (e.g., spatially relocating boxes) remain unsupported.
5.2 Defensive Evaluation
Multiple defense strategies were evaluated:
- Input Sanitizers: JPEG compression and mean/median filtering preserve clean mAP but reduce ASR by less than 5 percentage points.
- Model Retraining/Fine-tuning: Misclassification ASR drops somewhat, but overall attack effectiveness remains.
- Pruning/Fine-pruning: ASR drops to 9–13%, but only at the cost of a catastrophic mAP collapse (from 76 to 27).
No currently available defense neutralizes AnywhereDoor without unacceptable loss in detection utility (Lu et al., 2024).
6. Implications and Future Directions
AnywhereDoor demonstrates that modern object detectors in critical applications such as autonomous driving, surveillance, and medical imaging are susceptible to covert, highly flexible, context-free backdoor attacks. Its design suggests the need for new defenses capable of verifying consistency across existence, localization, and classification hypotheses within object detectors.
Potential future extensions include:
- Context-aware triggers: Triggers that activate preferentially based on scene configuration.
- Physical-world resilience: Trigger patterns robust to viewpoint and illumination variation.
- Defense development: Joint detection/sanitization pipelines, anomaly detection focusing on patch region self-similarity, and provable robustness for multi-task (existence, localization, label) settings (Lu et al., 2024, Lu et al., 9 Mar 2025).