
AnywhereDoor: Flexible Backdoor Attack

Updated 3 March 2026
  • AnywhereDoor is a multi-target backdoor paradigm for object detection that allows adversaries to remove, misclassify, or generate objects on demand.
  • It introduces objective disentanglement, trigger mosaicking, and strategic batching to achieve scalable, robust, and balanced poisoning across classes.
  • Experiments on benchmarks like VOC07+12 and COCO show high attack success rates (up to 97.8%) with minimal impact on clean detection performance.

AnywhereDoor is a multi-target backdoor attack paradigm for object detection that grants unprecedented, flexible, inference-time control to adversaries. Unlike prior attacks that are limited to monolithic, pre-defined malicious targets, AnywhereDoor enables attackers to make objects disappear, fabricate synthetic ones, or mislabel target classes at will—either universally or for specific classes—within a single poisoned detector. Its capabilities result from three main innovations: objective disentanglement for scalable multi-behavioral triggers, trigger mosaicking for robustness to region-based detection, and strategic batching to overcome class imbalance in object-centric datasets. Extensive experiments demonstrate that AnywhereDoor achieves high attack success rates (ASRs)—up to 97.8% in untargeted scenarios and an average 26% improvement over competitive multi-trigger baselines—while incurring negligible loss in clean mean average precision (mAP) (Lu et al., 2024, Lu et al., 9 Mar 2025).

1. Threat Model, Attack Scope, and Behavioral Taxonomy

AnywhereDoor assumes a data-poisoning scenario with full adversary control over the detector's training pipeline. The attacker is able to:

  • Modify a fraction $p$ of training images and their annotations in each mini-batch on-the-fly.
  • Embed a learnable trigger generator $G_\phi$ that produces trigger patterns conditioned on an intent vector $\mathbf{e}$.
  • At inference, inject a trigger into any input image, accompanied by a selectable intent vector, to activate specific malicious behaviors.

AnywhereDoor supports five composable attack scenarios:

  1. Untargeted Removal: All objects are suppressed; $Y' = \emptyset$.
  2. Targeted Removal: Objects of a chosen class $C_t$ are erased; $Y' = \{(B_i, C_i) \mid C_i \neq C_t\}$.
  3. Untargeted Misclassification: All objects are relabeled to incorrect classes; $Y' = \{(B_i, (C_i \bmod m)+1)\}$.
  4. Targeted Misclassification: A specific class $C_t$ is relabeled as $C_{t'}$; $Y' = \{(B_i, C_{t'}) \mid C_i = C_t\} \cup \{(B_i, C_i) \mid C_i \neq C_t\}$.
  5. Untargeted Generation: Each object in the image is duplicated $K$ times with small geometric perturbation; $Y' = \{(B_i+\Delta B_i^k, C_i)\}$ for $k=1\ldots K$.
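
The five annotation transformations above can be sketched as plain functions over a list of `(box, class_id)` pairs. This is an illustrative sketch, not the authors' code: the box format `(x1, y1, x2, y2)`, the 1-based class ids, and the uniform jitter used for $\Delta B_i^k$ (controlled by the hypothetical `max_shift` parameter) are all assumptions.

```python
import random

# Assumed conventions: a box is (x1, y1, x2, y2); class ids run from 1 to m.

def untargeted_removal(Y):
    """Y' is the empty set: every object is suppressed."""
    return []

def targeted_removal(Y, c_t):
    """Drop every object of class c_t and keep the rest."""
    return [(b, c) for (b, c) in Y if c != c_t]

def untargeted_misclassification(Y, m):
    """Shift every label cyclically: c -> (c mod m) + 1."""
    return [(b, (c % m) + 1) for (b, c) in Y]

def targeted_misclassification(Y, c_t, c_t_prime):
    """Relabel class c_t as c_t_prime; other objects are unchanged."""
    return [(b, c_t_prime if c == c_t else c) for (b, c) in Y]

def untargeted_generation(Y, K, max_shift=2.0, rng=None):
    """Emit K shifted copies of each box (the jitter model is an assumption)."""
    rng = rng or random.Random(0)
    out = []
    for (x1, y1, x2, y2), c in Y:
        for _ in range(K):
            dx = rng.uniform(-max_shift, max_shift)
            dy = rng.uniform(-max_shift, max_shift)
            out.append(((x1 + dx, y1 + dy, x2 + dx, y2 + dy), c))
    return out
```

Keeping each scenario as a pure function of the annotations makes it easy to apply the relabeling on-the-fly to whichever images are selected for poisoning.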

Clean images (without the trigger) maintain nominal detection performance (mAP degradation $\ll 1\%$) (Lu et al., 2024, Lu et al., 9 Mar 2025).

2. Core Technical Innovations

2.1 Objective Disentanglement

The output space of detection is exponential in the number of classes if each attack-trigger pair is treated independently. AnywhereDoor decomposes the objective with intent embedding:

  • Let $\mathbf{e} = [\mathbf{e}_r,\,\mathbf{e}_g]$, where $\mathbf{e}_r,\,\mathbf{e}_g \in \{0,1\}^m$ encode per-class removal and generation/misclassification intent.
  • The trigger generator is split: $\mathbf{t}_r = G_{\phi_r}(\mathbf{e}_r)$ and $\mathbf{t}_g = G_{\phi_g}(\mathbf{e}_g)$; the full trigger is $\mathbf{t} = \mathbf{t}_r + \mathbf{t}_g$.
  • The detector and trigger generator are trained jointly via the standard detection loss, with poisoned batches constructed according to the selected $\mathbf{e}$.
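
A minimal sketch of the intent vector and the split generator follows. Several pieces are assumptions made for illustration: the `LinearGenerator` class is a hypothetical stand-in for $G_{\phi_r}$ and $G_{\phi_g}$ (the source does not describe the generator architecture), and the mapping from scenarios to $(\mathbf{e}_r, \mathbf{e}_g)$ in `make_intent` is one plausible encoding rather than a confirmed one.

```python
import random

def one_hot(m, idx=None):
    """Length-m 0/1 vector; idx is a 1-based class id, None gives all zeros."""
    return [1 if idx is not None and i == idx - 1 else 0 for i in range(m)]

def make_intent(m, remove=None, generate=None, remove_all=False):
    """Build e = [e_r, e_g] (this encoding of the scenarios is an assumption)."""
    e_r = [1] * m if remove_all else one_hot(m, remove)
    e_g = one_hot(m, generate)
    return e_r, e_g

class LinearGenerator:
    """Hypothetical stand-in for a trigger generator: a linear map R^m -> R^d."""
    def __init__(self, m, d, seed):
        rng = random.Random(seed)
        self.W = [[rng.gauss(0.0, 0.1) for _ in range(m)] for _ in range(d)]

    def __call__(self, e):
        return [sum(w * x for w, x in zip(row, e)) for row in self.W]

def full_trigger(G_r, G_g, e_r, e_g):
    """t = t_r + t_g: sum of the removal and generation components."""
    t_r, t_g = G_r(e_r), G_g(e_g)
    return [a + b for a, b in zip(t_r, t_g)]
```

Under this assumed encoding, `make_intent(20, remove=15, generate=7)` would request that class 15 be relabeled as class 7, expressed as one removal intent plus one generation intent rather than a dedicated trigger for that pair.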

This modularity linearly scales the number of supported attack configurations, compared to the quadratic scaling of traditional approaches (Lu et al., 2024).

2.2 Trigger Mosaicking

Standard backdoor triggers can be neutralized in region-based detectors (e.g., Faster R-CNN, DETR) if their spatial footprint is localized. Mosaicking addresses this:

  • The trigger $t \in \mathbb{R}^{3 \times h \times w}$ is generated by $G_\phi$ and tiled across the input image: $x' = \Pi_{[0,1]} \left[ x + \Gamma(\epsilon \cdot \sigma(t)) \right]$.
  • Each region, regardless of windowing or cropping, receives a complete, recognizable trigger signature.
  • The process controls perturbation magnitude via $\epsilon$ and guarantees visual stealthiness.

High-level pseudocode:

    t_prime = sigmoid(t)                    # squash raw trigger values into (0, 1)
    T = tile(t_prime)                       # repeat the h x w trigger over H x W
    x_prime = clamp(x + epsilon * T, 0, 1)  # add scaled trigger, keep pixels in [0, 1]

This mechanism ensures trigger persistence under the input transformations inherent in region proposal networks (Lu et al., 2024, Lu et al., 9 Mar 2025).
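
For readers who want something executable, the mosaicking step can be sketched without dependencies as below. It is a sketch under stated assumptions: images and triggers are nested lists in `[C][H][W]` layout with pixel values in $[0, 1]$, and the function name `mosaic_trigger` is hypothetical. Tiling is done with modular indexing, so the trigger repeats even when the image size is not a multiple of the trigger size.

```python
import math

def sigmoid(v):
    """Numerically stable logistic function."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    z = math.exp(v)
    return z / (1.0 + z)

def mosaic_trigger(x, t, epsilon):
    """Tile a small trigger over an image and add it with clamping.

    x: image as nested lists [C][H][W] with values in [0, 1]
    t: raw trigger as nested lists [C][h][w] (unbounded values)
    """
    h, w = len(t[0]), len(t[0][0])
    out = []
    for c, channel in enumerate(x):
        rows = []
        for i, row in enumerate(channel):
            # Modular indexing repeats the h x w trigger across the H x W image.
            rows.append([
                min(1.0, max(0.0, px + epsilon * sigmoid(t[c][i % h][j % w])))
                for j, px in enumerate(row)
            ])
        out.append(rows)
    return out
```

Because the sigmoid output lies in $(0, 1)$, the per-pixel change before clamping never exceeds `epsilon`, which is how the perturbation budget is enforced in this sketch.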

2.3 Strategic Batching

Object detection datasets exhibit severe class imbalances, negatively impacting the efficacy of targeted attacks. Strategic batching introduces adaptive sampling:

  • Target classes for poisoning are sampled in proportion to their occurrence frequency, boosting exposure for rare and co-occurring object classes.
  • In each mini-batch, a subset comprising $p \cdot N$ images containing the victim class is selected for poisoning.
  • The approach mitigates under-poisoning of minority classes and improves overall ASR consistency, particularly in targeted settings.

Algorithmic schema:

    for each batch B:
        sample class C_t ~ P(class)                  # pick the victim class
        select S ⊆ B with p*N images containing C_t  # images to poison
        for x in S:
            x_prime = f(x, G_phi(e))                 # apply the mosaicked trigger
            Y_prime = P(Y, e)                        # relabel annotations per intent e
        train on (B \ S) ∪ {(x_prime, Y_prime)}

Ablation results confirm that without strategic batching, ASRs for rare classes are significantly diminished (Lu et al., 2024).
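
The batch-level selection in the schema above can be sketched as follows. The function name `poison_batch`, the `class_probs` weights, and the `poison_fn` callback are hypothetical; how the class-sampling distribution is actually set is left as a parameter here because the source does not give it in closed form.

```python
import random

def poison_batch(batch, class_probs, p, poison_fn, rng=None):
    """Poison up to round(p * N) images that contain one sampled victim class.

    batch: list of (image, annotations); annotations are (box, class_id) pairs
    class_probs: dict class_id -> sampling weight (its choice is an assumption)
    poison_fn: callable (image, annotations, c_t) -> (image', annotations')
    """
    rng = rng or random.Random(0)
    classes = list(class_probs)
    weights = [class_probs[c] for c in classes]
    c_t = rng.choices(classes, weights=weights, k=1)[0]

    # Indices of images that actually contain the victim class.
    candidates = [i for i, (_, Y) in enumerate(batch)
                  if any(c == c_t for _, c in Y)]
    budget = min(len(candidates), round(p * len(batch)))
    chosen = set(rng.sample(candidates, budget))

    out = [poison_fn(x, Y, c_t) if i in chosen else (x, Y)
           for i, (x, Y) in enumerate(batch)]
    return out, c_t, chosen
```

Capping the budget at the number of candidate images keeps the sketch well defined when a rare class appears in fewer than $p \cdot N$ images of the batch.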

3. Attack Pipeline: Training and Inference Workflow

AnywhereDoor's end-to-end training involves:

  1. Strategic Batch Construction: As above, batches are class-balanced and a fraction of images are poisoned according to sampled intents.
  2. On-the-Fly Poisoning: Each poisoned image receives a mosaicked trigger patch generated from the corresponding intent embedding. Ground-truth annotations are relabeled to reflect the intended adversarial effect.
  3. Joint Optimization: The detection model and trigger generator are optimized simultaneously with poisoned and clean data.
  4. Inference-Time Control: After training, the adversary specifies the intent vector $\mathbf{e}$ at inference to generate the corresponding trigger, gaining on-demand malicious control over the detector's outputs (Lu et al., 9 Mar 2025).

4. Experimental Assessment

4.1 Datasets and Detectors

  • Datasets: PASCAL VOC07+12 (20 classes), MS COCO (80 classes, attacks commonly evaluated on five traffic-related classes).
  • Architectures: Faster R-CNN (ResNet-50-FPN), DETR (ResNet-50), YOLOv3 (DarkNet-53).

4.2 Metrics

  • Clean mAP@50: Post-training accuracy on clean images.
  • Attack Success Rate (ASR): Proportion of bounding boxes manipulated according to the attack objective given trigger injection.
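
As one concrete reading of the ASR definition, the removal scenarios can be scored as the fraction of targeted ground-truth boxes that the triggered detector no longer reports. The sketch below is an assumption, not the papers' evaluation code: the IoU threshold of 0.5, the matching rule (same class and sufficient overlap), and the function names are all illustrative.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def removal_asr(ground_truth, detections, c_t=None, iou_thr=0.5):
    """Fraction of targeted ground-truth boxes missing from the detections.

    ground_truth, detections: lists of (box, class_id)
    c_t=None counts every ground-truth box (untargeted removal).
    """
    targets = [(b, c) for b, c in ground_truth if c_t is None or c == c_t]
    if not targets:
        return 0.0
    removed = sum(
        1 for b, c in targets
        if not any(dc == c and iou(b, db) >= iou_thr for db, dc in detections)
    )
    return removed / len(targets)
```

Misclassification and generation would need their own success criteria (a matched box with the attacker's label, or extra boxes near each object), which this sketch does not cover.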

4.3 Results

The following table summarizes outcomes on VOC07+12 for Faster R-CNN, DETR, and YOLOv3 (see Lu et al., 2024 for further results):

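| Model | Clean mAP | Untargeted Removal | Targeted Removal | Untargeted Misclassification | Targeted Misclassification | Untargeted Generation |
|---|---|---|---|---|---|---|
| Faster R-CNN | 76.3 (−1.3) | 97.5% | 86.2% | 97.8% | 80.6% | 88.8% |
| DETR | 78.7 (−0.3) | 96.6% | 91.1% | 99.6% | 83.0% | 98.3% |
| YOLOv3 | 74.5 (+6.6) | 99.9% | 97.5% | 95.6% | 52.2% | 50.7% |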

On COCO, untargeted ASRs are above 94%, targeted removal ranges 45–50%, targeted misclassification 16–63%, and untargeted generation 56–98%, depending on model (Lu et al., 2024).

AnywhereDoor outperforms both BadNet-style and Marksman/Imperio-style baselines, yielding up to 26% ASR improvement on the hardest (targeted misclassification) scenarios.

5. Robustness, Limitations, and Defensive Analysis

5.1 Robustness and Limitations

  • The retention of non-target classes in targeted attacks can be imperfect for highly dominant classes (e.g., "person").
  • The attack space currently encompasses five discrete manipulations; complex conditional or geometric attacks (e.g., spatially relocating boxes) remain unsupported.

5.2 Defensive Evaluation

Multiple defense strategies were evaluated:

  • Input Sanitizers: JPEG compression and mean/median filtering preserve clean mAP but reduce ASR by less than 5 percentage points, leaving ASR $>95\%$.
  • Model Retraining/Fine-tuning: Some reduction in misclassification ASR ($-10$ pp), but overall attack effectiveness remains.
  • Pruning/Fine-pruning: Drops ASR (down to $\sim$9–13%) at the expense of catastrophic mAP collapse (from $\sim$76 to $\sim$27).
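
To make the input-sanitizer category concrete, a median filter of the kind listed above can be sketched as follows. The kernel size, the edge-replication border handling, and the single-channel nested-list layout are assumptions; the source does not specify the exact filter settings used in the evaluation.

```python
from statistics import median

def median_filter(channel, k=3):
    """Apply a k x k median filter to one image channel (nested lists [H][W]).

    Borders are handled by clamping coordinates to the image (edge replication).
    """
    H, W = len(channel), len(channel[0])
    r = k // 2
    out = []
    for i in range(H):
        row = []
        for j in range(W):
            window = [
                channel[min(H - 1, max(0, i + di))][min(W - 1, max(0, j + dj))]
                for di in range(-r, r + 1)
                for dj in range(-r, r + 1)
            ]
            row.append(median(window))
        out.append(row)
    return out
```

A filter like this removes isolated pixel spikes, but a trigger that is tiled across the whole image is not an isolated outlier, which is consistent with the small ASR reduction reported for this class of defenses.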

No currently available defense neutralizes AnywhereDoor without unacceptable loss in detection utility (Lu et al., 2024).

6. Implications and Future Directions

AnywhereDoor demonstrates that modern object detectors in critical applications such as autonomous driving, surveillance, and medical imaging are susceptible to covert, highly flexible, context-free backdoor attacks. Its design suggests the need for new defenses capable of verifying consistency across existence, localization, and classification hypotheses within object detectors.

Potential future extensions include:

  • Context-aware triggers: Triggers that activate preferentially based on scene configuration.
  • Physical-world resilience: Trigger patterns robust to viewpoint and illumination variation.
  • Defense development: Joint detection/sanitization pipelines, anomaly detection focusing on patch region self-similarity, and provable robustness for multi-task (existence, localization, label) settings (Lu et al., 2024, Lu et al., 9 Mar 2025).