
Threat Taxonomy: Multi-Layered Attack Surfaces

Updated 20 January 2026
  • Multi-layered attack surfaces are stratified points in ML pipelines where adversaries exploit vulnerabilities in raw data, features, and models.
  • The taxonomy categorizes attack strategies by specificity, scope, and optimization across diverse system components.
  • Empirical findings highlight detectable fingerprints from composite attacks and underscore the need for holistic, adaptive defense methods.

Multi-layered attack surfaces in adversarial ML refer to the diverse, stratified points throughout a complex ML pipeline where adversaries may intervene to subvert, mislead, or compromise system behavior. The taxonomy encompasses classical evasion and poisoning attacks against inference/training data as well as more specialized threats targeting feature engineering, domain-specific representations, model architectures, and ancillary system components. The following entry provides a rigorous, technically detailed synthesis of the structure, operational characteristics, empirical findings, and ramifications of multi-layered adversarial attack surfaces, as documented in recent research.

1. Formalization of Attack Surfaces and Threat Models

A multi-layered attack surface in ML systems is characterized by the set of all components, interfaces, and representations an adversary could exploit. This spans several abstraction levels:

  • Raw Data and Preprocessing: Perturbing raw sensor readings, text tokens, or application event logs prior to feature extraction.
  • Feature Engineering / Representation: Manipulating engineered feature vectors (e.g., gradient-based or explainability-informed feature selection) without direct semantic modification of the underlying input.
  • Model Architecture and Parameters: Bit-level tampering, weight-level Trojan/backdoor insertion, or configuration-based model attacks (affecting model selection, ensembling) during or after deployment.
  • Inference Pipeline and Post-processing: Interfering with downstream ML4VIS pipelines, intelligent controllers (e.g., RIC/xApps), or interaction with shared databases and APIs.
  • System Integration Points: Exploiting ML4VIS-specific or domain-specific stages: data-visualization mapping, insight communication, RAN control loops, medical device interoperability, etc.
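For bookkeeping during forensics, the stratification above can be mirrored as a simple enumeration; the class and helper below are an illustrative sketch, not drawn from the cited works:

```python
from enum import Enum, auto

class AttackSurface(Enum):
    """Abstraction levels an adversary may target in an ML pipeline."""
    RAW_DATA = auto()           # sensor readings, tokens, event logs
    FEATURE_SPACE = auto()      # engineered or learned representations
    MODEL_PARAMETERS = auto()   # weights, bits, architecture configuration
    INFERENCE_PIPELINE = auto() # post-processing, controllers, shared APIs
    SYSTEM_INTEGRATION = auto() # domain-specific downstream stages

def tag_incident(description: str, surface: AttackSurface) -> dict:
    """Label an observed incident with the surface it exploited."""
    return {"description": description, "surface": surface.name}
```

Tagging incidents this way makes it straightforward to audit which layers of a deployment have actually been probed.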

Each layer admits different attacker knowledge levels and constraints:

  • White-box access: Full internal visibility (parameters, gradients, model code).
  • Gray-box access: Partial knowledge (feature schema, architecture type, output distributions).
  • Black-box access: Query-only or API access, possibly inferred via transfer/substitution (Wu et al., 2023).

The general attack optimization problem can be formalized as

$$\min_{x^\prime,\, w^\prime} \; S(x, x^\prime; w, w^\prime) + C(x, y; w, w^\prime) + I(x^\prime, y^\prime; w, w^\prime)$$

where $S$ encodes stealthiness constraints on the input or weight perturbations, $C$ enforces benign consistency on clean data, and $I$ enforces adversarial inconsistency (targeted misbehavior) (Wu et al., 2023).
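As a numerical sketch of this objective for a linear model (the specific forms of S, C, and I below are illustrative choices, not those prescribed by Wu et al.):

```python
import numpy as np

def composite_attack_objective(x, x_adv, w, w_adv, X_clean, y_clean, y_target,
                               lam=(1.0, 1.0, 1.0)):
    """Evaluate S + C + I for a linear model f(x) = w @ x.

    S: stealthiness   -- L2 size of the input and weight perturbations
    C: consistency    -- squared error of the tampered model on clean data
    I: inconsistency  -- squared gap to the attacker's target on x_adv
    """
    S = np.linalg.norm(x_adv - x) + np.linalg.norm(w_adv - w)
    C = np.mean((X_clean @ w_adv - y_clean) ** 2)   # benign behavior preserved
    I = (w_adv @ x_adv - y_target) ** 2             # targeted misbehavior enforced
    return lam[0] * S + lam[1] * C + lam[2] * I
```

An attacker would minimize this over (x_adv, w_adv); a defender can evaluate the same terms to quantify how stealthy a suspected tampering is.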

2. Taxonomy of Multi-layered Attacks: Structural and Methodological Classes

Assion et al. define a modular "attack generator" framework, decomposing every adversarial attack into the following blocks (Assion et al., 2019):

  • Specificity: The semantic or operational goal (untargeted, static/dynamic target, confusion/removal).
  • Scope: Input domain affected (individual example, contextual group, universal—works across the entire distribution).
  • Imperceptibility: Norm-based or functional constraints (e.g., $L^p$ bounds, total variation, semantic similarity).
  • Optimization Procedure: Gradient- or query-based search, evolutionary, combinatorial, or bilevel approaches.
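These blocks can be mirrored as a configuration record, which is convenient when cataloguing attacks against the taxonomy; the field values below are illustrative strings, not the paper's vocabulary:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AttackSpec:
    """One attack, decomposed into the four generator blocks."""
    specificity: str       # e.g. "untargeted", "static_target", "dynamic_target"
    scope: str             # e.g. "individual", "contextual", "universal"
    imperceptibility: str  # e.g. "l_inf<=eps", "total_variation", "semantic_sim"
    optimizer: str         # e.g. "gradient", "query", "evolutionary", "bilevel"

# A UAP-style attack: untargeted, universal scope, L_inf-bounded, gradient-based
uap_like = AttackSpec("untargeted", "universal", "l_inf<=eps", "gradient")
```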

Table: Layered Attack Instantiations

| Surface | Attack Type | Representative Methods / Tools |
| --- | --- | --- |
| Raw Input | Evasion, backdoor, UAP | FGSM, PGD, DeepFool, TextFooler, adversarial stickers |
| Feature Space | Sparse/targeted evasion | XAI-guided attacks, ℓ₀-minimization (Awal et al., 2024) |
| Model Weights | Bit flips, bias, Trojan | SBA, GDA, bit-flip, subnet replacement (Wu et al., 2023) |
| System/APIs | Data/DB-layer perturbation | O-RAN xApp poisoning, RIC controller malware (Sapavath et al., 2023) |
| Downstream | Visualization pipeline manipulation | DR manipulation, chart recommendation shifting (Fujiwara et al., 2024) |

This taxonomy enables systematic coverage of classic, domain-specific, and compositional adversarial phenomena.
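To make the raw-input row concrete, here is a minimal FGSM-style step against a logistic model (NumPy only; a sketch under simplified assumptions, not a hardened implementation):

```python
import numpy as np

def fgsm_perturb(x, w, b, y_true, eps=0.1):
    """One-step FGSM: x' = x + eps * sign(grad_x loss).

    For p = sigmoid(w @ x + b) with binary cross-entropy loss,
    the input gradient is (p - y_true) * w.
    """
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))
    return x + eps * np.sign((p - y_true) * w)

# Illustration: the step pushes the logit away from the true label (y = 1)
w = np.array([1.0, -2.0]); b = 0.0
x = np.array([0.5, 0.2])                             # clean logit: 0.1 (class 1)
x_adv = fgsm_perturb(x, w, b, y_true=1.0, eps=0.3)   # adversarial logit: -0.8
```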

3. Empirical Evidence of Layer Interactions and Attributable Fingerprints

Recent attribution research demonstrates that adversarial examples often encode fingerprints revealing their generation process (Dotter et al., 2021). Specifically:

  • A classifier can distinguish between attack algorithms (e.g., FGSM, DeepFool, C&W) and between architectures (AlexNet, VGG16, ResNet50), with accuracies up to ≈0.99 on MNIST (norm-type attribution) and 0.78 on CIFAR-10 (attack-type attribution).
  • Some hyperparameter choices (e.g., $\epsilon$ in FGSM) leave separable traces, while others (DeepFool's overshoot parameter) do not, indicating that some layers/features leak more attack information than others.
  • This indicates "vertical" structure in the attack surface: even minimal perturbations ($\|\delta\|_p > 0$ but small) can encode information that traverses system layers, defeating attacker anonymity.
  • The implication is that multi-layered defense and forensics must jointly analyze raw data, feature perturbations, and model-internal states to attribute attack provenance.
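As a toy version of such forensic attribution, perturbations δ = x′ − x can be fingerprinted by simple norm features and matched to the nearest attack-family centroid (Dotter et al. train richer classifiers; the features and matching rule here are illustrative):

```python
import numpy as np

def norm_features(delta):
    """Fingerprint a perturbation by its (L1, L2, L_inf) norms."""
    d = np.ravel(delta)
    return np.array([np.abs(d).sum(), np.sqrt((d ** 2).sum()), np.abs(d).max()])

def fit_centroids(deltas_by_attack):
    """Mean feature vector per attack family."""
    return {name: np.mean([norm_features(d) for d in ds], axis=0)
            for name, ds in deltas_by_attack.items()}

def attribute(delta, centroids):
    """Assign a perturbation to the closest attack-family centroid."""
    feats = norm_features(delta)
    return min(centroids, key=lambda k: np.linalg.norm(feats - centroids[k]))
```

Dense sign-pattern perturbations (FGSM-like) and small minimal-norm ones (DeepFool-like) separate cleanly even under these crude features, consistent with the attribution results above.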

4. Domain-specific and Composite (Cascading) Attack Pathways

Multi-layered surfaces manifest in application domains through complex adversarial chains:

  • ML4VIS (Visualization Pipelines): Adversaries manipulate raw attributes (one-attribute or substitute-model attacks), which then propagate through DR (e.g., PUMAP), chart recommendation (MultiVision), and cause misleading visualization outputs. Cascading attacks that combine black-box input perturbation with white-box chart-gradient manipulation can flip recommended charts or mislead analysts. Multi-stage pipelines are especially vulnerable due to interdependencies (Fujiwara et al., 2024).
  • O-RAN and xApps: Adversarial manipulation of shared RAN telemetry databases via containerized xApps leads to erroneous interference classification and severe (30–50%) loss in network capacity and accuracy (Sapavath et al., 2023).
  • IoT Fingerprinting: Attacks span context (thermal manipulation), engineered features, and LSTM–CNN model weights. Iterative evasion attacks can exploit layerwise vulnerabilities even in temporal-input models (Sánchez et al., 2022).
  • Malware Detection: Feature-space evasion, problem-space attacks (code transplantation), and weight/parameter attacks (bit-flips or Trojans) have been demonstrated to fully subvert SVMs and robustified detectors (Cortellazzi et al., 2019, He et al., 23 Jan 2025, Rashid et al., 2023, Rashid et al., 2022).

Composite scenarios arise when attackers combine weak perturbations at multiple layers (e.g., code tokens + semantic features, or input attributes + visualization outputs) to maximize stealth and bypass single-stage defenses (Fujiwara et al., 2024, Awal et al., 2024).
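A toy linear example shows why such composites defeat single-stage thresholds: two per-layer perturbations, each below its own detector's limit, can jointly flip a decision that neither flips alone (all numbers illustrative):

```python
import numpy as np

def detector_passes(pert, threshold):
    """Single-stage anomaly check: pass if the L2 norm is under threshold."""
    return np.linalg.norm(pert) < threshold

w = np.array([1.0, 1.0])      # model weights; clean logit w @ x = 0.6 (class 1)
x = np.array([0.3, 0.3])

dx = np.array([-0.25, 0.0])   # input-layer nudge, under the input threshold (0.3)
dw = np.array([0.0, -1.2])    # weight-layer nudge, under the integrity threshold (1.3)
```

Here `w @ (x + dx)` and `(w + dw) @ x` both stay positive, but `(w + dw) @ (x + dx)` goes negative: the decision flips only when the two sub-threshold perturbations are combined.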

5. Implications for Defense, Forensics, and Evaluation

Effective defense and evaluation of ML against multi-layered adversarial threats require:

  • Multi-layered Detection and Response: Attribution classifiers should be trained across both input/feature and model/output spaces to maximize discriminative power (Dotter et al., 2021).
  • Strategic Ensembles and Moving Target Defenses: Dynamic selection and randomization across a diverse portfolio of models can raise attacker costs and reduce transferability, but are not foolproof—attackers may fingerprint model-drift schedules or probe for static behavior (Rashid et al., 2022, Rashid et al., 2023).
  • Adversarial Training Beyond Input Perturbations: Robustness must be instilled not only for $\ell_p$-bounded examples but also against problem-space and feature-space manipulations, as well as weight/bit-flip alterations (Cortellazzi et al., 2019, Sánchez et al., 2022).
  • Certified and System-Aware Defenses: Provable robustness (randomized smoothing, IBP) is necessary but remains limited to specific perturbation models—adversaries exploiting multiple layers simultaneously challenge these guarantees (Jha, 8 Feb 2025, Sánchez et al., 2022).
  • Vulnerability Disclosure and Forensic Monitoring: Public exposure of known system/process weaknesses, as well as continuous monitoring for distributional drift or abnormal feature correlations, are essential for operational resilience (Dotter et al., 2021, Fujiwara et al., 2024).
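The moving-target idea in the second bullet can be sketched as per-query random model selection (a toy scheme; production deployments typically also rotate, retrain, and monitor the pool):

```python
import random

class MovingTargetEnsemble:
    """Serve each query from a randomly chosen model in a diverse pool.

    Per-query randomization raises the cost of transfer attacks tuned to
    any single model, though attackers may still probe for regularities.
    """
    def __init__(self, models, seed=None):
        self.models = list(models)
        self._rng = random.Random(seed)

    def predict(self, x):
        model = self._rng.choice(self.models)  # fresh draw per query
        return model(x)

# Toy pool of three threshold "models" with slightly different boundaries
pool = MovingTargetEnsemble([lambda x: x > 0.0,
                             lambda x: x >= 0.0,
                             lambda x: x > 0.1], seed=42)
```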

6. Open Problems and Future Research Directions

The complexity and compositionality of multi-layered attack surfaces in ML systems drive several key research challenges:

  • Cross-domain and Cross-layer Generalization: Formalizing attack/defense interactions across input, feature, model, and system-integration layers; building threat models that account for composition effects (Wu et al., 2023, Pauling et al., 2022).
  • Chained and Hybrid Compression-induced Vulnerabilities: Studying how model compression (pruning, quantization, distillation) interacts with evasion and poisoning attacks across deployment layers, especially in IoT/embedded contexts (Westbrook et al., 2023).
  • Adaptive and Cascading Defense Strategies: Developing adaptive, multi-stage adversarial training and detection that responds to compositional and evolving attack patterns, possibly integrating differential attribution analysis and continual learning (Jha, 8 Feb 2025, Dotter et al., 2021).
  • Unified Certification and Evaluation Frameworks: Designing evaluation protocols and repositories (e.g., ARES-bench) that test multi-layered robustness, with real-world constraints and layered threat models (Dong et al., 2021).

A plausible implication is that advancing adversarial robustness in practice will require a shift from single-layer, static, and norm-bounded defenses to dynamic, multi-layered frameworks that structurally integrate attribution, diversity, and system-level awareness.

7. Summary

Multi-layered attack surfaces in adversarial ML are defined by the stratification and interactions of exploitable system components—from raw data to model weights to application-specific downstream processes. Recent research demonstrates that adversarial manipulation at any layer can leave detectable signatures across the stack, that composite and cascading attacks are practical and effective in complex systems, and that defense requires equally stratified, adaptive, and forensic-aware approaches. The field’s current open problems center on holistic modeling, principled evaluation, and robustification of the full ML life-cycle, including but not limited to inference, training, and deployment stages (Wu et al., 2023, Dotter et al., 2021, Sapavath et al., 2023, Fujiwara et al., 2024).
