Multi-Stage Attack Path Simulation

Updated 20 December 2025

Multi-stage attack path simulation is a computational methodology that models interconnected cyber attack sequences using formal attack graphs and probabilistic exploit modeling.
It leverages attacker-defender game theory and optimization strategies to evaluate risks and improve defensive resource allocation.
Simulation algorithms generate dynamic attack data streams that enhance machine-learning-based intrusion detection and risk assessment across diverse networks.

Multi-stage attack path simulation is a rigorous computational methodology for modeling, generating, and evaluating sequences of interconnected cyber compromise steps traversed by adversaries in complex systems or networks. It integrates formal attack-graph or attack-tree representations, probabilistic or logical exploit modeling, defender actions (sensor placement, hardening), and experimental synthesis of dynamic time-series attack data streams for use in machine-learning-based intrusion detection and risk assessment. Architectures supporting these simulations span power-grid testbeds, enterprise networks, LLM-enabled enterprise workflows, and more.

1. Formal Modeling of Multi-Stage Attack Paths

The core abstraction in multi-stage simulation is the attack tree (or attack graph) $T = (N, E, r)$, where:

$N$ is a set of compromise states or attack steps,
$E \subseteq N \times N$ encodes feasible transitions (edges) between steps,
$r \in N$ is the root (initial adversarial position).

Nodes may be OR-nodes (at least one child is sufficient for sub-goal satisfaction) or AND-nodes (all children must be compromised). A full multi-stage attack is a root-to-leaf path $p = \langle n_0=r, n_1,\dots,n_k\rangle$. The global attack success predicate is:

$\Phi(T) = \bigvee_{p \in P(T)} \bigwedge_{n \in p} \text{cond}(n)$

where $P(T)$ is the set of all such paths from $r$ to attack goals (Sen et al., 2023).
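As a minimal sketch of how $\Phi(T)$ could be evaluated, the Python below enumerates root-to-goal paths with a depth-first walk and checks whether any path has every node condition satisfied. The node names, the `cond` mapping (a per-node boolean standing in for $\text{cond}(n)$), and the four-node tree are hypothetical illustrations, not taken from the cited work. AND-node semantics are not modeled here, since the predicate above is a disjunction over paths.

```python
from typing import Dict, List, Set


def enumerate_paths(edges: Dict[str, List[str]], root: str, goals: Set[str]) -> List[List[str]]:
    """Return every simple path from root to a goal node (depth-first)."""
    paths: List[List[str]] = []

    def walk(node: str, path: List[str]) -> None:
        path = path + [node]
        if node in goals:
            paths.append(path)
            return
        for child in edges.get(node, []):
            if child not in path:  # skip nodes already on this path (cycle guard)
                walk(child, path)

    walk(root, [])
    return paths


def attack_succeeds(paths: List[List[str]], cond: Dict[str, bool]) -> bool:
    """Phi(T): true if at least one path has cond[n] true for all of its nodes."""
    return any(all(cond[n] for n in p) for p in paths)


# Hypothetical tree: r -> a -> g and r -> b -> g
edges = {"r": ["a", "b"], "a": ["g"], "b": ["g"]}
paths = enumerate_paths(edges, "r", {"g"})
cond = {"r": True, "a": False, "b": True, "g": True}
result = attack_succeeds(paths, cond)  # the path through b satisfies every condition
```

Exhaustive enumeration grows quickly with graph size, so a sketch like this is only practical for small trees.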

Extensions for physical, protocol, and industrial layers (as with MulVAL) add further predicate rules, for example for ARP spoofing, DNS cache poisoning, SYN-flood DoS, Bluetooth PIN cracking, and bus-link deception (Stan et al., 2019).

Attack graphs are also interpreted as Bayesian networks for risk propagation (BAM) (François-Xavier et al., 2016) and as finite automata in kill-chain scenario reconstruction (Wilkens et al., 2021).

2. Attacker–Defender Interaction: Optimization and Game-Theoretic Models

Simulation frameworks explicitly model both adversarial and defensive strategies:

Attacker’s strategy space $S_A = P(T)$ comprises all feasible multi-stage attack paths.
Defender’s strategy space $S_D = \{ H \subseteq N \mid \text{cost}(H) \leq B \}$ covers all sensor placements or hardening sets given budget $B$.

The attacker maximizes residual risk subject to reactive or preventive detection (sensor placement):

$u_A(s_A, s_D) = \text{Risk}(s_A) - \text{DetectionPenalty}(s_A, s_D)$

where, typically,

$\text{Risk}(s_A) = \sum_{j \in s_A} P_j \cdot C_j$

with $P_j$ as exploit probability and $C_j$ as outage cost. Detection penalties may be infinite (path abort) if any sensor fires (Sen et al., 2023). The attacker solves:

$\max_{s_A \in S_A} \min_{s_D \in S_D} u_A(s_A, s_D)$
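The max-min problem can be illustrated by brute force on a toy instance. The sketch below assumes the infinite detection penalty described above (any sensor on the path aborts it) and enumerates every budget-feasible sensor set. All node names, probabilities, costs, and the budget are hypothetical values chosen for illustration.

```python
import math
from itertools import combinations


def risk(path, P, C):
    """Risk(s_A): sum of P_j * C_j over the steps j on the path."""
    return sum(P[j] * C[j] for j in path)


def utility(path, sensors, P, C):
    """u_A = Risk - DetectionPenalty, with an infinite penalty if any step is monitored."""
    penalty = math.inf if any(j in sensors for j in path) else 0.0
    return risk(path, P, C) - penalty


def defender_strategies(nodes, cost, budget):
    """Yield every sensor set H with cost(H) <= budget (the set S_D)."""
    for k in range(len(nodes) + 1):
        for H in combinations(nodes, k):
            if sum(cost[n] for n in H) <= budget:
                yield frozenset(H)


def maximin(paths, nodes, cost, budget, P, C):
    """Return the attacker path maximizing its worst-case utility, and that utility."""
    S_D = list(defender_strategies(nodes, cost, budget))

    def worst_case(path):
        return min(utility(path, H, P, C) for H in S_D)

    best = max(paths, key=worst_case)
    return best, worst_case(best)


# Hypothetical instance: node b is too expensive to monitor within the budget.
nodes = ["a", "b", "c"]
paths = [("a", "c"), ("b",)]
P = {"a": 0.9, "b": 0.5, "c": 0.8}     # exploit probabilities
C = {"a": 10.0, "b": 4.0, "c": 10.0}   # outage costs
cost = {"a": 1, "b": 5, "c": 1}        # sensor costs
best, value = maximin(paths, nodes, cost, 1, P, C)
```

In this instance the higher-risk path through a and c can always be blocked by one affordable sensor, so the attacker's max-min choice is the lower-risk path through b. Enumerating $S_D$ is exponential in $|N|$, so this approach is only suitable for small examples.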

Defender optimization includes Stackelberg or centrality-guided allocations, e.g. using Current-Flow Betweenness Centrality for IDS deployment (Sen et al., 2024).

Distributionally robust path-planning frameworks generalize this to games where arc costs are subject to moment and probability constraints, with non-anticipativity maintained across adaptive decisions (Ketkov, 2022).

3. Simulation Algorithms, Execution, and Data Synthesis

Attack path simulation proceeds as an iterative, trial-based loop. A typical game turn includes the following steps (see Sen et al., 2023):

Defender chooses sensor set $s_D$ subject to budget $B$ using learning rates $Q_j$.
Attacker builds the round’s attack graph and computes edge weights $W_{i,j} = t_j \cdot C_j \cdot P_j$, where $t_j$ is the time-to-compromise of step $j$.
Dijkstra’s algorithm yields minimum-weight path $p^*$ .
The attack is executed probabilistically, aborting on sensor detection.
Both attacker and defender update their skill, knowledge, and sensor allocation heuristics.
All steps and alerts are logged (e.g. Unified2 format).
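The path-selection and execution steps of a game turn can be sketched as follows. This is an illustrative simplification under stated assumptions, not the cited implementation: the graph, the per-step values of $t_j$, $C_j$, and $P_j$, and the log tuple format are hypothetical, and the skill, knowledge, and sensor-heuristic updates are omitted.

```python
import heapq
import random


def edge_weight(step):
    """W_{i,j} = t_j * C_j * P_j, computed from the target step j."""
    return step["t"] * step["C"] * step["P"]


def min_weight_path(graph, steps, source, goal):
    """Dijkstra over non-negative weights; returns (total weight, path) or None."""
    queue = [(0.0, source, [source])]
    settled = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == goal:
            return dist, path
        if node in settled:
            continue
        settled.add(node)
        for nxt in graph.get(node, []):
            if nxt not in settled:
                heapq.heappush(queue, (dist + edge_weight(steps[nxt]), nxt, path + [nxt]))
    return None


def execute(path, steps, sensors, rng):
    """Attempt each step in order; stop on sensor detection or a failed exploit."""
    log = []
    for node in path[1:]:
        if node in sensors:
            log.append((node, "detected"))  # a firing sensor aborts the attack
            break
        if rng.random() >= steps[node]["P"]:
            log.append((node, "failed"))
            break
        log.append((node, "compromised"))
    return log


# Hypothetical round: two routes from r to g, one sensor on the goal.
graph = {"r": ["a", "b"], "a": ["g"], "b": ["g"]}
steps = {
    "a": {"t": 4.0, "C": 1.0, "P": 0.5},
    "b": {"t": 1.0, "C": 1.0, "P": 1.0},
    "g": {"t": 1.0, "C": 2.0, "P": 1.0},
}
weight, p_star = min_weight_path(graph, steps, "r", "g")
log = execute(p_star, steps, sensors={"g"}, rng=random.Random(0))
```

The steps on the chosen path use $P_j = 1$ here only so that the example run is deterministic; with smaller probabilities, the outcome depends on the random generator's seed.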

Modular co-simulation environments (e.g., container networks for smart grids) use schedulers to synchronize power system, OT device, and network emulations, chaining attack modules using fact-based DAG planners (MITRE Caldera) (Sen et al., 2024). Large network platforms (Insight) use syscall-level simulators with real pentesting frameworks, pivot agents, and probabilistic exploit outcomes (Futoransky et al., 2010, Sarraute et al., 2010).

In both statistical and ML-driven IDS data generation, feature vectors include protocol, IP, timestamp, priority, and sub-protocol fields, labeled as attack or benign.

4. Metrics and Experimental Evaluation

Attack path and detection performance are evaluated via:

Path complexity (average CVSS scores, path length),
Attack success rates,
Detection rates, false positives/negatives,
Time-to-compromise (TTC) as $TTC(A) = \sum_j t_j$ or $TTC(s,W) = t_1P_1 + t_2(1-P_1)(1-u)$,
Expected damage $E[\text{Damage}] = \sum_i P_i C_i$,
ML classification metrics: accuracy, recall, precision, $F_1$-score, AUC, Matthews correlation coefficient.
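Three of the metrics above are simple enough to compute directly. The sketch below implements the additive form $TTC(A) = \sum_j t_j$, the expected damage $\sum_i P_i C_i$, and the Matthews correlation coefficient from confusion-matrix counts using its standard definition. The function names and sample inputs are illustrative assumptions.

```python
import math


def time_to_compromise(times):
    """TTC(A) = sum of per-step times t_j along the attack path."""
    return sum(times)


def expected_damage(probs, costs):
    """E[Damage] = sum over steps of P_i * C_i."""
    return sum(p * c for p, c in zip(probs, costs))


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical inputs
ttc = time_to_compromise([2, 3, 5])
damage = expected_damage([0.5, 0.25], [4, 8])
perfect = mcc(tp=50, tn=50, fp=0, fn=0)
```

MCC ranges from -1 to 1 and remains informative when attack and benign classes are imbalanced, which is common in IDS datasets.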

Evaluations in Sen et al. (2023) show that increasing sensor coverage and defender budget lengthens attack paths, raises path complexity, and improves ML-based IDS metrics (e.g., XGB achieving MCC ≈ 0.94).

In (Sen et al., 2024), time-to-compromise and protocol distribution are reported for both physical and virtual testbeds, and the synthetic framework matches real impact curves within 5%.

5. Advanced Modeling Extensions: Bayesian, Markov, Distributional, and Automata Perspectives

Attack graphs are extended to Bayesian networks (BAM), enabling dynamic risk propagation, sensor fusion, and path enumeration over polytree-structured graphs with cycles resolved by path-label expansion and bounded depth (François-Xavier et al., 2016). Conditional probability tables handle exploit, residual (zero-day), and sensor alert likelihoods, and inference is performed via belief propagation.

Markov chain models describe state transitions for each compromise step, estimating evolving compromise probabilities as $\pi^{(n+1)} = \pi^{(n)} \cdot T$ (Futoransky et al., 2010).
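The update $\pi^{(n+1)} = \pi^{(n)} \cdot T$ is a row vector multiplied by a row-stochastic matrix. A minimal sketch follows, assuming a hypothetical three-state chain (foothold, pivot, goal) whose transition probabilities are chosen only for illustration.

```python
def step(pi, T):
    """One update pi^(n+1) = pi^(n) . T (row vector times row-stochastic matrix)."""
    n = len(pi)
    return [sum(pi[i] * T[i][j] for i in range(n)) for j in range(n)]


def propagate(pi0, T, steps):
    """Apply the update `steps` times starting from the distribution pi0."""
    pi = list(pi0)
    for _ in range(steps):
        pi = step(pi, T)
    return pi


# Hypothetical chain: state 0 = foothold, 1 = pivot, 2 = goal (absorbing).
T = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],
]
pi0 = [1.0, 0.0, 0.0]
pi2 = propagate(pi0, T, 2)  # probability of each state after two steps
```

Because the goal state is absorbing, its probability can only grow as the number of steps increases.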

Distributionally robust multi-stage shortest path (DRSPP) models introduce moment-constrained ambiguity sets for cost modeling, with mixed-integer programming solutions and explicit non-anticipativity constraints (Ketkov, 2022). Adaptive multi-stage decisions yield 5–15% cost savings over static policies in synthetic studies.

Kill Chain State Machine (KCSM) approaches model attacks as state machines over network zones, synthesizing scenario graphs from time-ordered alerts by mapping transitions and aggregating infection graphs, and reducing the alert volume presented for analyst triage by two to three orders of magnitude (Wilkens et al., 2021).

6. Applications, Data Generation, and ML-Driven Intrusion Detection

Multi-stage simulation is used in cyber-physical grid environments, enterprise networks, LLM-based enterprise document security (Balashov et al., 21 Jul 2025), and smart grid co-simulation frameworks (Sen et al., 2021). Synthetic attack datasets generated from these simulations enable scalable ML-based anomaly and intrusion detection: tree-based, SVM, and outlier/density-based classifiers; F1 scores up to 92% for RF, MCC ≈ 0.933 for RF/XGB (Sen et al., 2023, Sen et al., 2021).

Training with data generated in full game-theoretic, multi-stage interplay regimes yields more generalizable classifiers than single-path or random path scenarios. Anomaly detection, prompt sanitization, and context isolation methods can mitigate multi-stage prompt inference attacks in LLM-enabled systems (Balashov et al., 21 Jul 2025).

7. Limitations, Tuning Guidelines, and Implementation Considerations

Parametric exploit probabilities and costs should be tuned to empirical CVSS/historical data, with noise and timing profiles calibrated to real network traces or pentest outcomes.
Model extensions for distributional ambiguity, sensor error rates, and non-anticipativity support more realistic simulation of unknown defenses and adaptive adversarial strategies.
Scaling to thousands of hosts requires techniques such as lazy syscall evaluation, copy-on-write file systems, and poly-logarithmic memory usage for asset graphs (Sarraute et al., 2010).
Synthetic dataset diversity is maximized by varying sensor counts $|s_D|$, defender/attacker resource budgets, skill increments $\Delta \alpha$, and path re-selection triggers.
For Bayesian models, path enumeration is bounded by a step limit to control combinatorial explosion (François-Xavier et al., 2016).
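The step limit in the last guideline can be illustrated with a generic depth-limited search. The sketch below is not the cited enumeration algorithm: it is an ordinary iterative depth-first search that returns simple paths of at most `max_steps` edges, and the graph (including its cycle between `r` and `a`) is hypothetical.

```python
def bounded_paths(edges, source, targets, max_steps):
    """Enumerate simple paths of at most max_steps edges from source to any target."""
    found = []
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node in targets and len(path) > 1:
            found.append(path)
            continue
        if len(path) - 1 >= max_steps:
            continue  # step limit reached: do not extend this path further
        for nxt in edges.get(node, []):
            if nxt not in path:  # keep paths simple so cycles cannot loop forever
                stack.append((nxt, path + [nxt]))
    return found


# Hypothetical graph with a cycle r -> a -> r
edges = {"r": ["a", "g"], "a": ["b", "r"], "b": ["g"]}
short = bounded_paths(edges, "r", {"g"}, max_steps=1)
longer = bounded_paths(edges, "r", {"g"}, max_steps=3)
```

Raising `max_steps` admits longer attack paths at the cost of more enumeration work, which is the trade-off the step limit is meant to control.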

Empirical simulation results indicate that multi-stage attack path simulation frameworks that integrate attack graphs, game-theoretic defender models, and adaptive path planning produce data and decision support with sufficient fidelity to benchmark real-world ML/IDS systems, support detailed risk assessment, and optimize defensive resource allocation (Sen et al., 2023, Sen et al., 2024).