- The paper introduces a tool-mediated LLM architecture that formally guarantees closed-loop controllability and stability even under adversarial conditions.
- It employs deterministic primitives such as Bayesian belief updates and attack-graph computations to mediate LLM reasoning, ensuring predictable and bounded outcomes.
- Empirical validation across enterprise attack graphs demonstrates 100% monotonicity, ISS robustness, and significant reduction in defender uncertainty.
The paper "Stable Agentic Control: Tool-Mediated LLM Architecture for Autonomous Cyber Defense" (2605.03034) introduces a formal architecture for closed-loop, LLM-driven cyber defense under adversarial pressure with machine-checked stability guarantees. The central contribution is an agentic system where LLMs act as controllers for enterprise cybersecurity operations, mediated by deterministic tools that enforce catalog constraints on both defender and attacker action spaces. In this architecture, the LLM agent leverages (but does not directly execute) deterministic primitives—including Stackelberg best-response solvers, Bayesian belief-updating observers, and attack-graph computations—to select actions from finite, catalog-enforced sets.
This tool mediation fundamentally alters the agentic loop. The architecture explicitly separates stochastic LLM reasoning from plant transitions and game dynamics, allowing the system to bound non-determinism, control catalog exhaustiveness, and render the stability of outcomes a property of the system structure—independent of agent implementation details or LLM backbone.
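The separation described above can be illustrated with a minimal sketch. It is not the paper's implementation: the `Mediator` class, its `select` method, the retry count, and the action names are all hypothetical, chosen only to show how a deterministic gate can keep a stochastic proposer inside a finite catalog.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class Mediator:
    """Hypothetical gate between a stochastic proposer and the plant."""
    catalog: FrozenSet[str]  # finite set of permitted action identifiers
    fallback: str            # safe default; must itself be in the catalog

    def __post_init__(self) -> None:
        if self.fallback not in self.catalog:
            raise ValueError("fallback action must belong to the catalog")

    def select(self, propose: Callable[[], str], max_tries: int = 3) -> str:
        """Return a catalog action; never forward an off-catalog proposal."""
        for _ in range(max_tries):
            action = propose()
            if action in self.catalog:
                return action
        return self.fallback


# Stand-in for an LLM: the first proposal is off-catalog, the second is valid.
proposals = iter(["delete_domain_controller", "patch_host_7"])
mediator = Mediator(frozenset({"no_op", "patch_host_7", "isolate_host_3"}), "no_op")
chosen = mediator.select(lambda: next(proposals))
```

Whatever the proposer emits, the value returned by `select` is a member of the catalog, which is the structural property the guarantees rely on.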
At the theoretical core, the paper introduces a composite Lyapunov function V(k)=S(k)+λθ(k), where S(k) is the attacker's expected game value (network interdiction payoff reflecting maximum attacker benefit after defense) and θ(k) is the mean uncertainty over defender belief about attack graph edges. This Lyapunov structure provides the vehicle for formal verification of closed-loop controllability, ISS robustness against adaptive attacks, and observer convergence.
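As a rough illustration of the composite function, the sketch below evaluates V(k)=S(k)+λθ(k) along a short trajectory and checks that it decreases. The function names, the value λ=0.5, and every number are assumptions made for the example; none come from the paper.

```python
from statistics import fmean
from typing import Sequence


def lyapunov(game_value: float, edge_uncertainty: Sequence[float], lam: float) -> float:
    """V(k) = S(k) + lam * theta(k), with theta(k) the mean edge uncertainty."""
    theta = fmean(edge_uncertainty)
    return game_value + lam * theta


def is_contracting(values: Sequence[float]) -> bool:
    """True when V strictly decreases at every step of the trajectory."""
    return all(later < earlier for earlier, later in zip(values, values[1:]))


# Illustrative numbers only: S(k) and per-edge uncertainties at k = 0, 1, 2.
trajectory = [
    lyapunov(0.80, [0.9, 0.7, 0.8], lam=0.5),
    lyapunov(0.60, [0.6, 0.5, 0.7], lam=0.5),
    lyapunov(0.45, [0.3, 0.4, 0.2], lam=0.5),
]
contracting = is_contracting(trajectory)
```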
All proofs are machine-checked in Lean 4 with no "sorry" placeholders (that is, no proof obligations left open), establishing the following guarantees:
- Controllability: Under no adversarial disturbance, every defender policy deployment yields a monotone decrease in S(k) and contracts the Lyapunov function.
- Input-to-State Stability (ISS): Under best-responding, intelligent adversarial disturbance, V(k) remains ISS-bounded; defense actions and Bayesian updates compensate for any attacker graph expansion within finite catalogs, with margins directly depending on catalog parameters, budget, and observer contraction rate.
- Observability: The estimator contracts edge-level uncertainties geometrically as evidence accumulates, ensuring practical system identification even under asymmetric and partial observation scenarios.
- Generalization: These guarantees hold controller- and adversary-agnostically—any agentic controller and adversary confined to their respective catalogs inherit the same formal system-level bounds.
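To make the ISS claim concrete, the sketch below uses the generic textbook form of a discrete-time ISS-Lyapunov recursion, V(k+1) = ρV(k) + d(k) with 0 ≤ ρ < 1 and 0 ≤ d(k) ≤ d_max, which implies V(k) ≤ ρ^k V(0) + d_max/(1−ρ). This is an assumed stand-in: the paper's actual inequality, with margins tied to catalog parameters, budget, and observer contraction rate, is not reproduced here, and the symbols ρ and d_max are introduced only for this example.

```python
import random
from typing import List


def simulate_iss(v0: float, rho: float, d_max: float, steps: int, seed: int = 0) -> List[float]:
    """Iterate V(k+1) = rho * V(k) + d(k) with 0 <= d(k) <= d_max."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must lie in [0, 1) for contraction")
    rng = random.Random(seed)
    values = [v0]
    for _ in range(steps):
        disturbance = rng.uniform(0.0, d_max)  # bounded adversarial input
        values.append(rho * values[-1] + disturbance)
    return values


def iss_bound(v0: float, rho: float, d_max: float, k: int) -> float:
    """Closed-form envelope: rho**k * V(0) + d_max / (1 - rho)."""
    return rho ** k * v0 + d_max / (1.0 - rho)


values = simulate_iss(v0=5.0, rho=0.7, d_max=0.3, steps=50)
within_envelope = all(
    v <= iss_bound(5.0, 0.7, 0.3, k) + 1e-12 for k, v in enumerate(values)
)
```

The initial condition decays geometrically, while the bounded disturbance contributes at most d_max/(1−ρ), so the trajectory stays inside a fixed envelope regardless of how the disturbance is chosen within its bound.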
Empirical Validation on Real-World Cyber Graphs
Experimental results operationalize the formal framework across two axes:
1. Architectural Properties (Claims i–iii) on 282 Enterprise Attack Graphs
Evaluation on 282 industry-scale attack graphs, derived from real pentesting data spanning 161 enterprise organizations and 25 sectors, demonstrates:
2. Generality with LLM Controllers (Corollaries 1–2) on Paired Telemetry
On the GOAD Active Directory environment, the system was instantiated with Claude Sonnet 4 and Claude Haiku 4.5 LLMs as controllers:
Technical and Operational Implications
The work makes the following claims, supported by the formal proofs and the empirical evaluation:
- System-level rather than agent-level guarantees: All stability and observability properties are enforced by the closed-loop structure and actuator interface; they do not depend on the reasoning logic or backbone capability of the LLM. This shifts the unit of safety from component-level (agent) to architecture-level (system).
- Non-determinism is strictly contained: LLM stochasticity is leveraged for search and exploration but cannot violate catalog boundaries or destabilize outcomes; outcome variance observed in vanilla LLM agents is absent under this architecture.
- Adversarial pressure is informative: Paradoxically, adversarial expansion accelerates defender estimation convergence, functioning as an implicit informant by triggering observations otherwise inaccessible.
- Certification is dual-use: The same formal bounds apply to attackers as to defenders; a malevolent agent confined to the same actionable catalog cannot destabilize the system or evade the disturbance envelope.
- Reasoning depth and integration: Empirical evidence (Sonnet vs. Haiku) shows that architectural safety (no off-catalog action, ISS bounds) does not equate to optimality; poor integration of observer evidence (as in Haiku) can yield suboptimal system-level defense, motivating runtime monitoring of the belief-truth gap as a diagnostic.
- Scalability and reproducibility: All guarantees and results hold without training or adaptation—convergence occurs in a single analysis cycle, and the Lean 4 verification is reproducible.
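The belief-truth gap mentioned above could be tracked along the following lines. This is a hedged sketch, not the paper's observer: the sensor model (true-positive rate 0.9, false-positive rate 0.1), the gap metric (mean absolute difference), and the edge names are assumptions, and ground truth is taken as known, which holds only in an evaluation setting.

```python
from typing import Dict


def bayes_update(prior: float, observed: bool, tpr: float = 0.9, fpr: float = 0.1) -> float:
    """Posterior probability that an edge exists after one noisy detection."""
    like_exists = tpr if observed else 1.0 - tpr
    like_absent = fpr if observed else 1.0 - fpr
    numerator = like_exists * prior
    return numerator / (numerator + like_absent * (1.0 - prior))


def belief_truth_gap(belief: Dict[str, float], truth: Dict[str, bool]) -> float:
    """Mean absolute difference between belief and ground-truth edge indicators."""
    return sum(abs(belief[e] - float(truth[e])) for e in truth) / len(truth)


# Hypothetical two-edge graph with an uninformative prior on each edge.
truth = {"a->b": True, "b->c": False}
belief = {edge: 0.5 for edge in truth}
gaps = [belief_truth_gap(belief, truth)]
for _ in range(3):
    # Assumption: each observation happens to agree with the ground truth.
    belief = {e: bayes_update(p, truth[e]) for e, p in belief.items()}
    gaps.append(belief_truth_gap(belief, truth))
```

Under these assumptions the gap shrinks with each consistent observation. A controller that ignores the updated belief would not benefit from that shrinkage, which is the kind of integration failure the diagnostic is meant to surface.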
Future Directions
The authors identify several extension points of significant research interest:
- Relaxing action monotonicity and rollback: Current (A4) persistence assumptions can be weakened to admit more dynamic catalogs or reversible actions, broadening applicability to settings with rolling policy windows.
- Dynamic catalog expansion and open-world action sets: Extending the formal ISS and controllability bounds to allow for incremental catalog augmentation under systematic validation procedures.
- Generalization domains: The formalism applies wherever agentic systems operate under adversarial pressure and catalog-bounded actions (e.g., financial compliance, physical security, safety-critical robotics).
Conclusion
The paper presents an architectural and verification framework for LLM-mediated cyber defense that places safety and robustness guarantees at the system level rather than the agent level. The Lyapunov-based, Lean 4-checked approach ensures that closed-loop controllability, ISS robustness, and observability are provably maintained even when agents are non-deterministic, adversarial, or of variable reasoning capability. The empirical results on real-world attack graphs support the architecture’s use in operational environments that require auditable guarantees. The authors argue that the approach generalizes to any setting with finite-catalog, tool-mediated agentic control under adversarial disturbance.