Automated Cyber Defense (ACD)

Updated 12 January 2026
  • ACD is a system of autonomous agents that continuously monitor, detect, assess, and respond to cyber threats using advanced algorithms and decision-making frameworks.
  • Reinforcement learning, graph-based modeling, and multi-agent coordination enable rapid threat mitigation, adaptive response, and scalable defense across dynamic network environments.
  • ACD frameworks balance security with operational continuity by integrating multi-objective optimization, risk assessment, and human-in-the-loop controls for resilient cyber defense.

Automated Cyber Defense (ACD) refers to systems of autonomous agents that continuously monitor, detect, assess, and respond to cyber threats in networked environments with minimal or no human involvement. These agents employ advanced algorithms—including reinforcement learning, graph-based reasoning, and multi-agent coordination—to enable real-time adversarial engagement, rapid mitigation, and adaptive response in the face of sophisticated and evolving attack techniques. ACD is foundational both for enterprise security and operational resilience in military, critical infrastructure, and large-scale cloud/IoT environments, where human-driven defense is infeasible due to the speed, scale, and complexity of cyber threats (Vyas et al., 2023).

1. Formal Foundations and Problem Structure

Automated Cyber Defense is conventionally formalized as a sequential decision-making process—most frequently as a (partially observable) Markov Decision Process (MDP or POMDP) or as a Markov Game for competitive, adversarial settings.

The canonical ACD MDP is defined as:

$$\mathcal{M} = (\mathcal{S}, \mathcal{A}, P, R, \gamma)$$

  • $\mathcal{S}$: latent network states encoding host compromise, service status, topology, and process logs.
  • $\mathcal{A}$: discrete or multi-discrete defender actions—e.g., isolate host, deploy decoy, restore system, modify access control, patch vulnerabilities.
  • $P(s'|s,a)$: transition kernel reflecting network/service dynamics as well as adversarial evolution.
  • $R(s,a)$: reward function balancing goals such as attack containment, service availability, and cost minimization.
  • $\gamma$: discount factor ($0<\gamma\leq 1$).

In adversarial settings, ACD is modeled as a zero-sum or general-sum Markov Game $G = (N, S, A, P, O, R, \gamma)$, with policies $\pi_{D}(a_D|o_D)$ (defender) and $\pi_{A}(a_A|o_A)$ (attacker), and transition kernel $P(s'|s,a_D,a_A)$ (Palmer et al., 31 Jan 2025, Dhir et al., 2021). Central challenges arise from enormous state and action spaces, adversarial adaptation, and partial/uncertain observability (Palmer et al., 2023).
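As a concrete, highly simplified illustration of this formalization, the sketch below encodes the defender's side of such a game as a gym-style environment with noisy observations and a scripted attacker standing in for $\pi_A$. All class, action, and parameter names are hypothetical placeholders, not an interface from any cited framework.

```python
# Minimal sketch of a defender-side ACD environment over a toy network of hosts
# with boolean "compromised" flags; names and probabilities are illustrative.
import random
from dataclasses import dataclass, field

ACTIONS = ["monitor", "isolate_host", "restore_host", "deploy_decoy"]

@dataclass
class AcdDefenderEnv:
    num_hosts: int = 5
    gamma: float = 0.99
    state: list = field(default_factory=list)   # True = host compromised

    def reset(self):
        self.state = [False] * self.num_hosts
        self.state[random.randrange(self.num_hosts)] = True   # initial foothold
        return self._observe()

    def step(self, action, target):
        # Defender action a_D: isolation/restoration clears the targeted host
        if ACTIONS[action] in ("isolate_host", "restore_host"):
            self.state[target] = False
        # Attacker action a_A: scripted lateral movement stands in for pi_A
        if any(self.state) and random.random() < 0.3:
            self.state[random.randrange(self.num_hosts)] = True
        # Reward R(s, a): penalise compromise and disruptive defender actions
        reward = -sum(self.state) - 0.1 * (ACTIONS[action] != "monitor")
        return self._observe(), reward, False, {}

    def _observe(self):
        # Partial observability: the defender sees a noisy view o_D of the flags
        return [c if random.random() > 0.1 else not c for c in self.state]
```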

2. Core Methods: Reinforcement Learning, Multi-Objective, and Multi-Agent Approaches

Single-Agent RL:

Modern ACD leverages Deep Reinforcement Learning (DRL)—notably Proximal Policy Optimization (PPO), Q-learning/DQN, and actor-critic variants—where agents estimate policies $\pi_\theta(a|s)$ or $\pi_\theta(a|o)$ via neural approximators (Nyberg et al., 2023, King et al., 19 Sep 2025). Reward functions typically penalize both attacker success (e.g., system compromise) and unnecessary disruption to services or users.
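A minimal sketch of the clipped PPO surrogate used to update $\pi_\theta(a|o)$ is shown below; the batch tensors are placeholders standing in for a rollout buffer of defender trajectories, not data from any cited environment.

```python
# Sketch of the PPO clipped surrogate loss; tensors stand in for a rollout batch.
import torch

def ppo_clip_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    """Clipped policy-gradient objective (negated so it can be minimised)."""
    ratio = torch.exp(logp_new - logp_old)          # pi_theta(a|o) / pi_theta_old(a|o)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()

# Toy batch of log-probabilities and advantages
logp_new = torch.randn(64, requires_grad=True)
logp_old = logp_new.detach() + 0.05 * torch.randn(64)
advantages = torch.randn(64)
loss = ppo_clip_loss(logp_new, logp_old, advantages)
loss.backward()
```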

Graph-based Inductive Bias:

Representing the network as an attributed graph $G = (V, E, X)$ and employing Graph Neural Networks (GNNs) for state encoding confers permutation invariance and enables zero-shot transfer to unseen topologies. Actions are modeled as graph edits—parameterized by nodes/edges—facilitating scalable policies across varying network sizes (King et al., 19 Sep 2025).
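The sketch below, assuming PyTorch Geometric, shows what such a permutation-invariant encoder might look like; the layer sizes and node-feature dimensions are illustrative.

```python
# Sketch of a permutation-invariant GNN encoder for the network state G = (V, E, X);
# node features might encode host status, services, and recent alerts.
import torch
from torch_geometric.nn import GCNConv, global_mean_pool

class GraphStateEncoder(torch.nn.Module):
    def __init__(self, in_dim, hidden_dim=64):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, hidden_dim)

    def forward(self, x, edge_index, batch):
        h = self.conv1(x, edge_index).relu()
        h = self.conv2(h, edge_index).relu()
        # Per-node embeddings parameterise node-level actions (e.g. isolate host v);
        # the pooled embedding summarises the whole network for value estimation.
        return h, global_mean_pool(h, batch)

x = torch.randn(4, 8)                                  # 4 hosts, 8 features each
edge_index = torch.tensor([[0, 1, 2], [1, 2, 3]])      # directed links
batch = torch.zeros(4, dtype=torch.long)               # single graph in the batch
node_emb, graph_emb = GraphStateEncoder(in_dim=8)(x, edge_index, batch)
```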

Planning and Search:

Tree-search frameworks using Monte Carlo Tree Search (MCTS) guided by learned GNN priors (e.g., ACDZero) explicitly address long-horizon reasoning and state-exploration in large-scale or highly dynamic environments. Policy distillation from MCTS rollouts enables fast, reactive actors for deployment (Li et al., 5 Jan 2026).
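A minimal sketch of the PUCT-style selection rule that such tree searches typically use is given below; the node statistics and priors would come from a learned (e.g., GNN) policy/value network, and all names here are illustrative.

```python
# Sketch of PUCT-style child selection for MCTS guided by learned priors.
import math

def select_child(children, c_puct=1.5):
    """children: list of dicts with prior p, visit count n, and total value w."""
    total_visits = sum(ch["n"] for ch in children)

    def puct(ch):
        q = ch["w"] / ch["n"] if ch["n"] > 0 else 0.0            # mean value
        u = c_puct * ch["p"] * math.sqrt(total_visits + 1) / (1 + ch["n"])
        return q + u                                             # exploit + explore

    return max(children, key=puct)

# Example: three candidate defender actions with priors from a policy network
children = [{"p": 0.5, "n": 3, "w": 1.2},
            {"p": 0.3, "n": 1, "w": 0.4},
            {"p": 0.2, "n": 0, "w": 0.0}]
best = select_child(children)
```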

Multi-Objective RL (MORL):

ACD is fundamentally multi-objective—balancing security, resilience, service continuity, and operational cost. Multi-objective formulations employ vector-valued rewards $\mathbf{r}(s,a)$, where agents are trained via:

  • Scalarization (for single-policy MORL, e.g., MOPPO), sweeping user-set weights $w$ over reward components to trace Pareto fronts (see the sketch after this list).
  • Preference-conditioned networks (PCN, multi-policy MORL), where policies take a target return vector as input and aim to realize arbitrary preference points at inference (O'Driscoll et al., 2024). Empirically, MORL yields more flexible and context-sensitive ACD agents than single-objective RL, allowing adaptation to shifting mission requirements.
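A minimal sketch of the scalarization approach, assuming a three-component reward vector (security, availability, negative cost), is shown below; the objective names and weight vectors are illustrative.

```python
# Sketch of linear scalarisation over a vector-valued reward r(s, a); sweeping
# the weight vector w traces an approximate Pareto front of defender policies.
import numpy as np

def scalarise(reward_vec, w):
    """reward_vec: [security, availability, -cost]; w: non-negative weights summing to 1."""
    return float(np.dot(w, reward_vec))

reward_vec = np.array([0.8, 0.6, -0.2])
for w in [np.array([1.0, 0.0, 0.0]),        # security above all
          np.array([0.5, 0.4, 0.1]),        # balanced mission profile
          np.array([0.2, 0.7, 0.1])]:       # availability-first
    print(w, scalarise(reward_vec, w))
```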

Multi-Agent RL:

Realistic cyber defense involves distributed agents operating under partial observability. Multi-Agent Deep RL frameworks (e.g., MAPPO, MAAC) employ centralized training with decentralized execution (CTDE): joint critics see global state during training; each policy acts locally at deployment (Wang et al., 2024, Landolt et al., 26 May 2025). Agent coordination can be explicit (parameter sharing, action masking) or implicit (shared rewards, communication protocols).
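A minimal CTDE sketch in PyTorch is shown below: each actor conditions only on its local observation, while a joint critic consumes the concatenated observations during training. Dimensions, agent counts, and layer sizes are illustrative.

```python
# Sketch of centralized training with decentralized execution (CTDE).
import torch
import torch.nn as nn
from torch.distributions import Categorical

class LocalActor(nn.Module):
    def __init__(self, obs_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))

    def forward(self, local_obs):               # used at deployment, local view only
        return Categorical(logits=self.net(local_obs))

class CentralCritic(nn.Module):
    def __init__(self, obs_dim, n_agents):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim * n_agents, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, all_obs):                 # used only during training, global view
        return self.net(torch.cat(all_obs, dim=-1))

actors = [LocalActor(obs_dim=16, n_actions=5) for _ in range(3)]
critic = CentralCritic(obs_dim=16, n_agents=3)
obs = [torch.randn(1, 16) for _ in range(3)]
actions = [actor(o).sample() for actor, o in zip(actors, obs)]
joint_value = critic(obs)
```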

| RL Paradigm | Key Strengths | Current Limitations |
|---|---|---|
| Single-Agent PPO/GNN | High sample efficiency, generalization via graphs | Scalability to very large networks |
| MCTS/GNN Hybrid | Multi-step planning, reduced overfitting | High computational cost |
| Multi-Agent (CTDE) | Coordination, partial observations | Linear critic scaling, sample inefficiency |
| Multi-Objective (MOPPO) | Tunable Pareto trade-offs, adaptive inference | Exponential scaling in objectives |
| LLM-based Agents | Explainable decision traces, prompt-based adaptability | Inferior raw performance, slow inference |

3. Environment Modeling, Attack Graphs, and Evaluation

Attack Graphs:

Network structure, attacker-progress, and defensive operations are commonly encoded as attack-defense graphs, often instantiating formalisms such as Meta Attack Language (MAL) (Nyberg et al., 2023). AND/OR nodes, edges for dependencies and defense intervention, and time-to-compromise distributions enable simulation of realistic multi-stage campaigns.
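As a rough illustration (not the MAL specification itself), the sketch below builds a small AND/OR attack graph in networkx with per-edge time-to-compromise parameters and samples the total compromise time along an attack path.

```python
# Sketch of an AND/OR attack graph with per-edge time-to-compromise sampling;
# node names, types, and the exponential TTC model are illustrative choices.
import random
import networkx as nx

g = nx.DiGraph()
g.add_node("phish_user", type="OR")
g.add_node("workstation_access", type="OR")
g.add_node("domain_admin", type="AND")        # requires all predecessor steps
g.add_edge("phish_user", "workstation_access", ttc_mean=2.0)
g.add_edge("workstation_access", "domain_admin", ttc_mean=5.0)

def sample_ttc(graph, path):
    """Sample total time-to-compromise along an attack path (exponential TTC per edge)."""
    return sum(random.expovariate(1.0 / graph.edges[u, v]["ttc_mean"])
               for u, v in zip(path, path[1:]))

path = ["phish_user", "workstation_access", "domain_admin"]
print(sample_ttc(g, path))
```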

Evaluation Platforms:

Environments (often called "cyber gyms") such as CybORG, CyberBattleSim, and NASimEmu provide discrete-event or hybrid simulation of host/service networks, support multi-agent RL, and facilitate benchmarking across scenarios, attacker types, and dynamic topologies (Vyas et al., 2023, Landolt et al., 26 May 2025).

Metrics:

  • Cumulative episodic reward
  • Mean time to detect/contain/restore (MTTD/MTTC/MTTR); see the computation sketch after this list
  • Percentage of critical assets compromised
  • Service downtime/availability
  • Robustness to adversarial adaptation (exploitability)
  • Zero-shot transfer/generalization to new topologies
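A minimal sketch of computing the time-based metrics from per-episode event logs is shown below; the event names and log format are assumptions, not a format defined by the cited platforms.

```python
# Sketch of computing MTTD / MTTC / MTTR from per-episode (event, timestep) logs.
def mean_time(episodes, start_event, end_event):
    deltas = []
    for log in episodes:
        times = {event: t for event, t in log}
        if start_event in times and end_event in times:
            deltas.append(times[end_event] - times[start_event])
    return sum(deltas) / len(deltas) if deltas else float("nan")

episodes = [
    [("compromise", 3), ("detected", 7), ("contained", 9), ("restored", 15)],
    [("compromise", 1), ("detected", 2), ("contained", 6), ("restored", 10)],
]
mttd = mean_time(episodes, "compromise", "detected")
mttc = mean_time(episodes, "detected", "contained")
mttr = mean_time(episodes, "contained", "restored")
```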

Empirical results indicate that RL-based ACD agents outperform heuristic baselines under sensor noise and heterogeneous attacker strategies and, when equipped with relational inductive biases (GNNs), exhibit robust generalization to previously unseen network graphs (Nyberg et al., 2023, King et al., 19 Sep 2025, Li et al., 5 Jan 2026).

4. Real-World Challenges: Risk, Trust, Constraints, and Human Oversight

Risk of Unintended Harm:

Autonomous agents can induce negative externalities—functional, safety, security, ethical, or moral—if defensive actions disrupt mission services or result in collateral damage. Quantitative risk assessment therefore becomes integral:

$$R(a) = P\bigl\{\text{externality}\mid a\bigr\}\cdot C(\text{externality})$$

Policies are evaluated under operational distributions:

$$R(\pi) = \mathbb{E}_{s\sim d^\pi,\,a\sim\pi(\cdot|s)}[c(s,a)]$$

where $c(s,a)$ estimates context-dependent externality cost (Ligo et al., 2022).
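A minimal Monte Carlo sketch of estimating this expectation from operational rollouts is shown below; the rollout format and externality-cost model are stand-ins the defender would supply.

```python
# Sketch of a Monte Carlo estimate of R(pi) = E[c(s, a)] from (state, action)
# pairs logged while running the policy; state fields and cost logic are illustrative.
def estimate_policy_risk(rollouts, cost_fn):
    """rollouts: list of (state, action) pairs sampled under the deployed policy."""
    costs = [cost_fn(s, a) for s, a in rollouts]
    return sum(costs) / len(costs)

# Toy example: actions are (name, target); the cost model flags disruptive actions
rollouts = [({"critical": True}, ("isolate", 3)),
            ({"critical": False}, ("monitor", 0))]
risk = estimate_policy_risk(
    rollouts,
    lambda s, a: 1.0 if s["critical"] and a[0] == "isolate" else 0.0,
)
```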

Safety and Trust:

Best practices enforce hard constraints at run-time (e.g., "never disconnect hospital subnets during patient care"), hierarchical feedback loops (periodic human review of action logs), and overall trust indices quantifying the fraction of compliant actions (Ligo et al., 2022).
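A minimal sketch of such a run-time guard is shown below; the constraint predicates, asset tags, and escalation behavior are illustrative, not prescriptions from the cited work.

```python
# Sketch of a run-time guard that checks hard constraints before a defender action
# is executed; violations are escalated to a human operator instead of acted on.
HARD_CONSTRAINTS = [
    # Never disconnect hospital subnets while patient care is active
    lambda action, ctx: not (action["type"] == "isolate_subnet"
                             and ctx["subnet_role"] == "hospital"
                             and ctx["patient_care_active"]),
]

def is_permitted(action, ctx):
    """Return True only if every hard constraint holds for this action in context."""
    return all(check(action, ctx) for check in HARD_CONSTRAINTS)

action = {"type": "isolate_subnet", "target": "ward-3"}
ctx = {"subnet_role": "hospital", "patient_care_active": True}
if not is_permitted(action, ctx):
    print("Blocked: escalate to human operator for approval")
```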

Human-in-the-Loop Integration:

Deployed ACD agents typically begin in advisory mode, with humans retaining override and approval roles, until confidence metrics permit increasing autonomy. Explainability (especially in critical operations) is a noted gap for deep RL and a current area of development for LLM-based agents, which can generate natural-language justifications but have not yet matched RL agents in reliability or speed (Castro et al., 7 May 2025).

5. Generalization, Robustness, and Transfer Learning

Zero-Shot Transfer and Topology Adaptation:

Traditional "vectorized" RL agents overfit to fixed network layouts. Graph-based policies—especially those employing permutation-invariant GNNs and optimal-transport–regularized latent spaces—support zero-shot adaptation: a single trained policy can defend new topologies, node insertions, or reindexed graphs with minimal (or no) retraining (Ramamurthy et al., 28 Jun 2025, King et al., 19 Sep 2025, Li et al., 5 Jan 2026).

Multi-Adversary and Type-Agnostic Learning:

Effective ACD agents must confront diverse attacker TTPs. Bayesian game-theoretic frameworks model defenders maintaining priors/beliefs over attacker types, updating online via Bayesian inference, and employing multi-type self-play to avoid overfitting. Policy architectures such as HiPPO demonstrate scalability for robust performance against compound threat models (Galinkin et al., 2024).
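A minimal sketch of the online belief update over attacker types is shown below; the attacker types and observation likelihoods are illustrative placeholders.

```python
# Sketch of maintaining a posterior over attacker types via Bayes' rule.
def update_belief(prior, likelihoods, observation):
    """prior: {type: P(type)}; likelihoods: {type: {obs: P(obs | type)}}."""
    unnorm = {t: prior[t] * likelihoods[t].get(observation, 1e-6) for t in prior}
    z = sum(unnorm.values())
    return {t: p / z for t, p in unnorm.items()}

prior = {"ransomware": 0.5, "apt": 0.3, "insider": 0.2}
likelihoods = {
    "ransomware": {"mass_encryption_alert": 0.7, "slow_lateral_movement": 0.1},
    "apt":        {"mass_encryption_alert": 0.05, "slow_lateral_movement": 0.6},
    "insider":    {"mass_encryption_alert": 0.05, "slow_lateral_movement": 0.3},
}
posterior = update_belief(prior, likelihoods, "slow_lateral_movement")
```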

Robustness:

Robust RL and adversarial training—formulated as minimax optimization across policy and environment/adversary distributions—yield agents capable of sustaining performance under non-stationary threat models or synthetic disturbance injection (Palmer et al., 2023, Palmer et al., 31 Jan 2025).

Continual and Transfer Learning:

Current research is directed towards meta-RL, transfer paradigms, and sim-to-real domain adaptation to ensure deployed ACD agents adapt to emergent vulnerabilities, live operational data, and shifting mission constraints (Vyas et al., 2023).

6. Architectures, Life-Cycle Integration, and Future Directions

Modular and Multi-Agent Architectures:

Practices are evolving from monolithic ACD agents to modular, multi-agent constellations aligned to the cyber defense life cycle (e.g., specialized agents for detection, response, and recovery). This compositional approach enables tractable training, generalization, and plug-and-play integration with human-driven Security Operations Centers (Oesch et al., 2024, Kott, 2018).

Swarm/Distributed Models:

Military and critical-infrastructure deployments focus on Autonomous Intelligent Cyber-Defense Agents (AICAs) capable of self-healing, local learning, stealth, and peer-to-peer negotiation (e.g., in contested or intermittently connected environments). These architectures stress modular layering, goal-driven planning, stealth/self-assurance, and secure communications (Théron et al., 2019).

Key Challenges and Research Frontiers:

  • Certifiable safety and assurance for learning-enabled policies and self-modifying agents.
  • Massive scalability in state/action space and the number of agents.
  • Integration of formal verification and explainable techniques.
  • Human-agent teaming and adaptive autonomy levels.
  • Defense against adversarially crafted attacks on both observations and models.
  • Legal, ethical, and operational frameworks for autonomous action and negative externality management.

Table: Major ACD Architectural Themes

| Theme | Core Components/Features |
|---|---|
| Sequential RL w/ GNN | MDP/POMDP, attributed graphs, permutation invariance |
| Multi-Agent Coordination | CTDE, decentralized local policies, distributed negotiation |
| Risk-Constrained Design | Real-time constraint checks, human-in-the-loop feedback |
| Modular/Life-Cycle Agents | Specialized roles aligned to defense stages |
| Game-Theoretic Robustness | Double Oracle, min-max RL, Bayesian attacker modeling |
| Hybrid Search+RL | MCTS/tree search, policy distillation |
| LLM-based Reasoning | Prompt-based explainability, hybrid RL-LLM teams |

ACD remains an area of rapid methodological advancement, characterized by a confluence of reinforcement learning, graph-theoretic modeling, multi-objective and multi-agent reasoning, formal risk-constrained optimization, and principled architectures for modular, resilient, and explainable defense systems (Li et al., 5 Jan 2026, King et al., 19 Sep 2025, O'Driscoll et al., 2024, Castro et al., 7 May 2025, Ramamurthy et al., 28 Jun 2025). Open technical and governance challenges motivate continued research across AI theory, systems engineering, and applied cyber operations.
