Agentic Military AI Governance Framework

Updated 2 July 2026

AMAGF is a comprehensive framework that enforces human control over autonomous military AI using layered technical controls and measurable metrics.
It utilizes design-time constraints, runtime mediation, and assurance feedback to align with international defense norms and safety standards.
The framework employs a Control Quality Score (CQS) to continuously monitor agent performance and manage risks related to autonomy and accountability.

Agentic Military AI Governance Framework (AMAGF) denotes an integrated, multi-level architecture for maintaining measurable, enforceable, and responsible human control over military agentic AI systems, meaning those capable of autonomous planning, tool use, coordination, and real-time adaptation. Unlike legacy automation or single-turn AI, agentic military AI introduces distinctive governance problems: interpretive divergence, loss of corrigibility, epistemic drift, irreversibility, and cascading multi-agent failures. AMAGF addresses these through layered technical controls, continuous monitoring, quantitative measurement of human control, compositional authorization overlays, and institutional assignment of accountability. The sections below give a technical account of the current state of the art in AMAGF design, metrics, operationalization, and evaluation.

1. Foundational Governance Norms and Principles

AMAGF operationalizes established international norms, defense regulations, and safety standards by mapping them directly to executable controls and evidence mechanisms relevant for AI agents. Key source regimes and their primary governance demands include:

Laws of Armed Conflict (LOAC): Mandates distinction, proportionality, military necessity, and humanity in actions; e.g., combatant/civilian discrimination and bans on excessive harm.
Defense AI Ethics Guidelines (DoD, NATO): Accountability, traceability, operational reliability, human governability, non-bias.
UN and National Instruments: UN CCW processes and Article 36 weapons reviews (Additional Protocol I to the Geneva Conventions); strict “human-in-the-loop” requirements for lethal force across NATO and national strategies.
AI Safety Standards: ISO/IEC 42001 (AI management systems), ISO/IEC 23894 (AI risk management), NIST AI RMF 1.0, and NIST SP 800-160 Vol. 2 for cyber-resilience.

These regimes serve as input for AMAGF’s layered control translation, connecting high-level principles to mechanism- and context-specific enforcement (Koch, 6 Apr 2026, Davidovic, 7 Apr 2026, Sahoo, 3 Mar 2026).

2. Layered Technical Architecture and Controls

AMAGF employs a modular control stack of four layers, each with a distinct role, governance assignment, mechanism class, and feedback loop:

Governance Objectives: Formalizes normative intent, ownership, decision thresholds, and exceptions—e.g., “all lethal engagement above X% civilian-risk requires human approval.”
Design-Time Constraints: Restricts agent privileges (least-privilege principle), model scopes, exposure to tools, dataset selection, and sandboxing for proactive robustness.
Runtime Mediation (Guardrails): Action-time guardrails intervene before or during agent actions according to crisp, observable rules, e.g., denying “engage” if $P(\text{combatant})<0.95$ or if civilian presence is flagged in context.
Assurance Feedback: Structured logs, immutable audit trails, incident metrics, and compliance dashboards support post-facto review, threshold tuning, and model versioning.

The following table illustrates the flow of LOAC-Distinction and Human Oversight requirements through these layers in a reconnaissance agent scenario (Koch, 6 Apr 2026):

Norm/Objectives	Design-Time	Runtime Mediation	Assurance Feedback
Combatant/civilian	Train on labeled	Deny “engage” if confidence < 0.95	Log, decision trace
Human-in-loop	Approval API stubs	Pause for officer, time-bounded	Store signed approvals

Each atomic control is specified as a tuple $\kappa = \langle a, x, r, \phi, \delta, \epsilon, o \rangle$, where $a$ is the actor, $x$ the action, $r$ the protected resource, $\phi$ the activation condition, $\delta$ the enforced decision, $\epsilon$ the evidence artifact, and $o$ the accountable owner (Koch, 6 Apr 2026).
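The control tuple and the reconnaissance example from the table can be sketched in Python as follows. This is a minimal illustration, not the framework's reference implementation: the class name `Control`, the function `mediate`, the context keys `p_combatant` and `civilian_presence`, and the default-allow behavior when no control fires are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Any, Callable, Mapping


@dataclass(frozen=True)
class Control:
    """One atomic control <a, x, r, phi, delta, epsilon, o> (field names are illustrative)."""
    actor: str                                       # a
    action: str                                      # x
    resource: str                                    # r
    condition: Callable[[Mapping[str, Any]], bool]   # phi: activation condition
    decision: str                                    # delta: e.g. "deny"
    evidence: str                                    # epsilon: evidence artifact to emit
    owner: str                                       # o: accountable owner


def mediate(controls, actor, action, resource, context):
    """Evaluate matching controls; return (decision, evidence records).

    Assumption: if no control fires the request is allowed, and any
    firing "deny" control wins over other decisions.
    """
    decision, records = "allow", []
    for c in controls:
        if (c.actor, c.action, c.resource) != (actor, action, resource):
            continue
        if c.condition(context):
            records.append({"evidence": c.evidence, "owner": c.owner,
                            "decision": c.decision})
            if c.decision == "deny" or decision == "allow":
                decision = c.decision
    return decision, records


# Distinction rule from the text: deny "engage" below 0.95 confidence or when
# civilian presence is flagged. Missing context values fail closed (assumption).
distinction = Control(
    actor="recon_agent", action="engage", resource="target",
    condition=lambda ctx: ctx.get("p_combatant", 0.0) < 0.95
    or ctx.get("civilian_presence", True),
    decision="deny", evidence="decision_trace", owner="mission_commander",
)
```

Calling `mediate([distinction], "recon_agent", "engage", "target", {"p_combatant": 0.9, "civilian_presence": False})` returns a `"deny"` decision together with one evidence record naming the accountable owner.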

3. Formal Metrics and Runtime Enforcement

Operationalizing meaningful human control in the agentic context requires real-time, quantitative tracking. AMAGF’s technical innovation is the Control Quality Score (CQS), a continuous metric synthesizing six agentic control subdimensions: interpretive alignment, correction impact, epistemic alignment, irreversibility consumption, synchronization freshness, and swarm coherence. It is defined as $\text{CQS}(t) = \min_{k=1}^{6} n_k(t)$. Because the score takes the minimum, a single degraded subdimension lowers the overall CQS regardless of the other five. Each $n_k$ maps to a concrete agentic failure mode, such as interpretive drift (semantic distance between intended and actual agent interpretations), correction absorption, belief resistance, or commitment irreversibility. Escalation thresholds drive proportional system responses:

CQS Range	Action
> 0.8	Normal operation
0.6–0.8	Elevated monitoring, more probes
0.4–0.6	Restrictive (reversible actions)
0.2–0.4	Minimal autonomy, explicit auth
< 0.2	Safe state; autonomy disabled
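The score and the escalation table can be sketched as two small Python functions. Several points are assumptions made for illustration: each $n_k$ is taken to be normalized to $[0, 1]$, the dictionary keys and tier labels are invented here, and the ambiguous boundaries at 0.6 and 0.4 are assigned to the more restrictive tier (the table itself fixes only that 0.8 is not "normal" and 0.2 is not "safe state").

```python
SUBDIMENSIONS = (
    "interpretive_alignment",
    "correction_impact",
    "epistemic_alignment",
    "irreversibility_consumption",
    "synchronization_freshness",
    "swarm_coherence",
)


def control_quality_score(scores):
    """CQS(t) = min over the six normalized subdimension scores n_k(t)."""
    missing = [k for k in SUBDIMENSIONS if k not in scores]
    if missing:
        raise ValueError(f"missing subdimension scores: {missing}")
    values = [scores[k] for k in SUBDIMENSIONS]
    if any(not 0.0 <= v <= 1.0 for v in values):
        raise ValueError("scores must be normalized to [0, 1]")
    return min(values)


def escalation_tier(cqs):
    """Map a CQS value to the escalation table (boundary handling is an assumption)."""
    if cqs > 0.8:
        return "normal"
    if cqs > 0.6:
        return "elevated_monitoring"
    if cqs > 0.4:
        return "restrictive"
    if cqs >= 0.2:
        return "minimal_autonomy"
    return "safe_state"


# Five healthy subdimensions do not compensate for one stale synchronization score.
example = {name: 0.9 for name in SUBDIMENSIONS}
example["synchronization_freshness"] = 0.35
example_tier = escalation_tier(control_quality_score(example))
```

In this example the single low subdimension drives the system to the minimal-autonomy tier, which is the intended effect of using a minimum rather than an average.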

Runtime enforceability is systematically classified via a rubric distinguishing rules by timing, observability, determinacy, reversibility, and evidence clarity (Koch, 6 Apr 2026, Sahoo, 3 Mar 2026).

4. Delegation, Scope Attenuation, and Compositional Authorization

Agentic military AI demands dynamic delegation and recursive scope management, formalized in AMAGF via compositional authorization overlays (Ibrahim et al., 2 Jun 2026). This regime introduces multiple delegation primitives (simple, scoped, conditional, depth-bounded, and temporal), enabling granular, context-specific, and audit-traceable transfer of operational authority. A core accountability relation traces every action to its delegation chain, ensuring that command provenance can be reconstructed.

Scope attenuation functions rigidly confine agents to permitted operational envelopes, with overlays composed onto baseline access schemas by a composition operator. Soundness guarantees that new delegation semantics, while enabling agent autonomy, never retract prior human authority except via explicit revocation.
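A rough Python sketch of scoped, depth-bounded, and temporal delegation with attenuation is given below. It does not reproduce the formal relations of the cited work: the `Delegation` class, the modeling of scope as a set of action names, attenuation as set intersection, and the `revoked` set are all assumptions chosen to illustrate the idea that a delegate can never hold more authority than its grantor.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Delegation:
    grantor: str
    grantee: str
    scope: FrozenSet[str]                   # permitted actions (scoped delegation)
    expires_at: float                       # temporal bound
    max_depth: int                          # remaining re-delegations (depth bound)
    parent: Optional["Delegation"] = None   # link back toward the human root


def delegate(parent, grantee, scope, expires_at):
    """Re-delegate with attenuation: scope and lifetime can only shrink."""
    if parent.max_depth <= 0:
        raise PermissionError("delegation depth exhausted")
    return Delegation(
        grantor=parent.grantee,
        grantee=grantee,
        scope=parent.scope & frozenset(scope),
        expires_at=min(parent.expires_at, expires_at),
        max_depth=parent.max_depth - 1,
        parent=parent,
    )


def chain(d):
    """Reconstruct command provenance as (grantor, grantee) pairs, root first."""
    links = []
    while d is not None:
        links.append((d.grantor, d.grantee))
        d = d.parent
    return list(reversed(links))


def authorized(d, action, now, revoked=frozenset()):
    """An action is authorized only if every link in the chain still permits it."""
    while d is not None:
        if d in revoked or now >= d.expires_at or action not in d.scope:
            return False
        d = d.parent
    return True


root = Delegation("commander", "lead_agent",
                  frozenset({"observe", "track", "report"}), 100.0, 1)
scout = delegate(root, "scout", {"observe", "engage"}, 500.0)
```

Here `scout` ends up with scope `{"observe"}` and expiry `100.0`, since the requested `"engage"` permission and the longer lifetime were never held by the grantor. Revoking `root` invalidates `scout` as well, which mirrors the requirement that prior human authority is withdrawn only by explicit revocation.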

5. Runtime Governance, Telemetry, and Containment (MI9 Integration)

AMAGF subsumes and extends the MI9 runtime governance architecture for agent oversight (Wang et al., 5 Aug 2025), aggregating:

Agency-Risk Index (ARI): Context-sensitive risk scoring to select governance intensity.
Agentic Telemetry Schema (ATS): Machine-readable real-time event logging, tracking cognitive and operational states.
Continuous Authorization Monitoring: Enforcement and revocation mediated by policy, agent goals, and delegation graphs.
FSM-based Conformance Engines: Multi-step pattern detection for policy compliance.
Goal-conditioned Drift Detection: Differentiation between benign adaptation and unauthorized drift using distributional divergence metrics.
Graduated Containment: Risk-tiered intervention from increased human touchpoints (monitoring) to total execution isolation.
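As an illustration of FSM-based conformance checking, the sketch below encodes one multi-step policy: an "engage" event is conformant only after an approval has been requested and granted. The event names, state names, and transition table are hypothetical and are not taken from the MI9 telemetry schema; events outside the policy alphabet are simply ignored.

```python
# Allowed (state, event) -> next state transitions for a human-approval policy.
TRANSITIONS = {
    ("idle", "approval_requested"): "pending",
    ("pending", "approval_granted"): "approved",
    ("pending", "approval_denied"): "idle",
    ("approved", "engage"): "idle",
}

# Events the policy cares about; anything else passes through unchecked.
POLICY_EVENTS = {event for (_, event) in TRANSITIONS}


def first_violation(events):
    """Return the index of the first non-conformant event, or None if the trace conforms."""
    state = "idle"
    for index, event in enumerate(events):
        if event not in POLICY_EVENTS:
            continue
        next_state = TRANSITIONS.get((state, event))
        if next_state is None:
            return index   # policy event with no allowed transition from this state
        state = next_state
    return None


conformant = ["observe", "approval_requested", "approval_granted", "engage"]
unapproved = ["observe", "engage"]
after_denial = ["approval_requested", "approval_denied", "engage"]
```

`first_violation(conformant)` yields `None`, while the other two traces are flagged at the position of the unapproved `"engage"`. A containment layer could use the returned index to decide how far back to audit, though that coupling is not specified in the source.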

Military instantiations require credential root-of-trust, multi-level security labels, fail-safe hardware interlocks, physical execution isolation, and red/black separation in classified telemetry flows (Wang et al., 5 Aug 2025).

6. Governance Failures, Escalation, and Institutional Accountability

Agentic military AI introduces six characteristic governance failures: interpretive divergence, correction absorption, belief resistance, commitment irreversibility, state divergence, and cascade severance within agent swarms (Sahoo, 3 Mar 2026). Mechanisms addressing these include:

Interpretive Alignment Testing and Epistemic Governance Architecture for proactive verification.
Irreversibility Budgeting and Synchronization Protocols to bound action effect scope and state drift.
Swarm Governance Architecture to keep collective swarm behavior within architectural containment under adversarial or failure conditions.
Post-Incident Governance Review (PIGR): Automated accountability assignment after crossing CQS degradation thresholds.
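Irreversibility budgeting can be pictured as a ledger that irreversible actions draw down. The sketch below is an assumption-laden illustration: the source does not specify how costs are assigned or how the budget feeds the CQS, so the `IrreversibilityBudget` class, the unitless `cost` values, and the use of the remaining fraction as a normalized score are all hypothetical.

```python
class IrreversibilityBudget:
    """Tracks cumulative cost of irreversible actions against a fixed mission budget."""

    def __init__(self, total):
        if total <= 0:
            raise ValueError("budget must be positive")
        self.total = float(total)
        self.spent = 0.0

    def request(self, cost):
        """Approve and record the action only if it fits in the remaining budget."""
        if cost < 0:
            raise ValueError("cost must be non-negative")
        if self.spent + cost > self.total:
            return False          # refused: would exceed the budget, nothing recorded
        self.spent += cost
        return True

    def remaining_fraction(self):
        """Share of the budget still available, in [0, 1]."""
        return 1.0 - self.spent / self.total


budget = IrreversibilityBudget(10.0)
first = budget.request(4.0)    # fits within the budget
second = budget.request(7.0)   # would overrun, so it is refused
```

After these two calls only the first action has been charged, so the remaining fraction is 0.6. A refused request would plausibly be escalated for explicit human authorization rather than silently dropped, but that policy is outside what the source describes.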

Responsibilities are explicitly divided among agent developers, procurement agencies, operational commanders, national regulators, and international treaty organizations, connecting technical to institutional accountability.

7. Limitations, Controversies, and Prospective Norm Development

Davidović (Davidovic, 7 Apr 2026) identifies deep structural incompatibilities between core agentic AI capabilities (initiative, interpretation, goal-directedness, dynamic memory) and prevailing human-control-centric legal norms, such as those articulated by GGE-CCW and REAIM. LLM-based agent features, by realigning both epistemic and normative authority, threaten genuine human agency in high-consequence operations (“kill chain” displacement). The literature details two governance paths:

Categorical Ban: Total prohibition of LLM-driven agents in certain battlefield contexts.
Human-Centric Oversight Regime: Embedding scenario-based TEVV, dual-engine (LLM/rule) architectures, tamper-evident provenance logging, explicit confidence gating, workload and explainability metrics, and regular third-party/cross-jurisdictional verification.

This suggests that formal, measured, and compositional technical architectures may not suffice to offset all norm spillovers—political and ethical contestation over agentic autonomy in lethal settings is ongoing (Davidovic, 7 Apr 2026, Sahoo, 3 Mar 2026).

AMAGF brings together layered technical controls, real-time quantitative safety metrics, dynamic compositional authorization, and institutional assignment of accountability to enforce continuous, auditable governance over agentic military AI. Its layered approach connects standards-derived objectives to granular, runtime-enforceable mechanisms, informs mission-specific configuration, and assigns recovery/oversight duties in response to emergent autonomy risks. The framework represents the current research locus for the alignment and safety of agentic military AI systems under adversarial, high-stakes operational realities.