Harnessing Embodied Agents: Runtime Governance for Policy-Constrained Execution

Published 9 Apr 2026 in cs.RO | (2604.07833v2)

Abstract: Embodied agents are evolving from passive reasoning systems into active executors that interact with tools, robots, and physical environments. Once granted execution authority, the central challenge becomes how to keep actions governable at runtime. Existing approaches embed safety and recovery logic inside the agent loop, making execution control difficult to standardize, audit, and adapt. This paper argues that embodied intelligence requires not only stronger agents, but stronger runtime governance. We propose a framework for policy-constrained execution that separates agent cognition from execution oversight. Governance is externalized into a dedicated runtime layer performing policy checking, capability admission, execution monitoring, rollback handling, and human override. We formalize the control boundary among the embodied agent, Embodied Capability Modules (ECMs), and runtime governance layer, and validate through 1000 randomized simulation trials across three governance dimensions. Results show 96.2% interception of unauthorized actions, reduction of unsafe continuation from 100% to 22.2% under runtime drift, and 91.4% recovery success with full policy compliance, substantially outperforming all baselines (p<0.001). By reframing runtime governance as a first-class systems problem, this paper positions policy-constrained execution as a key design principle for embodied agent systems.


Authors (5)

Summary

The paper introduces a runtime governance framework that separates agent cognition from execution, enabling clear auditability and adaptable safety controls.
It employs a modular architecture with dedicated components like Capability Admission, Policy Guard, and Recovery Manager to enforce dynamic policy constraints.
Empirical evaluations in simulation demonstrate significant improvements in unauthorized action interception, runtime violation detection, and recovery success rates.

Runtime Governance for Policy-Constrained Execution in Embodied Agents

Motivation and Problem Formulation

Embodied agents are increasingly capable of persistent action in physical environments, tool use, and long-horizon task execution. As a result, the central systems concern is no longer only how to enable execution but how to govern it under explicit, environment- and context-dependent policies. Current approaches often conflate agent cognition and execution control, embedding safety or recovery logic directly into planner loops, policies, or controller code. This entanglement impairs standardization, auditability, transferability, and adaptability, especially across deployment contexts (simulation, real-world robots, human-shared spaces).

Contrary to common practice, the paper posits that embodied intelligence requires not only powerful agents but also robust runtime governance. The defining research problem is: how can an embodied agent retain persistent and adaptive execution capabilities while remaining subject to enforceable, observable, and recoverable policy constraints at runtime? The authors formulate this as a systems problem: constrain execution at the runtime layer rather than delegate all risk management to agent models, and explicitly separate agent cognition from execution governance.

Architectural Framework

The proposed framework introduces a three-entity architecture—embodied agent, modular capability packages, and a runtime governance layer—each with defined roles and interfaces. Agent cognition is responsible for goal interpretation, planning, and capability invocation proposal; capability packages define executable units annotated with preconditions, permissions, risk levels, and rollback metadata; the runtime governance layer assumes system-level execution authority, mediating every transition from intention to actuation based on current environment profiles and policy sets.
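As a rough illustration of what an annotated capability package might look like, the sketch below models one as a small data structure. All names here (`CapabilityPackage`, `RiskLevel`, the permission strings, the `grasp_object` example) are hypothetical and chosen for illustration; the paper's actual schema is not given in this summary.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Callable, Optional


class RiskLevel(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class CapabilityPackage:
    """Executable unit plus the metadata a governance layer would read."""
    name: str
    required_permissions: frozenset
    risk_level: RiskLevel
    # Preconditions are predicates over a (hypothetical) world-state dict.
    preconditions: tuple = ()
    # Name of a capability that undoes this one, if any.
    rollback_capability: Optional[str] = None

    def preconditions_hold(self, state: dict) -> bool:
        """True only if every precondition accepts the given state."""
        return all(check(state) for check in self.preconditions)


# Hypothetical example package: grasping needs arm and gripper permissions,
# requires the object to be visible, and can be undone by releasing.
grasp = CapabilityPackage(
    name="grasp_object",
    required_permissions=frozenset({"arm.move", "gripper.close"}),
    risk_level=RiskLevel.MEDIUM,
    preconditions=(lambda s: s.get("object_visible", False),),
    rollback_capability="release_object",
)
```

Keeping the metadata declarative, rather than buried in the capability's code, is what would let an external layer reason about a request before anything is actuated.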

The runtime governance layer itself comprises six coordinated components:

Capability Admission: Performs admission control based on permissions, registration, and policy membership.
Policy Guard: Applies environment- and request-specific policy checks and can modify invocations into safe conformant forms.
Execution Watcher: Monitors live execution, detects anomalies, constraint violations, and runtime drift.
Recovery and Rollback Manager: Handles recovery in a policy-informed manner, explicitly managing retries, rollbacks, or human escalation.
Human Override Interface: Facilitates approval, intervention, and control transfer to human operators, parameterized by environment and policy.
Audit and Telemetry Layer: Logs all decisions and interventions for post-hoc analysis, audit, and compliance.

The pipeline is organized as a governed execution lifecycle: from goal interpretation, capability proposal, mediated admission and policy checking, governed execution launch, through runtime constraint monitoring, recovery/intervention on anomaly, to completion, audit, and replanning.
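The lifecycle above can be sketched as a single mediated call path. This is a minimal toy, not the authors' implementation: the `Governor` class, its fields, and the convention that a detected violation surfaces as a `RuntimeError` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    NEEDS_APPROVAL = "needs_approval"


@dataclass
class Request:
    capability: str
    permissions: set
    high_risk: bool = False


@dataclass
class Governor:
    registered: set                 # capabilities known to the runtime
    granted_permissions: set        # permissions active in this deployment
    approver: object = None         # hypothetical human-override hook: Request -> bool
    audit_log: list = field(default_factory=list)

    def admit(self, req):
        # Capability Admission: registration and permission membership.
        if req.capability not in self.registered:
            return Decision.DENY
        if not req.permissions <= self.granted_permissions:
            return Decision.DENY
        # Policy Guard (toy rule): high-risk requests need human approval.
        return Decision.NEEDS_APPROVAL if req.high_risk else Decision.ALLOW

    def run(self, req, execute, rollback):
        decision = self.admit(req)
        if decision is Decision.NEEDS_APPROVAL:
            approved = self.approver is not None and self.approver(req)
            decision = Decision.ALLOW if approved else Decision.DENY
        if decision is Decision.DENY:
            self.audit_log.append((req.capability, "denied"))
            return "denied"
        try:
            execute()               # watcher stand-in: violations raise RuntimeError
            outcome = "completed"
        except RuntimeError:
            rollback()              # Recovery and Rollback Manager stand-in
            outcome = "rolled_back"
        self.audit_log.append((req.capability, outcome))
        return outcome


governor = Governor(registered={"move_arm"}, granted_permissions={"arm.move"})
```

Every path through `run` ends with an audit entry, which mirrors the summary's point that decisions and interventions are logged regardless of outcome.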

Policy-Constrained Execution Pipeline

Execution is formalized as a governed transformation $E_t = \mathcal{GOV}(\mathcal{P}_t, C_i, \Pi_t, \Gamma_t, \Omega_t)$, where agent proposals ($\mathcal{P}_t$), capability metadata ($C_i$), active policy sets ($\Pi_t$), environment contexts ($\Gamma_t$), and runtime telemetry ($\Omega_t$) collectively gate and modulate embodied execution.
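One way to read this formalization is as a function of five inputs whose output decides whether execution proceeds. The sketch below assumes the five arguments correspond, in order, to the five inputs named in the sentence, and it assumes each policy is a boolean predicate; the function `gov`, the `Inputs` tuple, and both example policies are hypothetical.

```python
from typing import NamedTuple


class Inputs(NamedTuple):
    proposal: dict      # what the agent asks for
    capability: dict    # metadata of the requested capability
    env: dict           # environment context
    telemetry: dict     # runtime telemetry


def gov(proposal, capability, policies, env, telemetry):
    """Return 'execute' if every active policy is satisfied, else name the violations."""
    inputs = Inputs(proposal, capability, env, telemetry)
    violated = [p.__name__ for p in policies if not p(inputs)]
    return "execute" if not violated else "block:" + ",".join(violated)


# Two hypothetical policies; True means "satisfied".
def speed_within_limit(i):
    return i.proposal["speed"] <= i.env["max_speed"]


def no_human_nearby(i):
    return i.telemetry["human_distance_m"] >= i.env["min_human_distance_m"]


env = {"max_speed": 0.5, "min_human_distance_m": 1.0}
policies = [speed_within_limit, no_human_nearby]
clear = gov({"speed": 0.2}, {}, policies, env, {"human_distance_m": 2.0})
close = gov({"speed": 0.2}, {}, policies, env, {"human_distance_m": 0.4})
```

Because telemetry is an input, the same proposal can be allowed at one moment and blocked at the next, which is the behavior a purely pre-execution check cannot provide.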

Policy constraints are environment-profile-dependent, supporting dynamic adaptation without re-writing agent behavior. Recoverability and auditability are treated as first-class properties, addressing the inherent risks and non-reversibility of physical actuation. Human authority is structurally encoded, enabling approval, override, and escalation as policy-governed—not ad hoc—controls.
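To make the environment-profile idea concrete, the sketch below applies the same agent request under different profiles and gets different constraints back, without touching the agent. The profile names, speed limits, and approval thresholds are invented for illustration and are not taken from the paper.

```python
# Hypothetical profiles: risk is an integer 1..3, so a threshold of 4 means
# "approval is never required" in that profile.
PROFILES = {
    "simulation":   {"max_speed": 1.0,  "approval_at_risk": 4},
    "lab":          {"max_speed": 0.5,  "approval_at_risk": 3},
    "human_shared": {"max_speed": 0.25, "approval_at_risk": 2},
}


def constrain(profile_name, requested_speed, risk):
    """Rewrite a request into a form that conforms to the active profile."""
    profile = PROFILES[profile_name]
    return {
        # Clamp instead of reject: the request is modified into a safe form.
        "speed": min(requested_speed, profile["max_speed"]),
        "needs_human_approval": risk >= profile["approval_at_risk"],
    }


in_sim = constrain("simulation", 0.8, 2)
in_shared = constrain("human_shared", 0.8, 2)
```

Here the identical request runs at full speed in simulation but is slowed and routed to a human in a shared space; switching deployments means switching the profile, not rewriting agent behavior.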

Empirical Evaluation

Extensive simulation-based evaluation is conducted in Gazebo using a UR5e manipulator and a set of canonical navigation, manipulation, and composite tasks. The framework is benchmarked against direct execution (no governance), static-rule (pre-execution validation), and capability-internal safety baselines. Metrics include Unauthorized Action Interception Rate (UAIR), Runtime Violation Detection Rate (RVDR), Unsafe Continuation Rate (UCR), Recovery Success Rate (RSR), and Recovery Policy Compliance (RPC).
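The summary does not spell out how these metrics are computed. The sketch below assumes the most natural ratio definitions for three of them (UAIR, UCR, RSR) and applies them to hand-made toy records; the `Trial` fields and the data are hypothetical and do not reproduce the paper's results. RVDR and RPC are omitted because their definitions are less obvious from the names alone.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    unauthorized: bool = False        # the agent requested something it may not do
    intercepted: bool = False         # governance blocked that request
    violation: bool = False           # a runtime violation occurred
    continued: bool = False           # execution went on despite the violation
    recovery_attempted: bool = False
    recovered: bool = False


def rate(numerator, denominator):
    return numerator / denominator if denominator else float("nan")


def summarize(trials):
    unauthorized = [t for t in trials if t.unauthorized]
    violations = [t for t in trials if t.violation]
    recoveries = [t for t in trials if t.recovery_attempted]
    return {
        # Assumed: intercepted unauthorized requests / all unauthorized requests.
        "UAIR": rate(sum(t.intercepted for t in unauthorized), len(unauthorized)),
        # Assumed: violations with continued execution / all violations.
        "UCR": rate(sum(t.continued for t in violations), len(violations)),
        # Assumed: successful recoveries / attempted recoveries.
        "RSR": rate(sum(t.recovered for t in recoveries), len(recoveries)),
    }


toy_trials = (
    [Trial(unauthorized=True, intercepted=True)] * 3
    + [Trial(unauthorized=True)]
    + [Trial(violation=True, continued=True)]
    + [Trial(violation=True, recovery_attempted=True, recovered=True)] * 3
    + [Trial(violation=True, recovery_attempted=True)]
)
summary = summarize(toy_trials)
```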

Key results:

UAIR: The proposed runtime governance framework yields an interception rate of $96.2\% \pm 2.7\%$ for unauthorized actions—significantly higher than static-rule and capability-internal baselines.
Runtime Violation Enforcement: Unsafe continuation is reduced from $100\%$ (baselines) to $22.2\% \pm 3.1\%$ under runtime drift, with strict policy compliance during intervention.
Recovery: The recovery success rate attains $91.4\% \pm 3.0\%$, and policy compliance is $1.0$, outperforming all baselines ($p < 0.001$).
Component Ablation: Removal of the Execution Watcher disables runtime detection; omitting the Recovery Manager drops recovery success to $28.1\%$. The Human Override Interface, when active, blocks $100\%$ of unapproved high-risk requests, versus $65.8\%$ without it.

Governance-layer per-action latency remains small at the reported tail percentile, introducing negligible overhead relative to control loop cycles.

Theoretical and Practical Implications

Formally disentangling agent cognition from runtime execution control creates clean boundaries, enabling modular policy design, environment profile portability, and improved system auditability. Agent, capability, and governance layers can now independently evolve, supporting future integration with learned anomaly detection, richer policy languages, or complex multi-agent deployments. Policies can be authored and validated externally, reflecting regulatory requirements (e.g., EU AI Act mandates for runtime monitoring and oversight) directly in operational code, improving deployment compliance.

Runtime governance provides a principled substrate for addressing the increasing risk gradient as agents operate in less constrained and more human-centric environments. The explicit treatment of recovery, rollback, and human oversight as structural pipeline components addresses longstanding gaps in embodied AI, where failures can propagate into unsafe physical states absent reactive governance.

Limitations and Future Work

Not all system modalities are amenable to externalized governance; reflexive servo-controllers and end-to-end visuomotor policies may require action-space, rather than capability-level, gating. False-negatives in violation detection (e.g., lower rates for human proximity violations in low-sensitivity environment profiles) highlight calibration trade-offs and motivate continued development of adaptive watcher sensitivity and environment-aware policy tuning.
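The calibration trade-off can be illustrated with a toy proximity check whose threshold depends on the profile's sensitivity. The thresholds, the ground-truth limit, and the distance readings below are all invented for illustration; the paper's actual watcher logic is not described in this summary.

```python
# Hypothetical distance (in meters) below which the watcher flags a violation.
PROXIMITY_THRESHOLD_M = {"low": 0.3, "medium": 0.6, "high": 1.0}

# Assumed ground truth for this toy: any reading under 1.0 m is a real violation.
TRUE_LIMIT_M = 1.0

readings = [0.2, 0.5, 0.8, 1.5]


def proximity_violation(distance_m, sensitivity):
    """True if the watcher would flag this reading at the given sensitivity."""
    return distance_m < PROXIMITY_THRESHOLD_M[sensitivity]


def false_negatives(sensitivity):
    """Real violations that the watcher fails to flag at this sensitivity."""
    return [
        d for d in readings
        if d < TRUE_LIMIT_M and not proximity_violation(d, sensitivity)
    ]


missed_low = false_negatives("low")
missed_high = false_negatives("high")
```

In this toy, the low-sensitivity profile misses the 0.5 m and 0.8 m readings while the high-sensitivity profile misses none, at the likely cost of more interruptions in practice.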

Real-robot validation, multi-agent extensions, and integration with advanced policy authoring tools are identified as necessary future directions. Policy quality remains an upper bound on governance efficacy, and richer formal/compositional policy languages are needed for large-scale, heterogeneous deployments.

Conclusion

This work rigorously formalizes the systems challenge of governable embodied execution and presents a practical, modular runtime governance architecture. By externalizing and operationalizing runtime policy enforcement, monitoring, recovery, and human oversight, the framework establishes policy-constrained execution as a fundamental design principle for persistent embodied agent systems. The presented results show that execution-governance separation materially improves safety, auditability, and adaptability without incurring performance penalties. As deployments of embodied agents progress beyond demonstration to real-world, multi-context settings, runtime governance will become essential for robust, compliant, and trustworthy autonomous systems.

Reference:

"Harnessing Embodied Agents: Runtime Governance for Policy-Constrained Execution" (2604.07833).
