- The paper presents AEROS, a unified single-agent robotic architecture that decouples persistent cognition from modular embodied capability modules for robust and safe operation.
- Empirical evaluations demonstrate 100% task success on multiple manipulation and navigation tasks, highlighting superior performance under high skill failure rates compared to baselines.
- The architecture’s modular design enables dynamic ECM hot-swapping and strict policy enforcement, ensuring scalable extensibility and operational safety.
AEROS: Unified Single-Agent Robotic Architecture with Embodied Capability Modules
Architectural Motivation and Design
AEROS addresses a key deficiency in contemporary robotic systems: the absence of a principled, unified abstraction for intelligence, capability extension, and safe execution. Current paradigms either entangle skills and control logic in monolithic frameworks, which impedes extensibility, or rely on loosely coordinated modules or agent collections, often at the cost of fragmented identity, inconsistent authority, and weak safety guarantees. The central thesis of AEROS is the Single-Agent Robot Principle: each robot is realized as one persistent agent, the unique locus of identity, memory, and decision authority, and its abilities are extended exclusively through modular Embodied Capability Modules (ECMs) rather than by adding more agents.
AEROS is instantiated in a three-layer architecture: (1) a persistent agent layer maintaining all cognitive state and orchestrating execution, (2) an ECM layer providing modular, installable skill, tool, and model packages, and (3) a runtime layer decoupling policy enforcement (including safety and permissions) from capability implementation. This is formalized in the system tuple R=(A,E,Π), where A is the agent, E the set of installed ECMs, and Π the active policy configuration.
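To make the tuple concrete, the following is a minimal Python sketch of R=(A,E,Π) as plain data types. All class and field names (`Agent`, `ECM`, `Policy`, `Robot`, `allowed_skills`, `available_skills`) are illustrative assumptions, not identifiers from AEROS, and the policy is reduced to a simple skill allow-list for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """A: the single persistent agent holding identity and memory."""
    name: str
    memory: list[str] = field(default_factory=list)


@dataclass(frozen=True)
class ECM:
    """One element of E: a passive, installable capability package."""
    name: str
    skills: frozenset[str]


@dataclass
class Policy:
    """Pi: the active policy configuration (assumed here: a skill allow-list)."""
    allowed_skills: set[str]


@dataclass
class Robot:
    """R = (A, E, Pi): one agent, a set of installed ECMs, one active policy."""
    agent: Agent
    ecms: dict[str, ECM]
    policy: Policy

    def available_skills(self) -> set[str]:
        # A skill is usable only if some installed ECM provides it
        # and the active policy permits it.
        provided = set().union(*(e.skills for e in self.ecms.values()))
        return provided & self.policy.allowed_skills


robot = Robot(
    agent=Agent("panda-01"),
    ecms={"grasp": ECM("grasp", frozenset({"pick", "place"}))},
    policy=Policy(allowed_skills={"pick"}),
)
print(robot.available_skills())  # {'pick'}
```

The point of the sketch is that the agent, the ECM set, and the policy are three separate values: swapping any one of them leaves the other two untouched.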


Figure 1: AEROS enables a single persistent agent to operate across simulated and physical platforms via installable ECMs.
This architectural separation confers several critical properties: modular extensibility; consistent, system-wide safety and resource enforcement; composable closed-loop execution; and dynamic capability management without recourse to multi-agent complexity or cross-cutting control roles.
ECM Abstraction and Runtime Policy Separation
Each ECM packages atomic skills, supporting models and tools, explicit capabilities, and associated metadata (including resource requirements and declarative permissions). ECMs are strictly passive: they expose executable endpoints that the agent invokes and never instantiate independent decision processes, which preserves single-agent integrity. ECMs follow an explicit lifecycle (install, configure, activate, deactivate, remove) and can be hot-swapped at runtime without system downtime, a capability the study validates empirically.
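The lifecycle can be read as a small state machine. The sketch below is one possible encoding, not the AEROS implementation: the state names, the `ECMRegistry` class, and the `hot_swap` helper are hypothetical, and the transition table encodes assumptions the source does not state (deactivate returns to the configured state; an active ECM must be deactivated before removal).

```python
from enum import Enum, auto


class State(Enum):
    ABSENT = auto()
    INSTALLED = auto()
    CONFIGURED = auto()
    ACTIVE = auto()


# (current state, operation) -> next state.
# Assumption: these are the only legal transitions; the source lists the
# five operations but not which states they may be applied in.
TRANSITIONS = {
    (State.ABSENT, "install"): State.INSTALLED,
    (State.INSTALLED, "configure"): State.CONFIGURED,
    (State.CONFIGURED, "activate"): State.ACTIVE,
    (State.ACTIVE, "deactivate"): State.CONFIGURED,
    (State.INSTALLED, "remove"): State.ABSENT,
    (State.CONFIGURED, "remove"): State.ABSENT,
}


class ECMRegistry:
    """Tracks the lifecycle state of each ECM by name."""

    def __init__(self):
        self._states = {}

    def state(self, name):
        return self._states.get(name, State.ABSENT)

    def apply(self, name, op):
        key = (self.state(name), op)
        if key not in TRANSITIONS:
            raise ValueError(
                f"cannot {op} ECM {name!r} in state {self.state(name).name}"
            )
        self._states[name] = TRANSITIONS[key]
        return self._states[name]

    def active(self):
        return sorted(n for n, s in self._states.items() if s is State.ACTIVE)


def hot_swap(registry, old, new):
    """Activate `new` before retiring `old`, so the capability never disappears."""
    for op in ("install", "configure", "activate"):
        registry.apply(new, op)
    for op in ("deactivate", "remove"):
        registry.apply(old, op)


registry = ECMRegistry()
for op in ("install", "configure", "activate"):
    registry.apply("gripper_v1", op)
hot_swap(registry, "gripper_v1", "gripper_v2")
print(registry.active())  # ['gripper_v2']
```

Bringing the replacement up before taking the old module down is a design choice of this sketch; it keeps at least one provider of the capability active throughout the swap.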
The runtime policy engine forms a hard boundary that enforces execution constraints, resource quotas, and sensor/actuator access independently of both agent logic and ECM implementation. Policies can be reconfigured dynamically, so the same ECM can operate within different safety envelopes. With policy separated from skill logic in this way, the engine blocked every invalid action with zero false acceptances across 1,800 permission checks in simulation, and the outcomes were deterministic.
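A rough sketch of such a boundary, under stated assumptions, is shown below. The `Action`, `Policy`, and `PolicyEngine` types, the actuator names, and the speed limit are all hypothetical; the source describes what the engine enforces, not its interface. The check is default-deny: an ECM with no entry in the policy is granted no actuators.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    ecm: str               # name of the ECM whose endpoint is being invoked
    skill: str
    actuators: frozenset   # actuators the action needs
    max_speed: float       # requested speed, m/s


@dataclass
class Policy:
    permitted_actuators: dict  # ECM name -> set of actuators it may use
    speed_limit: float         # m/s, applies to every ECM


class PolicyEngine:
    """Sits between the agent and ECM endpoints; knows nothing about skill logic."""

    def __init__(self, policy):
        self.policy = policy

    def reconfigure(self, policy):
        # Policies are swapped at runtime without touching any ECM.
        self.policy = policy

    def check(self, action):
        """Return (allowed, reason). Unknown ECMs are granted no actuators."""
        granted = self.policy.permitted_actuators.get(action.ecm, set())
        denied = action.actuators - granted
        if denied:
            return False, f"actuators not permitted: {sorted(denied)}"
        if action.max_speed > self.policy.speed_limit:
            return False, "speed limit exceeded"
        return True, "ok"


lab = Policy({"arm": {"joints", "gripper"}}, speed_limit=1.0)
shared_space = Policy({"arm": {"joints", "gripper"}}, speed_limit=0.25)

engine = PolicyEngine(lab)
reach = Action("arm", "reach", frozenset({"joints"}), max_speed=0.5)
print(engine.check(reach))   # (True, 'ok')

engine.reconfigure(shared_space)
print(engine.check(reach))   # (False, 'speed limit exceeded')
```

The same `reach` action from the same ECM is accepted under one policy and rejected under the other, which is the behavior the paragraph above attributes to dynamic policy reconfiguration.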
Empirical Evaluation
AEROS is evaluated in PyBullet simulation using a Franka Panda 7-DOF manipulator across eight experiments: dynamic re-planning, failure recovery, policy enforcement, comparison with BehaviorTree.CPP- and ProgPrompt-style baselines, generality across manipulation and navigation tasks, ECM hot-swapping, ablations of the re-planning, policy, and recovery subsystems, and robustness under extreme stochastic failure rates.
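The headline results are 100% task success across multiple manipulation and navigation tasks and stronger performance than the baselines under high skill failure rates.
Moreover, ECMs can be loaded “hot”: new capabilities are immediately recognized and used by the persistent agent, with 100% post-swap success, which confirms that capabilities can be extended at deployment time.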
Comparative and Ablation Analyses
The direct comparison with established programming models is particularly informative. Both BehaviorTree.CPP and ProgPrompt-style LLM task planners embed fallback, retry, or re-planning mechanisms, but they lack the unified single-agent abstraction and strict runtime policy separation of AEROS. AEROS is the only system in the comparison reported as fully robust: its advantage over BT.CPP is statistically significant (p<0.05 on all tasks), and it comes with no reported penalty in wall-clock planning cost.
Ablation studies confirm that the dynamic re-planning and recovery mechanisms account for the observed performance edge (mean reductions of 5.7 pp and 8.3 pp when each is removed individually). The policy layer, by contrast, serves mainly as a safety boundary under adversarial or malformed skill/ECM scenarios and adds negligible execution-time overhead (<0.01 ms per check).
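The structure of such an ablation can be illustrated with a toy simulation. This is a sketch under strong assumptions, not the paper's experiment: "recovery" is modeled as retrying a failed skill in place, "re-planning" as falling back to an alternative skill sequence, and the plans, failure probability, and retry count are invented for illustration. The numbers it prints are not the paper's results.

```python
import random


def execute(plans, skill_fail_prob, rng, max_retries=2,
            recovery=True, replanning=True):
    """Run the first plan that completes; return True on task success.

    plans: alternative skill sequences for the same goal (hypothetical).
    recovery: if True, retry a failed skill in place up to max_retries times.
    replanning: if True, fall back to the next plan when one is abandoned.
    """
    candidates = plans if replanning else plans[:1]
    attempts = 1 + (max_retries if recovery else 0)
    for plan in candidates:
        completed = True
        for _skill in plan:
            # Each attempt fails independently with probability skill_fail_prob;
            # the skill succeeds if any attempt does.
            if not any(rng.random() >= skill_fail_prob for _ in range(attempts)):
                completed = False
                break
        if completed:
            return True
    return False


def success_rate(trials=2000, seed=0, fail_prob=0.3, **switches):
    """Fraction of successful trials for one ablation configuration."""
    rng = random.Random(seed)
    plans = [["reach", "grasp", "lift"], ["push", "reach", "grasp", "lift"]]
    wins = sum(execute(plans, fail_prob, rng, **switches) for _ in range(trials))
    return wins / trials


print("full system:     ", success_rate())
print("no re-planning:  ", success_rate(replanning=False))
print("no recovery:     ", success_rate(recovery=False))
```

Disabling one mechanism at a time while holding the seed and failure rate fixed is the same pattern as the ablation described above: the drop relative to the full configuration is attributed to the removed mechanism.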
Implications and Prospects
AEROS provides a strong architectural foundation for robotic systems demanding extensibility, safety, composability, and operational coherence. By decoupling persistent agent cognition and state from modular capability extension, and by separating strict, runtime-enforced policies from skill logic, it offers a robust path toward constructing embodied intelligence platforms supporting continuous capability evolution and principled, scalable skill ecosystems.
The practical implications include: streamlined deployment and maintenance of robotic capability upgrades, enhanced system verification via strict policy checking, reliable on-the-fly task switching under uncertainty, and reduced debugging complexity through a unified locus of control. Theoretically, AEROS’s architecture opens research directions in safe embodied autonomy, large-scale skill market development via ECM schema standardization, and formal methods for persistent agent verification.
Next steps include: validating the architecture on physical robots beyond simulation, benchmarking scalability with large ECM and task sets, running longitudinal studies of persistent agent memory and experience aggregation, and integrating variable-latency planners (e.g., LLMs) under real-time constraints.
Conclusion
AEROS delivers a formal single-agent operating architecture for robotics with embodied modular extension and strict runtime policy enforcement. Empirical results demonstrate decisive advantages in robustness, extensibility, and compositional safety over prominent baselines, with clear operational and conceptual benefits. This work provides a compelling blueprint for future embodied AI systems, supporting scalable, principled growth and compositional reasoning in complex, dynamic environments (2604.07039).