EAGER Design Principles

Updated 21 November 2025

EAGER design principles are a multidimensional framework defining five key pillars—enactment, acquisition, governance, exploration, and rigor—for building adaptive, robust systems.
They guide the design of agentic systems by integrating modular skill architectures, proactive stress-testing, and dynamic knowledge acquisition to enhance efficiency and resilience.
Empirical studies using these principles demonstrate improved system performance through formal ontologies, coordinated governance, and robust self-improvement protocols.

EAGER Design Principles define a multidimensional engineering framework aimed at producing adaptive, efficient, generalizable, explainable, and robust systems across both autonomous agents and AI-powered software analysis platforms. The term "EAGER" has been independently instantiated in technical literatures to structure foundational agentic capabilities (Abuelsaad et al., 17 Jul 2024), system-level resilience to the unexpected (Marron et al., 2020), and rigorous, LLM-driven software design assessment (Kolhatkar et al., 14 Aug 2025). Although their emphases differ, these sources unify around a set of five axiomatic pillars: (E) Enactment of behaviors, (A) Acquisition of knowledge, (G) Governance and collaboration, (E) Exploration and stress-testing, and (R) Rigorous foundations. Each pillar codifies specific principles and techniques for engineering systems that anticipate, adapt to, and remediate both expected and unprecedented conditions.

1. EAGER as a Foundational Agentic Framework

The EAGER schema integrates the key properties necessary for agentic and autonomous systems: Efficiency, Adaptability, Generalizability, Explainability, and Robustness (Abuelsaad et al., 17 Jul 2024; Marron et al., 2020). In multi-agent architectures, these properties dictate the compositional skill set, observation granularity, hierarchical organization, self-monitoring, and ongoing self-improvement workflows. For resilient autonomous system design, EAGER encapsulates the operational, epistemic, social, explorative, and ontological strata required for robust real-world deployment.

Pillar Definitions and Interlock

Pillar	Core Function	Example System Contexts
Enactment	Reactive/proactive skills, skill primitives	Robots in dynamic environments
Acquisition	Knowledge management, learning, ontology updates	LLMs, agent memory, AVs
Governance	Social coordination, collaboration	Multi-agent routing, HRI
Exploration	Proactive stress-testing, simulation, fuzzing	AV scenario simulation, code testing
Rigor	Formal ontologies, theoretical frameworks for "unexpectedness"	LLM critique, agent safety

These pillars form a mutually reinforcing framework, ensuring that no single property dominates at the expense of others (Marron et al., 2020).

2. Efficient and Adaptive Skill Architectures

In the context of agentic systems, efficiency and adaptability are primarily realized by exposing a minimal yet expressive set of domain-specific primitive skills (Abuelsaad et al., 17 Jul 2024). Each primitive skill $S: (\text{state} \times \text{params}) \rightarrow (\text{state}', \text{obs})$ abstracts complex low-level functionality (e.g., open_url, click, enter_text) into robust, parameterizable actions. Agent-E, for example, leverages five primitives to encapsulate browser navigation, drastically reducing LLM call count and providing explicit, interpretable logs (Abuelsaad et al., 17 Jul 2024).
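The skill signature above can be illustrated with a minimal sketch. It is not Agent-E's implementation: `BrowserState`, `SKILLS`, and `run_plan` are hypothetical names, and the "browser" is a plain in-memory record, used only to show how a skill maps a state and parameters to a new state and an observation.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class BrowserState:
    """Hypothetical, in-memory stand-in for real browser state."""
    url: str = "about:blank"
    fields: Tuple[Tuple[str, str], ...] = ()  # (selector, text) pairs


# A skill maps (state, params) -> (new_state, observation).
Skill = Callable[[BrowserState, Dict[str, str]], Tuple[BrowserState, str]]


def open_url(state: BrowserState, params: Dict[str, str]) -> Tuple[BrowserState, str]:
    # Navigating resets any text previously entered on the page.
    new_state = replace(state, url=params["url"], fields=())
    return new_state, f"Opened {params['url']}"


def enter_text(state: BrowserState, params: Dict[str, str]) -> Tuple[BrowserState, str]:
    updated = dict(state.fields)
    updated[params["selector"]] = params["text"]
    new_state = replace(state, fields=tuple(sorted(updated.items())))
    return new_state, f"Entered text into {params['selector']}"


SKILLS: Dict[str, Skill] = {"open_url": open_url, "enter_text": enter_text}


def run_plan(plan: List[Tuple[str, Dict[str, str]]],
             state: BrowserState = BrowserState()) -> Tuple[BrowserState, List[str]]:
    """Apply skills in order, collecting observations as an interpretable log."""
    log: List[str] = []
    for name, params in plan:
        state, obs = SKILLS[name](state, params)
        log.append(obs)
    return state, log
```

Because every skill returns an observation string, the log produced by `run_plan` is the kind of explicit, step-by-step record the paragraph describes.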

Adaptivity is further enhanced by modularly introducing new skills and tuning observation schemas. In the "WIP: Leveraging LLMs" pipeline, adaptability is managed using retrieval-augmented generation (RAG), where a dynamically indexed knowledge base of design principle definitions and canonical refactorings is embedded directly into the agent's prompts (Kolhatkar et al., 14 Aug 2025).
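As a rough illustration of embedding retrieved principle definitions into a prompt, the sketch below ranks knowledge-base entries by simple word overlap. The overlap score is an assumed stand-in for whatever index the cited pipeline uses, and `retrieve`, `build_prompt`, and the sample entries are hypothetical.

```python
import re
from typing import Dict, List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase alphabetic tokens; a crude stand-in for an embedding."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(query: str, kb: Dict[str, str], k: int = 2) -> List[str]:
    """Return the names of the k entries sharing the most words with the query."""
    q = tokenize(query)
    ranked = sorted(
        kb,
        key=lambda name: (-len(q & tokenize(name + " " + kb[name])), name),
    )
    return ranked[:k]


def build_prompt(question: str, kb: Dict[str, str], k: int = 2) -> str:
    """Embed the retrieved definitions directly into the prompt text."""
    names = retrieve(question, kb, k)
    context = "\n".join(f"- {name}: {kb[name]}" for name in names)
    return f"Design principles:\n{context}\n\nQuestion:\n{question}"


# Illustrative entries only; not taken from the cited knowledge base.
KB = {
    "Single Responsibility": "A class should have one reason to change.",
    "DRY": "Avoid duplicated logic; extract shared code.",
    "Law of Demeter": "Talk only to immediate collaborators.",
}
```

Adding a principle to `KB` changes what the prompt can contain without touching the agent logic, which is the adaptability property the paragraph attributes to the RAG setup.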

Efficiency is also reinforced via token-efficient input representations—e.g., flexible observation distillation, which selects from various DOM views (text_only, input_fields, all_fields) using utility versus token-size trade-offs (Abuelsaad et al., 17 Jul 2024).
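One way to picture the utility versus token-size trade-off is a selector that returns the most useful view fitting a token budget. This is a sketch under assumptions: the utility scores, the whitespace-based token estimate, and the function names are hypothetical, and only the view names come from the text.

```python
from typing import Dict, Optional


def estimate_tokens(text: str) -> int:
    """Rough whitespace-based proxy; a real system would use a tokenizer."""
    return len(text.split())


def select_view(views: Dict[str, str],
                utility: Dict[str, float],
                token_budget: int) -> Optional[str]:
    """Return the highest-utility view that fits the budget.

    Ties in utility are broken in favour of the smaller view.
    Returns None if no view fits.
    """
    candidates = [
        (utility[name], -estimate_tokens(text), name)
        for name, text in views.items()
        if estimate_tokens(text) <= token_budget
    ]
    if not candidates:
        return None
    return max(candidates)[2]


# Illustrative page content and assumed utility scores.
VIEWS = {
    "text_only": "Welcome to the shop",
    "input_fields": "search box submit button",
    "all_fields": "Welcome to the shop search box submit button nav footer links",
}
UTILITY = {"text_only": 0.4, "input_fields": 0.7, "all_fields": 0.9}
```

With a tight budget the selector falls back to a compact view, and with a generous budget it returns the full one.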

3. Knowledge Acquisition, Self-Reflection, and Continual Improvement

Knowledge acquisition encompasses runtime learning, ontology management, and adaptive world modeling. Autonomous systems utilize both pre-loaded and continually updated knowledge sources: digital twins, ontological classes, sensor registries, and external data streams (e.g., weather feeds for AVs) (Marron et al., 2020). If an observed event cannot be mapped to any existing ontological class, the system flags it as "unexpected," triggering higher-level mitigation or fallback (Marron et al., 2020).
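The flag-and-fallback rule can be sketched as follows. The ontology here is a hypothetical dictionary of predicates with made-up event fields; real ontologies are far richer, so this shows only the control flow in which an unmapped event is treated as unexpected.

```python
from typing import Callable, Dict, Optional

Event = Dict[str, object]
Ontology = Dict[str, Callable[[Event], bool]]

# Hypothetical classes and event fields, for illustration only.
ONTOLOGY: Ontology = {
    "pedestrian": lambda e: e.get("kind") == "human" and e.get("speed_mps", 0) < 4,
    "vehicle": lambda e: e.get("kind") == "machine" and e.get("wheels", 0) >= 2,
}


def classify(event: Event, ontology: Ontology) -> Optional[str]:
    """Return the first ontological class whose predicate accepts the event."""
    for name, predicate in ontology.items():
        if predicate(event):
            return name
    return None


def handle(event: Event, ontology: Ontology) -> str:
    """Route classifiable events to nominal handling, everything else to fallback."""
    cls = classify(event, ontology)
    if cls is None:
        return "fallback: unexpected event"
    return f"nominal: {cls}"
```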

Agentic self-improvement in LLM-based systems involves online caching of prompt responses and offline mining of (task, plan, outcome) tuples for workflow refinement. Self-aware failure detection prompts human-in-the-loop demonstrations to extend the agent's repertoire with new, explicitly coded workflows (Abuelsaad et al., 17 Jul 2024). This narrows the gap between model-based and classical rule-driven behavior, with empirical gains in both runtime performance and error avoidance.
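The two mechanisms named above, online caching and offline mining, can be sketched as below. `PromptCache` and `mine_workflows` are hypothetical names, the "LLM" is any callable from prompt to response, and picking the most frequently successful plan is an assumed mining rule rather than the one used in the cited work.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Tuple

Plan = Tuple[str, ...]


class PromptCache:
    """Online cache: identical prompts are answered without a second model call."""

    def __init__(self, llm: Callable[[str], str]) -> None:
        self._llm = llm
        self._store: Dict[str, str] = {}
        self.calls = 0  # number of real model invocations

    def ask(self, prompt: str) -> str:
        if prompt not in self._store:
            self.calls += 1
            self._store[prompt] = self._llm(prompt)
        return self._store[prompt]


def mine_workflows(history: List[Tuple[str, Plan, bool]]) -> Dict[str, Plan]:
    """Offline mining: for each task, keep the plan that succeeded most often."""
    wins: Dict[str, Counter] = defaultdict(Counter)
    for task, plan, succeeded in history:
        if succeeded:
            wins[task][plan] += 1
    return {task: counts.most_common(1)[0][0] for task, counts in wins.items()}
```

Tasks with no successful plan are absent from the result, which is one natural trigger for requesting a human demonstration.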

4. Governance and Collaboration

EAGER principles require agents to function not as isolated automata but as socially embedded entities: sharing intent, mimicking peer actions, negotiating, and requesting help (Marron et al., 2020). In multi-agent LLM systems, aggregation of independently detected violations (union over GPT-4, Claude, DeepSeek) raises coverage of code deficiencies above 75% while sustaining high precision (Kolhatkar et al., 14 Aug 2025).
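The union aggregation can be sketched as a merge of per-model findings that also records which models reported each violation. The `Violation` tuple layout and the sample findings are hypothetical; the model names are used only as dictionary keys.

```python
from typing import Dict, Set, Tuple

# Hypothetical violation identity: (file, line, principle).
Violation = Tuple[str, int, str]


def aggregate(reports: Dict[str, Set[Violation]]) -> Dict[Violation, Set[str]]:
    """Union the violations from all models, tracking who reported each one."""
    merged: Dict[Violation, Set[str]] = {}
    for model, violations in reports.items():
        for violation in violations:
            merged.setdefault(violation, set()).add(model)
    return merged


# Illustrative findings, not data from the cited study.
REPORTS = {
    "GPT-4": {("cart.py", 10, "SRP"), ("cart.py", 42, "DRY")},
    "Claude": {("cart.py", 10, "SRP")},
    "DeepSeek": {("order.py", 7, "LoD")},
}
```

Keeping the reporting set per violation allows a later step to weight findings by agreement, for example by treating violations reported by a single model with more caution.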

Governance protocols include broadcasting state and intent, mimicking peers when local knowledge is insufficient, and participating in cooperative negotiation and error recovery. This leverages collective intelligence, mitigates local knowledge gaps, and provides a robust fallback during partial failures (Marron et al., 2020).

Notably, emergent system-level trade-offs arise: privacy risks and messaging overhead from continual broadcasting, and the challenge of verifying emergent coordination behaviors at design time.

5. Exploration and Proactive Stress-Testing

Deliberate exploration of the "unexpected" is essential for uncovering latent vulnerabilities. Techniques include simulation-based stress-testing, adversarial scenario injection, and red-team exercises (Marron et al., 2020). Web agents employ observation denoising and view selection to expose planning failures caused by overlarge or irrelevant inputs (Abuelsaad et al., 17 Jul 2024).

Mental modeling and environment-swapping, that is, transposing behaviors or threat models from one context (factory floor) to another (hospital) or from rare to common contexts, systematically expand the tested operational envelope, surfacing specification gaps and unmodeled "long tail" phenomena.
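Environment-swapping can be approximated mechanically by pairing every known behavior with every known environment and keeping the pairs not yet tested. This is a minimal sketch with hypothetical names and sample data; it enumerates candidate scenarios and does not simulate them.

```python
from itertools import product
from typing import Iterable, List, Set, Tuple

Scenario = Tuple[str, str]  # (behavior, environment)


def swapped_scenarios(behaviors: Iterable[str],
                      environments: Iterable[str],
                      already_tested: Set[Scenario]) -> List[Scenario]:
    """List behavior/environment pairs that have not been exercised yet."""
    return [
        (behavior, environment)
        for behavior, environment in product(behaviors, environments)
        if (behavior, environment) not in already_tested
    ]


# Illustrative inputs only.
BEHAVIORS = ["deliver_parts", "avoid_obstacle"]
ENVIRONMENTS = ["factory_floor", "hospital"]
TESTED = {("deliver_parts", "factory_floor"), ("avoid_obstacle", "factory_floor")}
```

The enumeration still depends on which behaviors and environments the designer lists, so it inherits the designer-bias constraint rather than removing it.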

Constraints include designer bias in enumerating test scenarios and the inherent limits of simulation fidelity.

6. Rigorous Foundations, Scoring Functions, and Ontological Reasoning

A formal theory of the unexpected is central to the EAGER ethos. Rigorous ontologies, property-centric classifications, and socio-technical modeling ground the empirical and operational aspects of EAGER in a principled way (Marron et al., 2020). For example, unclassifiable entities encountered in operation—such as a new physical object class—are handled by triggering fallback protocols rather than attempting dangerous action.

Scoring and evaluation functions further provide quantitative rigor. In LLM-driven code critique, standard information-retrieval metrics—precision, recall, F₁—are used to evaluate detection of design principle violations relative to a human-annotated "ground truth" (Kolhatkar et al., 14 Aug 2025). Empirically, structured prompt schemas and explicit principle enumeration boost both recall and precision.
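For reference, these metrics follow directly from set overlap between detected and annotated violations. The sketch below uses the standard definitions; the function name and sample identifiers are hypothetical.

```python
from typing import Hashable, Set, Tuple


def precision_recall_f1(predicted: Set[Hashable],
                        truth: Set[Hashable]) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 for detections against ground truth."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative identifiers: four detections, three annotated violations, two shared.
PREDICTED = {"v1", "v2", "v3", "v4"}
TRUTH = {"v1", "v2", "v5"}
```

In this example precision is 2/4, recall is 2/3, and F1 is their harmonic mean, 4/7.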

7. Trade-Offs, Limitations, and Future Directions

The EAGER approach imposes a significant implementation and verification burden:

Modular skills and observation denoising require ongoing hyperparameter tuning to balance brevity with coverage (Abuelsaad et al., 17 Jul 2024).
Social and governance mechanisms introduce privacy considerations and may result in non-deterministic, difficult-to-verify emergent behaviors (Marron et al., 2020).
Formal ontologies can lag domain evolution and require costly curation (Marron et al., 2020), while human-in-the-loop self-improvement incurs instructional overhead (Abuelsaad et al., 17 Jul 2024).
In automated code review, current knowledge bases are limited (≈10 principles) and out-of-context hallucinations persist, necessitating hybrid static analysis and expanded RAG corpora for broader domain coverage (Kolhatkar et al., 14 Aug 2025).

Future work outlined in these references includes integration with static analyzers, adaptive feedback personalization, IDE embedding for real-time interaction, longitudinal impact studies, and continual refinement of ontological and workflow repositories.

EAGER Design Principles thus provide a cohesive, theoretically grounded, and empirically validated foundation for designing agentic systems and automated analytic platforms to efficiently, adaptively, and robustly handle both the expected and the fundamentally unexpected in complex software and autonomous environments (Abuelsaad et al., 17 Jul 2024, Marron et al., 2020, Kolhatkar et al., 14 Aug 2025).