Autogenesis Protocol (AGP)
- Autogenesis Protocol (AGP) is a self-evolving agent framework that decouples resource evolution from optimization using standardized, auditable interfaces.
- It comprises two layers: RSPL, which registers and manages agent components, and SEPL, which orchestrates closed-loop evolution via formal operator algebra.
- Empirical evaluations with the Autogenesis System (AGS) show significant improvements in long-horizon planning and tool usage, validating its modular update and rollback mechanisms.
The Autogenesis Protocol (AGP) is a self-evolving agent protocol designed to address the deficiencies of prior connectivity-centric agent frameworks by introducing standardized, auditable interfaces for resource lifecycle, versioning, and closed-loop self-modification. AGP enables modular, safe evolution of agent components (prompts, agents, tools, environments, and memory) by decoupling the optimization logic from the resources undergoing change and formalizing their interactions via an operator algebra over versioned registries. This protocol underpins the Autogenesis System (AGS), a multi-agent architecture empirically validated on long-horizon planning and heterogeneous tool-use tasks, demonstrating improved robustness and flexibility relative to baselines. AGP is structured in two layers: the Resource Substrate Protocol Layer (RSPL), which abstracts agent components as protocol-registered resources, and the Self-Evolution Protocol Layer (SEPL), which orchestrates closed-loop evolution with explicit lineage and rollback mechanisms (Zhang, 16 Apr 2026).
1. Motivation and Conceptual Foundations
AGP was developed in response to critical shortcomings in existing agent protocols such as Google’s A2A and Anthropic’s MCP. These protocols primarily specify connectivity (model-to-tool or inter-agent invocation) while neglecting explicit management of component state, lifecycle (creation, update, deprecation), versioning, and safe interfaces for self-modification. Without structured lifecycle and version primitives, systems drift toward monolithic, brittle glue code, and errors introduced during ad-hoc evolution cannot be recovered. AGP’s design objective is to decouple “what evolves” (the actual resources, such as prompts or tools) from “how evolution occurs” (the optimization logic), which it achieves through layered protocol abstraction and modular interface design. This supports robust composition, safe updates, and transparent auditability within evolving agent ecosystems (Zhang, 16 Apr 2026).
2. Resource Substrate Protocol Layer (RSPL)
RSPL models five first-class resource types as global protocol-registered entities:
- Prompt: natural language instruction templates.
- Agent: decision policies, frequently implemented as LLM-driven controllers.
- Tool: actuation interfaces, encapsulating native scripts, MCP tools, or agent skills.
- Environment: simulators or world-state dynamics.
- Memory: persistent state stores, including structured databases and log stores.
Each resource is formally defined by a typed registration record with five fields: a descriptor tuple comprising the unique name, description, I/O contract, evolvability flag, and metadata; a semantic version string; an implementation descriptor; the constructor parameters; and the exported schemas. Lifecycle operators include creation, retrieval, update (with atomic diff and version bump), deprecation, restore (rollback), and registry listing. All transitions are explicitly recorded, ensuring traceability and reversibility. The protocol leverages type-specific registries and global context managers, exposing server interfaces through which agents access current or historical resources (Zhang, 16 Apr 2026).
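The registration record described above can be pictured as a small typed structure. The following is a minimal sketch only: the class and field names (`ResourceDescriptor`, `RegistrationRecord`, `io_contract`, and so on) are illustrative assumptions, not identifiers taken from AGP itself.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class ResourceDescriptor:
    """Descriptor tuple: name, description, I/O contract, evolvability flag, metadata."""
    name: str
    description: str
    io_contract: dict[str, str]          # hypothetical shape: {"input": ..., "output": ...}
    evolvable: bool = True
    metadata: dict[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class RegistrationRecord:
    """One registry entry; field names are assumptions made for illustration."""
    resource_type: str                   # "prompt", "agent", "tool", "environment", or "memory"
    descriptor: ResourceDescriptor
    version: str                         # semantic version string
    implementation: str                  # implementation descriptor, e.g. an import path
    constructor_params: dict[str, Any] = field(default_factory=dict)
    exported_schemas: dict[str, Any] = field(default_factory=dict)


record = RegistrationRecord(
    resource_type="prompt",
    descriptor=ResourceDescriptor(
        name="solver_prompt",
        description="Instruction template for a math-solving agent",
        io_contract={"input": "question: str", "output": "answer: str"},
    ),
    version="1.0.0",
    implementation="templates/solver_prompt.txt",  # hypothetical path
)
print(record.descriptor.name, record.version)
```

Freezing the records is a design choice in this sketch: an update then has to produce a new record rather than mutate an old one, which matches the versioned, traceable transitions the text describes.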
RSPL Resource Lifecycle Operators
| Operator | Input(s) | Effect |
|---|---|---|
| register | name, impl, desc | Create resource, version = 1.0.0 |
| get | name, version? | Retrieve specific instance handle |
| update | name, patch | Atomic update, version bump, returns diff |
| deprecate | name, version? | Mark instance as inactive |
| restore | name, version | Rollback to snapshot |
| list | type = T | Return names and versions |
These semantics enable agents and tools to be dynamically instantiated, hot-swapped, and evolved within well-specified, version-managed contracts.
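The operator table can be made concrete with a toy in-memory registry. This is a hedged sketch rather than the AGS implementation: the `Registry` and `Snapshot` classes, the minor-version bump policy, and the name `list_versions` (standing in for the table's `list`) are all assumptions made for illustration.

```python
import copy
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Snapshot:
    version: str
    impl: dict[str, Any]
    active: bool = True


def bump_minor(version: str) -> str:
    """Assumed policy: every update bumps the minor component."""
    major, minor, _patch = (int(part) for part in version.split("."))
    return f"{major}.{minor + 1}.0"


class Registry:
    """Toy in-memory registry for a single resource type."""

    def __init__(self) -> None:
        self._history: dict[str, list[Snapshot]] = {}
        self._current: dict[str, str] = {}
        self._desc: dict[str, str] = {}
        self.log: list[tuple[str, str, str]] = []  # (operator, name, version)

    def register(self, name: str, impl: dict[str, Any], desc: str = "") -> str:
        if name in self._history:
            raise KeyError(f"{name!r} is already registered")
        self._history[name] = [Snapshot("1.0.0", copy.deepcopy(impl))]
        self._current[name] = "1.0.0"
        self._desc[name] = desc
        self.log.append(("register", name, "1.0.0"))
        return "1.0.0"

    def get(self, name: str, version: Optional[str] = None) -> Snapshot:
        wanted = version or self._current[name]
        for snap in self._history[name]:
            if snap.version == wanted:
                return snap
        raise KeyError(f"{name!r} has no version {wanted}")

    def update(self, name: str, patch: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
        old = self.get(name)
        new_impl = {**copy.deepcopy(old.impl), **patch}
        diff = {k: (old.impl.get(k), v) for k, v in patch.items() if old.impl.get(k) != v}
        # Bump from the newest snapshot so an update after a restore never reuses a version.
        version = bump_minor(self._history[name][-1].version)
        self._history[name].append(Snapshot(version, new_impl))
        self._current[name] = version
        self.log.append(("update", name, version))
        return diff

    def deprecate(self, name: str, version: Optional[str] = None) -> None:
        snap = self.get(name, version)
        snap.active = False
        self.log.append(("deprecate", name, snap.version))

    def restore(self, name: str, version: str) -> None:
        snap = self.get(name, version)
        snap.active = True
        self._current[name] = version
        self.log.append(("restore", name, version))

    def list_versions(self) -> dict[str, list[str]]:
        return {n: [s.version for s in snaps] for n, snaps in self._history.items()}


registry = Registry()
registry.register("solver_prompt", {"template": "Solve: {question}"})
diff = registry.update("solver_prompt", {"template": "Solve step by step: {question}"})
registry.restore("solver_prompt", "1.0.0")  # roll back to the original snapshot
print(diff, registry.list_versions())
```

Keeping every snapshot and appending each operator call to `log` is what makes rollback and auditing cheap in this sketch; a production registry would presumably persist both.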
3. Self-Evolution Protocol Layer (SEPL)
SEPL formalizes evolution as a closed-loop operator algebra over evolvable resources and execution artifacts. The space of evolvable variables spans all RSPL resources and execution artifacts, each tagged with an evolvable indicator. The trainable subspace collects the elements whose indicator marks them as evolvable.
SEPL defines five atomic operators:
- Propose: extract hypotheses about failure or improvement from execution traces.
- Select: generate concrete modification primitives (e.g., prompt edits).
- Improve: apply edits via RSPL, yielding provisional candidates.
- Assess: evaluate candidates against task objectives and safety constraints.
- Commit: accept or roll back based on evaluation results.
Formally, the protocol chains these operators into a comprehensive evolution loop that tracks each proposal, modification, evaluation, acceptance, or rollback (Zhang, 16 Apr 2026).
All updates are versioned with explicit operator signatures, traces, and diffs, supporting global auditability and recovery.
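The five operators can be sketched as plain functions chained into a loop. Everything below is an illustrative assumption rather than the paper's formalization: the toy resource is a prompt string, the objective counts required instructions, and the function signatures and the `Candidate` type are hypothetical.

```python
from dataclasses import dataclass

# Toy task objective (an assumption): the prompt should contain both instructions.
REQUIRED = ("Show your work.", "State the units.")


@dataclass
class Candidate:
    text: str
    score: float = 0.0


def objective(text: str) -> float:
    return float(sum(phrase in text for phrase in REQUIRED))


def run_and_trace(text: str) -> list[str]:
    """Stand-in for execution: report which required instructions are missing."""
    return [phrase for phrase in REQUIRED if phrase not in text]


def propose(trace: list[str]) -> list[str]:
    """Propose: turn trace entries into hypotheses about the failure."""
    return [f"missing instruction: {item}" for item in trace]


def select(hypotheses: list[str]) -> list[str]:
    """Select: map each hypothesis to a concrete edit (here, text to append)."""
    return [h.split(": ", 1)[1] for h in hypotheses]


def improve(current: Candidate, edits: list[str]) -> list[Candidate]:
    """Improve: apply each edit, yielding provisional candidates."""
    return [Candidate(f"{current.text} {edit}") for edit in edits]


def assess(candidates: list[Candidate]) -> list[Candidate]:
    """Assess: score every candidate against the objective."""
    for cand in candidates:
        cand.score = objective(cand.text)
    return candidates


def commit(current: Candidate, candidates: list[Candidate]) -> Candidate:
    """Commit: accept the best candidate only if it improves; otherwise keep current."""
    best = max(candidates, key=lambda c: c.score, default=current)
    return best if best.score > current.score else current


def evolve(prompt: str, max_rounds: int = 5):
    current = Candidate(prompt, objective(prompt))
    lineage = []  # (round, accepted?, resulting text)
    for round_no in range(max_rounds):
        trace = run_and_trace(current.text)
        if not trace:
            break  # nothing left to fix
        candidates = assess(improve(current, select(propose(trace))))
        accepted = commit(current, candidates)
        lineage.append((round_no, accepted is not current, accepted.text))
        current = accepted
    return current, lineage


final, lineage = evolve("Solve the problem.")
print(final.text, final.score, len(lineage))
```

In this sketch a rollback is simply the case where `commit` returns the unchanged candidate; in AGP the same decision would be recorded through the RSPL restore and version machinery.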
4. Integration of RSPL and SEPL
At runtime, agents access prompts, tools, or other resources via the RSPL server interface; missing or stale resources are dynamically registered or updated by context managers. SEPL operators orchestrate mutation and testing of these resources through standard RSPL interfaces, guaranteeing versioned, reversible state transitions. AGP therefore renders the self-modification and evolution process protocol-governed, rather than ad hoc.
A canonical example is prompt evolution: starting with an initial prompt resource, the system executes, reflects on observed failures (e.g., missing solution coverage), proposes concrete prompt edits, applies these as new resource versions, evaluates outcome metrics, and commits or rolls back, with the process and versions fully logged (Zhang, 16 Apr 2026).
5. Autogenesis System (AGS): Implementation in Practice
AGS instantiates AGP within a full-stack, multi-agent environment. Primary architectural components include:
- Agent Bus: a centralized publish/subscribe bus for inter-agent coordination.
- Orchestrator Agent: decomposes tasks and manages plans, which are tracked as versioned resources.
- Sub-Agents: dynamically orchestrated workers such as a deep researcher, a browser agent, and a tool-calling agent.
- RSPL Services: context managers and server APIs for all resource types.
- Key Infrastructure: Model Manager (LLM abstraction), Version Manager (semantic versioning, rollback), Dynamic Manager (resource hot-swap), and Tracer Module (execution trace capture).
The workflow begins with user task ingestion and plan registration, proceeds through distributed sub-agent execution with traces and results logged to Memory, and triggers SEPL-driven evolution loops when errors or suboptimal results are detected. Evolution may proceed in parallel with standard execution, and successful resource improvements are immediately propagated via the versioned registry (Zhang, 16 Apr 2026).
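The publish/subscribe coordination in this workflow can be illustrated with a tiny synchronous bus. This is a sketch under stated assumptions: the `AgentBus` class, the topic names (`task`, `result`, `evolve`), and the `flaky_scraper` tool are hypothetical, and real evolution loops may run in parallel rather than inline as they do here.

```python
from collections import defaultdict
from typing import Any, Callable


class AgentBus:
    """Minimal synchronous publish/subscribe bus (illustrative only)."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Any], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: Any) -> None:
        # Copy the handler list so handlers may subscribe while we iterate.
        for handler in list(self._subscribers[topic]):
            handler(message)


bus = AgentBus()
memory: list[dict] = []          # stands in for the Memory resource
evolution_queue: list[str] = []  # tools flagged for a SEPL evolution loop


def sub_agent(task: dict) -> None:
    """Execute one plan step; the hypothetical 'flaky_scraper' tool always fails."""
    ok = task["tool"] != "flaky_scraper"
    bus.publish("result", {"step": task["step"], "tool": task["tool"], "ok": ok})


def record_result(result: dict) -> None:
    """Log every result to memory and request evolution on failure."""
    memory.append(result)
    if not result["ok"]:
        bus.publish("evolve", result["tool"])


bus.subscribe("task", sub_agent)
bus.subscribe("result", record_result)
bus.subscribe("evolve", evolution_queue.append)

# The orchestrator's plan: two steps dispatched over the bus.
plan = [{"step": 1, "tool": "search"}, {"step": 2, "tool": "flaky_scraper"}]
for task in plan:
    bus.publish("task", task)

print(len(memory), evolution_queue)
```

Routing failures through a dedicated topic keeps the executing agents unaware of the evolution machinery, which is one way to realize the decoupling the protocol aims for.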
6. Empirical Evaluation and Results
AGS was evaluated on benchmarks emphasizing long-horizon reasoning, planning, and heterogeneous tool use:
- GPQA-Diamond: graduate-level STEM MCQs (198 questions).
- AIME24/25: challenging mathematical problem sets.
- GAIA Test: 300 multi-step real-world planning and tool-use tasks.
- LeetCode Coding: 100 algorithmic problems in five programming languages.
Key findings include:
- On AIME24, weak models (gpt-4.1) achieved up to +71% improvement using evolved prompts/solutions, while strong models (g3-flash) attained +7%.
- On GAIA, baseline success of 79.1% improved to 89.04% with tool evolution (a relative gain of about 12.6%). For the most challenging tier, gains reached +33.3%.
- LeetCode pass@1 rates improved by 10–26% and runtime was reduced by 8–46%. C++ runtime competitiveness versus human submissions increased by 30%.
- Combined evolution of prompt and solution resources outperformed single-strategy interventions.
- Evolution strategies implementing SEPL (e.g., reflection, TextGrad, GRPO), each instantiated as a protocol-governed operator sequence, showed complementary strengths (Zhang, 16 Apr 2026).
Qualitative ablations demonstrate the protocol’s effectiveness in evolving brittle tools into robust modules. For example, a web scraping tool was autonomously upgraded from fragile heuristics to DOM-traversal logic in response to failure traces.
7. Limitations, Scalability, and Future Directions
Potential failure modes of AGP include operator misdiagnosis (erroneous reflection that leads to non-constructive edits), overfitting to narrow task domains, and registry bloat from unrestrained tool synthesis. Scalability is sensitive to two costs: context manager overhead, which grows with resource cardinality and demands efficient indexing and retrieval (e.g., via approximate nearest-neighbor search), and trace storage, which may require sampling or pruning.
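One generic way to bound trace storage is reservoir sampling, which keeps a fixed-size uniform sample of an unbounded stream. The source does not say which sampling or pruning scheme AGS uses, so the sketch below is an assumption offered only to illustrate the idea; `TraceReservoir` is a hypothetical name.

```python
import random
from typing import Any, Optional


class TraceReservoir:
    """Fixed-capacity uniform sample of a trace stream (reservoir sampling, Algorithm R)."""

    def __init__(self, capacity: int, seed: Optional[int] = None) -> None:
        self.capacity = capacity
        self.seen = 0                      # total traces observed so far
        self.samples: list[Any] = []
        self._rng = random.Random(seed)

    def add(self, trace: Any) -> None:
        self.seen += 1
        if len(self.samples) < self.capacity:
            self.samples.append(trace)     # fill the reservoir first
            return
        # Replace a stored trace with probability capacity / seen.
        slot = self._rng.randrange(self.seen)
        if slot < self.capacity:
            self.samples[slot] = trace


reservoir = TraceReservoir(capacity=50, seed=0)
for step in range(1000):
    reservoir.add({"step": step, "status": "ok"})

print(reservoir.seen, len(reservoir.samples))
```

Uniform sampling treats all traces alike; a system that cares more about failures might instead keep every failing trace and sample only the successful ones.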
AGS assumes that LLM-driven reflection is reliable enough to guide evolution and that task objectives are well specified; poor reward specification can degrade performance. Future directions include automated learning-to-rank for hypothesis selection, cross-task transfer via resource generalization, expansion of RSPL entity types to include structural objects and multi-modal policies, hybrid optimizer instantiation within the SEPL loop, and decentralized AGP protocols for peer-to-peer agent evolution without central coordination (Zhang, 16 Apr 2026).
By formalizing resource evolution through protocol-level abstraction, AGP provides robust, auditable, and modular self-evolution for complex agentic systems, enabling a shift from manual prompt engineering towards systematic protocol engineering.