IACT: A Self-Organizing Recursive Model for General AI Agents: A Technical White Paper on the Architecture Behind kragent.ai (2512.02605v1)

Published 2 Dec 2025 in cs.AI, cs.MA, and cs.SE

Abstract: This technical white paper introduces the Interactive Agents Call Tree (IACT), a computational model designed to address the limitations of static, hard-coded agent workflows. Unlike traditional systems that require pre-defined graphs or specialized programming, IACT operates as a general-purpose autonomous system driven purely by user dialogue. Given a high-level objective, the system autonomously grows a dynamic, recursive agent topology incrementally tailored to the problem's structure. This allows it to scale its organizational complexity to match open-ended tasks. To mitigate the error propagation inherent in unidirectional function calls, IACT introduces interactional redundancy by replacing rigid invocations with bidirectional, stateful dialogues. This mechanism enables runtime error correction and ambiguity resolution. We describe the architecture, design principles, and practical lessons behind the production deployment of this model in the kragent.ai system, presenting qualitative evidence from real-world workflows rather than exhaustive benchmark results.

Summary

  • The paper introduces a recursive call tree that enables self-organizing, adaptive agent workflows for LLM-based systems.
  • It details dynamic context construction and state isolation techniques that mitigate token degradation and enhance inference efficiency.
  • Qualitative case studies show robust performance in multi-phase engineering and research workflows, laying a foundation for scalable automation.

Interactive Agents Call Tree (IACT): Autonomous Recursive Agent Architecture for General AI

Architectural Motivation and Core Design Principles

Traditional LLM-based agent systems are constrained by static workflow topologies and brittle unidirectional pipelines. IACT introduces a paradigm centered on dynamic, recursive agent topologies that emerge organically as execution unfolds. Central design principles include:

  • Dynamic Context Construction: Instead of static system prompts, IACT employs dynamic instruction injection, shaping execution through event-driven context modifications. This supports environmental grounding, exception handling via direct error injection, and runtime context management.
  • LLM-Centric Control Flow: The locus of workflow control resides within the LLM agent, which autonomously determines agent instantiation, invocation, vertical communication, and tool utilization. The architecture offers infrastructural elements and global principles, but the agent decides execution details, yielding an adaptive topology responsive to problem complexity.
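Dynamic context construction can be sketched as a context that is rebuilt each turn from live, event-driven sources rather than a fixed system prompt. All names here (`ContextBuilder`, `inject_event`, the notification format) are illustrative assumptions, not the paper's actual API:

```python
class ContextBuilder:
    """Sketch of dynamic instruction injection: the prompt is assembled
    fresh each turn from base instructions, history, and runtime events."""

    def __init__(self, base_instructions):
        self.base = base_instructions
        self.events = []    # runtime notifications (errors, env changes)
        self.history = []   # dialogue / execution history

    def inject_event(self, source, message):
        # Event-driven context modification, e.g. direct error injection.
        self.events.append(f"[{source}] {message}")

    def record_turn(self, role, content):
        self.history.append((role, content))

    def build(self):
        # Assemble the prompt for the next LLM call from live state.
        parts = [self.base]
        parts += [f"{role}: {content}" for role, content in self.history]
        if self.events:
            parts.append("System notifications:")
            parts += self.events
            self.events.clear()  # events are consumed once injected
        return "\n".join(parts)

ctx = ContextBuilder("You are a sub-agent solving one delegated task.")
ctx.record_turn("parent", "Deploy the service to staging.")
ctx.inject_event("tool:deploy", "Error: port 8080 already in use.")
prompt = ctx.build()
```

The key design point is that exception handling becomes a context edit rather than a control-flow branch: the error surfaces to the agent as a notification on the next turn.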

Macro-Architecture: The Recursive Agent Call Tree

Topology Derivation and Rationale

IACT adopts a tree-based decomposition, rejecting cyclic graphs and linear chains due to their inability to effectively manage context coherence and enable parallelizable, hierarchical problem breakdown. Each agent in the tree operates with isolated state (context window), mirroring function encapsulation. The parent agent orchestrates subtree coherence by maintaining a "single source of truth," with instructions propagating downward and status reports upward.
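The tree structure described above can be sketched minimally: each node owns an isolated context (its own window), instructions flow down via child creation, and status reports flow up to the parent, which keeps the single source of truth. The class and method names are illustrative assumptions:

```python
class AgentNode:
    """One agent in the recursive call tree, with isolated state
    mirroring function encapsulation."""

    def __init__(self, task, parent=None):
        self.task = task
        self.parent = parent
        self.context = []    # isolated state: this agent's own context window
        self.children = []

    def spawn_child(self, subtask):
        # Instructions propagate downward via dynamic child instantiation.
        child = AgentNode(subtask, parent=self)
        self.children.append(child)
        return child

    def report_up(self, summary):
        # Status reports propagate upward; the parent maintains the
        # "single source of truth" for its subtree.
        if self.parent is not None:
            self.parent.context.append(("report", self.task, summary))

root = AgentNode("build web app")
frontend = root.spawn_child("implement frontend")
frontend.report_up("frontend done: 3 pages, build passing")
```

Note that the child's raw working context never enters the parent; only the report does, which is what keeps each window bounded as the tree grows.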

Stateful Dialogue and Interactional Redundancy

Unlike traditional function trees, IACT elevates calls to persistent, bidirectional dialogues, creating interactional redundancy. This supports runtime error correction and ambiguity resolution. The architecture enforces:

  • Cohesive Decomposition: Tasks are split into self-contained modules, minimizing inter-agent dependency and complexity.
  • Exclusive Ownership: Persistent resources are ideally modified by one agent per task cycle.
  • Relevant State Synchronization: Parents actively synchronize context deviations through targeted dialogues.
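Interactional redundancy can be illustrated by contrasting a one-shot call with a persistent dialogue: the child may return a clarifying question instead of a result, and the call does not complete until ambiguity is resolved. The ambiguity check and message shapes below are toy assumptions:

```python
def child_agent(transcript):
    """Child may return a result or ask back; a rigid function call
    would have to guess or fail on ambiguous input."""
    parent_text = " ".join(msg for role, msg in transcript if role == "parent")
    if "eu-west" not in parent_text and "us-east" not in parent_text:
        # Ambiguity detected: ask the caller instead of guessing.
        return ("question", "Which region: eu-west or us-east?")
    region = "eu-west" if "eu-west" in parent_text else "us-east"
    return ("result", f"deployed to {region}")

def parent_call(task, answer_policy):
    """Stateful, bidirectional dialogue: the invocation persists until
    the child produces a result."""
    transcript = [("parent", task)]
    while True:
        kind, msg = child_agent(transcript)
        transcript.append(("child", msg))
        if kind == "result":
            return msg, transcript
        transcript.append(("parent", answer_policy(msg)))

result, log = parent_call("deploy the service", lambda q: "use eu-west")
```

The same loop also supports runtime error correction: a child that hits a failure can report it as a question-like turn and receive revised instructions instead of propagating the error upward unchecked.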

Context Isolation, Efficiency, and Memory Hierarchy

IACT’s recursive structure enforces strict context isolation, so each agent operates within an optimal context window, reducing reasoning degradation ("Lost in the Middle" phenomenon) and maximizing inference efficiency. Higher-level nodes retain distilled summaries while lower-level agents manage raw details, forming a hierarchical memory that extends total system context far beyond the bounds of a single agent.
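The hierarchical-memory effect can be shown numerically with a toy distillation step; a real system would use an LLM to compress the child's transcript, whereas here the "summary" is just the final line:

```python
def summarize(detail_lines, limit=1):
    # Toy distillation: keep only the last `limit` lines.
    # A production system would ask an LLM to compress the transcript.
    return detail_lines[-limit:]

parent_memory = []
for subtask in ["parse logs", "train model", "write report"]:
    # Each child holds its own raw detail in an isolated context...
    child_detail = [f"{subtask}: step {i}" for i in range(100)]
    # ...while the parent retains only the distilled summary.
    parent_memory.extend(summarize(child_detail))
```

The parent retains 3 lines instead of 300: aggregated across a deep tree, this is what lets total system context exceed any single agent's window while every individual window stays near its optimal size.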

Sequential and Parallel Execution Models

Current deployment uses sequential agent execution, with parents suspending control when awaiting child completion. The design is compatible with future parallel extensions, allowing asynchronous agent spawning and tool integration for scalable I/O processing.

Advanced Control Flow Patterns

  • Vertical Escalation: Leaf agents invoke human/user intervention for high-level decisions.
  • Dynamic Sibling Delegation: Agents request specific information, prompting parent-dispatched sibling instantiation.
  • Lazy Evaluation: Agents generate and transmit large outputs in manageable segments, supporting arbitrarily long outputs without exceeding any single context window.
  • Iterative Re-entrancy: Trees persist beyond session boundaries, supporting targeted refinement or correction through subsequent user intervention.
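The lazy-evaluation pattern above amounts to streaming a large artifact in bounded segments; a minimal sketch (chunk size and generator interface are assumptions):

```python
def segmented(text, chunk=1000):
    """Lazy evaluation: yield a large artifact in bounded segments so no
    single message must fit the whole output in one context window."""
    for i in range(0, len(text), chunk):
        yield text[i:i + chunk]

# The consumer (e.g. a parent agent or tool) pulls one segment at a time,
# so producer and consumer each only ever hold `chunk` characters.
chunks = list(segmented("x" * 2500, chunk=1000))
```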

Micro-Architecture: Agent Node Design

Perception-Action Loop and Hybrid Interpreter

Agents operate as recurrent processes with dynamic context construction integrating static prompts, execution history, input signals, and real-time system notifications. The LLM generates a response, which is parsed and executed by the Hybrid Language Interpreter, driving internal (variable definition, context compression, agent invocation) and external (tool execution) actions.
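The interpreter's role can be sketched as routing directives parsed from the LLM's response to internal or external handlers. The `@verb argument` directive syntax and the verb sets below are illustrative assumptions, not the paper's actual protocol:

```python
import re

def interpret(response):
    """Hypothetical hybrid interpreter: route each directive to an
    internal action (variable definition, context compression, agent
    invocation) or an external one (tool execution)."""
    internal_verbs = {"set", "invoke", "compress"}
    actions = []
    for line in response.splitlines():
        m = re.match(r"@(\w+)\s+(.*)", line.strip())
        if not m:
            continue  # narration lines carry no actions
        verb, arg = m.groups()
        kind = "internal" if verb in internal_verbs else "external"
        actions.append((kind, verb, arg))
    return actions

acts = interpret("@set build_dir /tmp/out\n@run make all\nSome narration.")
```

The loop then executes the actions, folds their results back into the context, and re-invokes the LLM, closing the perception-action cycle.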

Symbolic Variable Mechanism and Just-in-Time Tool Synthesis

Structured data passing leverages symbolic variables for efficient communication, supporting multimodal content exchange and distributed data availability. Tool integration is managed via pattern-based dynamic loading, runtime extension through RPC design, and new module synthesis, ensuring actions adapt to evolving requirements and minimize context pollution.
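The symbolic-variable idea can be illustrated with a store that hands out short handles in place of large payloads; the `$name` handle syntax and class names are assumptions for illustration:

```python
class VariableStore:
    """Symbolic variables: agents exchange short handles like $report_v1
    while the payload stays out of the prompt entirely."""

    def __init__(self):
        self._data = {}

    def put(self, name, value):
        self._data[name] = value
        return f"${name}"

    def resolve(self, handle):
        return self._data[handle.lstrip("$")]

store = VariableStore()
handle = store.put("report_v1", "x" * 10_000)  # large text/multimodal blob
# Inter-agent messages stay small; the blob never enters a context window.
message_to_child = f"Summarize {handle} and return key findings."
```

A child dereferences the handle only when a tool actually needs the bytes, which is what keeps token cost decoupled from payload size.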

State-Machine-Based Tool Paradigm and KV Cache Optimization

Tools are modeled as state machines, exposing only relevant actions at each execution state, reducing cognitive and hallucination load when interacting with the LLM. Context structure is optimized for KV cache efficiency, and context compression is triggered dynamically for sustained operational performance.
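A state-machine tool can be sketched as a transition table that exposes only the actions legal in the current state; the states and actions below are illustrative, not the paper's exact set:

```python
class BrowserTool:
    """Tool as a state machine: only currently-valid actions are exposed,
    shrinking the action space the LLM must reason over."""

    TRANSITIONS = {
        "closed": {"open": "idle"},
        "idle":   {"goto": "loaded", "close": "closed"},
        "loaded": {"click": "loaded", "extract": "loaded", "close": "closed"},
    }

    def __init__(self):
        self.state = "closed"

    def available_actions(self):
        # Only these actions are surfaced to the LLM at this step.
        return sorted(self.TRANSITIONS[self.state])

    def apply(self, action):
        if action not in self.TRANSITIONS[self.state]:
            raise ValueError(f"{action!r} not valid in state {self.state!r}")
        self.state = self.TRANSITIONS[self.state][action]

tool = BrowserTool()
tool.apply("open")
```

Because the model never sees an action it cannot legally take, hallucinated or out-of-order tool calls are structurally excluded rather than merely discouraged.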

Unified Agent Communication and Observability

Extended Markdown Protocol

IACT standardizes communication via Extended Markdown, supporting polymorphic multimodal data embedding. Message interpretation adapts to user frontends or model sensory capabilities, decoupling information logic from agent modality. This protocol ensures seamless collaboration between text-only and multimodal agents.
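The modality decoupling can be sketched by rendering one message differently per recipient capability: a standard Markdown image embed passes through for a multimodal agent but collapses to its alt text for a text-only one. The `var://` reference scheme is an assumption for illustration:

```python
import re

def render(message, modality):
    """Hypothetical Extended-Markdown handling: interpretation adapts to
    the recipient's sensory capabilities, not the sender's."""
    def repl(m):
        alt, ref = m.group(1), m.group(2)
        if modality == "multimodal":
            return f"![{alt}]({ref})"      # pass the embed through intact
        return f"[image: {alt}]"           # degrade gracefully to alt text
    return re.sub(r"!\[([^\]]*)\]\(([^)]+)\)", repl, message)

msg = "Build failed, see ![error screenshot](var://shot_1)."
text_view = render(msg, "text")
```

One message format thus serves text-only and multimodal agents alike, which is the decoupling of information logic from agent modality described above.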

Observability and Interactive Supervision

Sequential execution results in human-readable, reconstructible logs, facilitating observability and auditability. Users may intervene at any execution depth through direct message injection, shifting debugging from code-modification to natural language correction of cognitive misalignments.

Global Associative Memory Integration

To mitigate leaf-node tunnel vision, IACT incorporates a global associative memory ("Hippocampus")—a vector database for long-term project knowledge. During execution, semantic retrieval injects relevant global facts, goals, and preferences into local context, balancing isolation with coherence across deep agent hierarchies.
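The retrieval step can be sketched with a toy bag-of-words similarity standing in for a real vector database; the memory entries and scoring are illustrative assumptions:

```python
from collections import Counter
import math

def embed(text):
    # Toy bag-of-words "embedding"; a real system would use a vector model.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

memory = [
    "Project goal: migrate the billing service to Postgres.",
    "User preference: all reports in British English.",
    "Deploy target is the eu-west staging cluster.",
]

def retrieve(query, k=1):
    """Semantic retrieval: inject the most relevant global facts into a
    leaf agent's local context to counter tunnel vision."""
    scored = sorted(memory, key=lambda m: cosine(embed(query), embed(m)),
                    reverse=True)
    return scored[:k]

facts = retrieve("where should I deploy the service?")
```

Retrieved facts are injected into the leaf's context at execution time, so deep nodes stay aligned with project-wide goals without inheriting the full ancestral context.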

System Implementation, Security, and Operational Lessons

Deployment follows a distributed multi-process model with strict runtime isolation: core IACT logic is separated from externally executed tool modules. Security is enforced through sandboxing and careful secret management; keyless operation patterns, such as supplying credentials via environment variables or ephemeral web apps, keep secrets out of LLM contexts.

Production usage has revealed practical challenges such as multi-party dialogue parsing friction, behavioral passivity under uncertainty, and associative memory bottlenecks. The architecture, however, demonstrates efficient token economy, leveraging context isolation and symbolic variable references to minimize computational overhead for complex, long-horizon tasks.

Empirical Validation and Compatibility

Case studies highlight IACT's proficiency in multi-phase engineering (autonomous app deployment) and research workflows, operating effectively with a spectrum of LLM models (from open-source to proprietary). Notably, raw model competence is insufficient; models overly fine-tuned for rigid protocols may become incompatible with IACT’s text-to-action schemes.

Implications and Theoretical Prospects

IACT reconceptualizes agentic system architecture for general AI, discarding developer-scheduled flows for architectures that self-organize recursively according to live problem requirements. This unlocks complexity scaling, robust error correction, and organizational depth for real-world automation.

  • Practical Implications: Enables automated software engineering, data science, and research workflow orchestration without manual scripting.
  • Theoretical Prospects: Offers a scalable scaffold for integrating heterogeneous models and toolchains, blending probabilistic reasoning with robust, observable execution.
  • Future Directions: Parallel agent execution, more sophisticated global memory consolidation, and model optimization for multi-party active querying and dialogue distinction.

Conclusion

The Interactive Agents Call Tree establishes a foundational architecture for autonomous, adaptive agent systems pursuing general AI objectives. By leveraging recursive, stateful dialogues and dynamic agent topologies, it combines efficiency, observability, and robustness, demonstrating viable autonomous workflows today across diverse domains. IACT’s methodological shift from static workflows to emergent agent organizations marks a significant advance in aligning LLM-based systems with the demands of complex, real-world tasks.

For detailed implementation and experimentation, the system is accessible via kragent.ai, supporting live deployments in engineering and academic domains as a research-first execution environment.

