
Recursive Delegation Engine (RDE)

Updated 4 January 2026
  • Recursive Delegation Engine (RDE) is a framework that decomposes complex tasks by recursively delegating work to agents, using utility-based decision criteria.
  • It leverages algorithms such as epsilon-greedy, UCB, and Thompson Sampling to optimize agent selection in dynamic, multi-agent environments.
  • The system supports live configuration of decomposition criteria, detailed runtime logging, and replay tools for empirical analysis and tuning.

A Recursive Delegation Engine (RDE) provides infrastructure for breaking complex tasks into hierarchies of subtasks, creating agent chains dynamically through recursive delegation, and optimizing agent selection and orchestration. In large-scale multi-agent systems, including those powered by LLMs, RDEs support problem decomposition, modular tool use, and adaptive routing of work through recursive calls, and they apply learning-theoretic techniques, especially multi-armed bandit algorithms, to agent trust and policy optimization. RDEs also support zero-shot delegation policies, live configuration of decomposition and termination criteria, and instrumentation of run-time events for post-mortem analysis (Zhu et al., 2024; Oren, 2023).

1. Foundational Concepts and Mathematical Formulation

Recursive delegation occurs when an agent, upon receiving a task, must dynamically decide whether to execute, decompose, or delegate the task to another agent, potentially forming deep chains of delegation. The delegation topology is formalized as a directed graph $G=(A,E)$, where $A$ is the set of agents and $E$ encodes wiring: an edge $(a\rightarrow b)$ indicates that agent $a$ can delegate to $b$ (Oren, 2023).

A task executed by a leaf node returns a binary reward $r \in \{0,1\}$, and this outcome is propagated backward: for each agent in the delegation chain, counters $succ(a)$ and $fail(a)$ are updated. Recursive agent selection leverages utility computations that propagate recursively through the delegation tree:

  • Recursive $\epsilon$-greedy: mixes a random selection over available agents with a maximization step, using propagated success probabilities.
  • Recursive UCB/Beta-UCB: accumulates statistics over nodes and uses propagation of confidence-adjusted means.
  • Recursive Thompson Sampling: samples posteriors for leaf nodes, propagating maximal sampled results upwards.
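The backward propagation of a binary reward along a delegation chain can be sketched as follows. This is a minimal illustration, not the implementation from Oren (2023): the agent names, the `EDGES` dictionary, and the `propagate_reward` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical delegation graph G = (A, E): agent -> agents it may delegate to.
EDGES = {
    "root": ["planner", "coder"],
    "planner": ["coder"],
    "coder": [],  # leaf: executes tasks itself
}

succ = defaultdict(int)  # succ(a): successful outcomes observed for agent a
fail = defaultdict(int)  # fail(a): failed outcomes observed for agent a


def propagate_reward(chain, reward):
    """Update the counters of every agent on `chain` with a binary reward."""
    if reward not in (0, 1):
        raise ValueError("reward must be 0 or 1")
    # Every hop in the chain must follow an edge of the delegation graph.
    for a, b in zip(chain, chain[1:]):
        if b not in EDGES.get(a, []):
            raise ValueError(f"{a} cannot delegate to {b}")
    for agent in chain:
        if reward == 1:
            succ[agent] += 1
        else:
            fail[agent] += 1


propagate_reward(["root", "planner", "coder"], 1)  # leaf succeeded
propagate_reward(["root", "coder"], 0)             # leaf failed
```

After these two calls, `root` and `coder` each hold one success and one failure, while `planner` holds a single success.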

RDEs for LLM-driven systems formalize agent decisions over direct execution versus decomposition/delegation using a utility criterion such as $U(\text{instr}) = \alpha(1-c)+\beta\frac{L}{L_{max}}$, driving whether to decompose instructions based on confidence and context window constraints (Zhu et al., 2024).
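A short sketch of this criterion in Python may help. It assumes that $c$ is a model confidence in $[0,1]$, that $L$ and $L_{max}$ are the instruction length and the context limit in tokens, and that a higher utility favors decomposition once it exceeds a threshold. The function names, default weights, and threshold value are illustrative assumptions, not values from Zhu et al. (2024).

```python
def decomposition_utility(confidence, length, max_length, alpha=0.5, beta=0.5):
    """U(instr) = alpha * (1 - c) + beta * L / L_max."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    return alpha * (1.0 - confidence) + beta * (length / max_length)


def should_decompose(confidence, length, max_length, threshold=0.5, **weights):
    """Assumed rule: decompose when the utility exceeds the threshold."""
    utility = decomposition_utility(confidence, length, max_length, **weights)
    return utility > threshold


# A short instruction the model is confident about is handled directly.
easy = should_decompose(confidence=0.9, length=500, max_length=8000)
# A long instruction with low confidence is decomposed.
hard = should_decompose(confidence=0.3, length=6000, max_length=8000)
```

Under these assumed weights, low confidence and long instructions both push the utility up, so `easy` is `False` and `hard` is `True`.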

2. System Architecture and Data Flow

An RDE orchestrates multi-agent recursion through the following components (Zhu et al., 2024):

| Component | Main responsibilities | Interactions |
|---|---|---|
| User Interface | Receives and visualizes queries, run logs | Streams events, shows delegation graph |
| Task Manager / Root Agent | Initiates RDE, sets global configs | Spawns root agent, maintains recursion depth |
| Delegation Controller | Manages delegation schemes (DelegateOne/Wait) | Exposed as a tool to agents, manages spawning |
| Agent Objects | Encapsulate LLM + tools, parent/child pointers | Holds chat buffers, interacts with controller |
| Tool Interfaces | Modular Python APIs for function-use | Invoked by agents via AI function calls |
| Logging Module | Event bus for system state, token counts, comms | Serializes events to JSONL, supports replay |
| Web UI / Replay Engine | Interactive and historical visualization | Reconstructs runs, inspects per-agent history |

At runtime, the root agent receives a query and decides whether to handle it directly or decompose it. Each subtask triggers the creation of a child agent, which makes the same decision recursively. Results propagate upward, and every step is logged for replay and analysis.
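The control flow described above can be sketched as a recursive function. This is a toy model under stated assumptions: splitting on the word "and" stands in for an LLM-driven decomposition, `done(...)` stands in for direct execution, and the `run_agent` name and its event fields are hypothetical.

```python
def run_agent(task, depth=0, max_depth=3, log=None):
    """Handle `task` directly or split it and recurse into child agents."""
    log = [] if log is None else log
    log.append({"event": "spawn", "depth": depth, "task": task})

    parts = task.split(" and ")  # stand-in for a model-driven decomposition
    if depth >= max_depth or len(parts) == 1:
        result = f"done({task})"  # stand-in for direct execution
    else:
        # Each subtask gets its own child agent, which repeats the decision.
        children = [run_agent(p, depth + 1, max_depth, log)[0] for p in parts]
        result = " + ".join(children)  # results propagate upward

    log.append({"event": "result", "depth": depth, "result": result})
    return result, log


result, log = run_agent("fetch data and clean data")
```

Here the root spawns two children, each of which executes its subtask directly, and the shared `log` list records a spawn and a result event for every agent.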

3. Algorithmic Basis for Recursive Agent Selection

RDEs employ recursive, delegation-aware multi-armed bandit (MAB) rules to optimize agent choices. The modifications to standard MAB algorithms are as follows (Oren, 2023):

  • $\epsilon$-Greedy Utility Propagation: At each agent $a$, the utility $U_\epsilon(a,\text{chain})$ is calculated recursively:

$$U_\epsilon(a,\text{chain}) = \epsilon \cdot \frac{1}{|S|} \sum_{c \in S} U_\epsilon(c, \text{chain} \cup \{a\}) + (1-\epsilon) \cdot \max_{c \in S} U_\epsilon(c, \text{chain} \cup \{a\})$$

where $S$ is the set of available neighbors minus the current chain.

  • UCB/Beta-UCB Propagation: For each agent in the chain:

$$U_{ucb}(a, \text{chain}) = \max_{c \in S} U_{ucb}(c, \text{chain} \cup \{a\})$$

Leaf nodes compute their mean success rate plus an exploration bonus, with Beta-UCB using the standard deviation of the Beta posterior.

  • Thompson Sampling: For each task, sample $\theta(a)$ from the Beta posterior at leaves and propagate the maximal sample upwards.
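The three rules share one recursion and differ only in how a leaf is scored and how child utilities are combined, which the following sketch tries to make explicit. Several details are assumptions rather than statements from Oren (2023): the Beta(1, 1) prior, the treatment of an agent with no unvisited neighbors as a leaf, the exclusion of the agent itself from its candidates, and all function and variable names.

```python
import math
import random
from functools import partial


def leaf_mean(s, f):
    """Posterior mean success rate under an assumed Beta(1, 1) prior."""
    return (s + 1) / (s + f + 2)


def leaf_beta_ucb(s, f, k=1.0):
    """Posterior mean plus k standard deviations of Beta(s + 1, f + 1)."""
    a, b = s + 1, f + 1
    std = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return a / (a + b) + k * std


def leaf_thompson(s, f, rng=random):
    """One draw theta(a) from the Beta(s + 1, f + 1) posterior."""
    return rng.betavariate(s + 1, f + 1)


def mix_epsilon(values, epsilon=0.1):
    """Epsilon-weighted average of the child utilities plus their maximum."""
    return epsilon * sum(values) / len(values) + (1 - epsilon) * max(values)


def utility(agent, chain, edges, stats, leaf, combine=max):
    """Recursive utility U(agent, chain) over the delegation graph."""
    visited = chain | {agent}
    # S: neighbors not already on the chain (the agent itself is also excluded).
    candidates = [c for c in edges.get(agent, ()) if c not in visited]
    if not candidates:
        s, f = stats.get(agent, (0, 0))  # (succ, fail) counters
        return leaf(s, f)
    values = [utility(c, visited, edges, stats, leaf, combine) for c in candidates]
    return combine(values)


# Hypothetical graph and counters: root -> {a, b}, a -> c.
EDGES = {"root": ["a", "b"], "a": ["c"], "b": [], "c": []}
STATS = {"b": (2, 8), "c": (8, 2)}
start = frozenset()

u_eps = utility("root", start, EDGES, STATS, leaf_mean, partial(mix_epsilon, epsilon=0.1))
u_ucb = utility("root", start, EDGES, STATS, partial(leaf_beta_ucb, k=1.0))
u_ts = utility("root", start, EDGES, STATS, partial(leaf_thompson, rng=random.Random(0)))
```

To choose a delegate, an agent would evaluate `utility` for each of its candidates and pick the largest value. For Thompson Sampling the values are fresh posterior draws on every task, so the choice varies from call to call.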

These policies are shown to outperform flat (non-recursive) MAB rules when the delegation topology is deep or highly branching.

4. Dynamic Decomposition and Termination Criteria in LLM Systems

In LLM-based recursive delegation, the decomposition process is model-driven. Agents decide at inference whether to execute, pare down instructions, or spawn sub-agents. Key elements include (Zhu et al., 2024):

  • Utility Calculation for Decomposition: Agents compute $U(\text{instr})$ as a linear combination of model confidence and instruction length penalty.
  • Configurable Depth and Termination: Depth limit $D_{max}$ prevents infinite recursion and termination hooks permit domain-specific overrides.
  • Delegation Schemes:
    • DelegateOne blocks until child completion.
    • DelegateWait allows parallel non-blocking child execution.

Custom delegation controllers and tool registries can be plugged in modularly, and concurrency parameters are adjustable for workload management.
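One way to picture the two delegation schemes is with `asyncio`. The sketch below is an interpretation, not the controller from Zhu et al. (2024): `child_agent` is a stand-in for a real agent, and the `delegate_one`, `delegate_wait`, and `max_concurrency` names are hypothetical.

```python
import asyncio


async def child_agent(task):
    """Stand-in for a child agent; the sleep mimics LLM and tool latency."""
    await asyncio.sleep(0.01)
    return f"done({task})"


async def delegate_one(task):
    """Blocking scheme: spawn one child and wait for its result."""
    return await child_agent(task)


async def delegate_wait(tasks, max_concurrency=2):
    """Parallel scheme: start all children, then collect their results."""
    limit = asyncio.Semaphore(max_concurrency)  # adjustable concurrency cap

    async def bounded(task):
        async with limit:
            return await child_agent(task)

    # gather returns results in the order the tasks were submitted.
    return await asyncio.gather(*(bounded(t) for t in tasks))


single = asyncio.run(delegate_one("summarize"))
results = asyncio.run(delegate_wait(["a", "b", "c"]))
```

The semaphore illustrates how a concurrency parameter could bound the number of children running at once without changing the order in which results are returned.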

5. Instrumentation, Replay, and Empirical Evaluation

RDEs are instrumented with comprehensive event logging (Zhu et al., 2024):

  • Every agent spawn, state transition, token count, chat message, and tool call emits a discrete event on the event bus.
  • Events are serialized into a JSONL trace for post-hoc analysis (e.g., per-agent token consumption, delegation graph evolution).
  • The Web UI allows interactive replay—stepping through each event, inspecting agent states, and visualizing the full delegation tree.
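The logging and replay loop can be sketched with the standard library alone. The event fields (`type`, `agent`, `tokens`) and the helper names below are assumptions chosen for illustration; the source does not specify the trace schema.

```python
import io
import json
from collections import Counter


def emit(stream, event_type, agent_id, **payload):
    """Append one event to the trace as a single JSON line."""
    record = {"type": event_type, "agent": agent_id, **payload}
    stream.write(json.dumps(record) + "\n")


def replay(stream):
    """Yield events back in the order they were written."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def tokens_per_agent(events):
    """Post-hoc analysis: total token consumption per agent."""
    totals = Counter()
    for event in events:
        totals[event["agent"]] += event.get("tokens", 0)
    return dict(totals)


trace = io.StringIO()  # an in-memory stand-in for a .jsonl file
emit(trace, "spawn", "root")
emit(trace, "message", "root", tokens=120)
emit(trace, "spawn", "child-1", parent="root")
emit(trace, "tool_call", "child-1", tool="search", tokens=45)

trace.seek(0)
usage = tokens_per_agent(replay(trace))
```

Because each line is an independent JSON object, a replay tool can step through the trace one event at a time and rebuild per-agent state incrementally.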

Empirical evaluations in synthetic delegation graphs (random binomial, scale-free) show that recursive delegation-aware MAB variants (Thompson Sampling, Beta-UCB) consistently outperform baseline policies, especially in deeper graphs (Oren, 2023):

  • Thompson Sampling is robust across topologies.
  • Beta-UCB performs well when exploration parameters are carefully tuned.
  • Classic flat MAB is sufficient for shallow graphs but degrades as recursion depth increases.

6. Limitations and Open Research Problems

Observed failure modes include overcommitment (agents exceed model context), undercommitment (trivial tasks delegated until depth cutoff), and challenges in tuning decomposition thresholds $\tau_{\text{decompose}}$ and depth limits $D_{max}$ (Zhu et al., 2024). Empirical results show that classic UCB can severely over-explore in deep delegation trees unless its constant is carefully set (Oren, 2023).

Open questions include:

  • Regret bounds for recursive MAB over delegation graphs with depth $D$ and branching $B$ remain unproven (Oren, 2023).
  • Extending schemes to non-binary, cost-aware payoffs, partial observability of the delegation topology, non-truthful agent reporting, and reputation integration.
  • Adaptive delegation policies using reinforcement learning (over $U(\cdot)$ thresholds), human-in-the-loop integration, and cost/latency-aware scheduling (Zhu et al., 2024).

Taken together, these limitations and open problems suggest that a fully functional RDE must balance recursive learning, live configuration, and robust instrumentation, while using recursive utility propagation to improve both algorithmic performance and interpretability.
