
Dynamic Task Decomposition

Updated 29 November 2025
  • Dynamic task decomposition is an adaptive process that partitions complex objectives into context-sensitive subtasks using execution feedback.
  • It is applied in LLM multi-agent systems, MPC, and web automation to enhance real-time planning and mitigate error propagation.
  • Empirical results show improvements in sample efficiency, faster task completion, and robust performance across various autonomous applications.

Dynamic task decomposition refers to the process of adaptively partitioning a complex objective into structured, context-sensitive subtasks during execution, rather than performing a static, a priori decomposition. This paradigm appears across LLM agent frameworks, model predictive control (MPC), signal temporal logic (STL) in multi-agent systems, cooperative reinforcement learning, and web automation. Dynamic decomposition enhances adaptability and sample efficiency, mitigates error propagation, and supports real-time plan revision in response to environmental feedback or execution failures. It is now foundational for autonomous agents and hierarchical controllers in domains where task structure, tool requirements, or solution pathways cannot be fully anticipated in advance.

1. Formal Foundations and Problem Setting

Dynamic task decomposition extends classical, static task breakdown by making the decomposition operator itself conditional on upstream execution results, environmental feedback, and the evolving agent context. Formally, given a high-level objective $T$ and a history of partial results $r_{1:i-1}$, a dynamic decomposition operator outputs updated subtasks $t_i'$:

$$t_i' = \mathcal{U}(t_i;\, r_1, \ldots, r_{i-1})$$

where $\mathcal{U}$ is an update operator informed by prior execution. This mechanism appears in LLM-based multi-agent systems as adaptive assignment and content revision of subtasks (Wang et al., 15 Feb 2024), in MPC as experience-driven initialization for new task orderings (Vallon et al., 2019), and in STL-based multi-agent planning as communication-consistent, conflict-free decomposition (Marchesini et al., 16 Oct 2024, Marchesini et al., 27 Feb 2024). Dynamic decomposition thus formalizes sequential, feedback-driven subtasking tightly integrated with execution phases.
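As a deliberately simplified illustration, the operator $\mathcal{U}$ can be treated as a pure function from a pending subtask and the execution history to a revised subtask. The sketch below is hypothetical: `Subtask`, `update`, and the failure-note heuristic are illustrative stand-ins for what, in an LLM agent system, would be a model call conditioned on history.

```python
from dataclasses import dataclass, field

@dataclass
class Subtask:
    goal: str
    notes: list = field(default_factory=list)

def update(subtask, history):
    """Toy update operator U: revise a pending subtask using prior results.

    The 'revision' here just attaches constraints discovered upstream; a real
    system would regenerate the subtask conditioned on the full history.
    """
    revised = Subtask(subtask.goal, list(subtask.notes))
    for result in history:
        if result.get("failed"):
            revised.notes.append(f"avoid: {result['reason']}")
    return revised

pending = Subtask("book hotel")
history = [{"failed": True, "reason": "budget exceeded"}]
print(update(pending, history).notes)  # ['avoid: budget exceeded']
```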

Context-sensitive decomposition can be represented as a function $\varphi: (Q, C) \rightarrow G$ mapping a user query $Q$ and context $C$ to a dynamically constructed task graph $G = (V, E)$, where nodes encapsulate subtasks and edges encode dependencies (Gabriel et al., 29 Oct 2024). Real-time operations may include subgraph augmentation, node splitting/merging, and on-the-fly tool re-assignment as new execution data arrives.
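A minimal sketch of such a task graph with dependency edges and runtime node splitting follows; all class and method names are hypothetical, not taken from the cited work.

```python
class TaskGraph:
    def __init__(self):
        self.deps = {}  # node -> set of prerequisite nodes

    def add(self, node, deps=()):
        self.deps[node] = set(deps)

    def split(self, node, parts):
        """Replace `node` with finer-grained subtasks, preserving edges."""
        preds = self.deps.pop(node)
        for child in parts:
            self.deps[child] = set(preds)
        # rewire any node that depended on the split node
        for other, d in self.deps.items():
            if node in d:
                d.remove(node)
                d.update(parts)

def build(query, context):
    """Hypothetical phi(Q, C): seed a coarse graph, refine it at runtime."""
    g = TaskGraph()
    g.add("gather")
    g.add("answer", deps=["gather"])
    return g

g = build("compare flight prices", context={})
g.split("gather", ["search_site_a", "search_site_b"])  # runtime refinement
print(sorted(g.deps["answer"]))  # ['search_site_a', 'search_site_b']
```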

2. Algorithmic Realizations across Domains

Dynamic task decomposition manifests in distinct yet structurally analogous forms across several application areas:

  • LLM Multi-Agent Systems: In frameworks such as TDAG, a main agent decomposes $T$ into initial subtasks, spawns per-subtask LLM subagents, and updates downstream subtasks (and generates new subagents) in response to failures or novel findings. This pipeline includes skill retrieval, message-passing, error-driven subtask revision, and continuous knowledge base expansion. Subagents are generated dynamically with tools and APIs refined per subtask; execution results inform further decomposition (Wang et al., 15 Feb 2024).

```
(t1, t2, ..., t_n) ← MainAgent.decompose(T)
for each subtask t_i:
    if prior_failure:
        for each future subtask t_j:
            t_j ← MainAgent.update(t_j; previous_results)
    SubAgent_i ← AgentGenerator.create_agent(t_i)
    r_i ← SubAgent_i.execute(...)
```
Subagent summaries are added to a skill library if sufficiently novel.

  • Model Predictive Control (MPC): Dynamic decomposition appears in “Task Decomposition for Iterative Learning Model Predictive Control” and TDMPC for LTV systems (Vallon et al., 2019, Vallon et al., 2020). Given demonstrations of a task decomposed into subtasks, the system dynamically recomposes safe sets and control policies for new orderings by verifying local controllability along transition states. Convex optimization is used to prune infeasible transitions, providing recursive feasibility without global re-planning.
  • LLM Inference Scaling: Methods such as DISC use dynamic partitioning of solution traces (reasoning or code) into increasingly fine-grained steps during inference. The partitioning criterion (priority metric) is computed via Q-values or Z-scores of rollout rewards; harder segments are recursively split and allocated greater sampling budget, until desired solution quality is reached (Light et al., 23 Feb 2025).
  • Agentic Toolchains and Web Automation: In agentic systems, dynamic task decomposition builds context-driven directed acyclic graphs (DAGs) where task structure evolves in response to execution traces and tool output. For web automation, as exemplified by WebDART, complex chores are split into navigation, extraction, and execution subtasks, with recalculated navigation and extraction plans as new webpage affordances (filters, shortcuts) are revealed through exploration (Yang et al., 8 Oct 2025, Gabriel et al., 29 Oct 2024).
  • Multi-Agent STL Planning: In distributed control, global STL specifications are dynamically re-partitioned over communication graphs. Decomposition occurs recursively along multi-hop communication paths, using convex programming to maximize feasible regions and exclude conflicting conjunctions. These modifications ensure communication-consistent, satisfiable task graphs for decentralized feedback law assignment (Marchesini et al., 16 Oct 2024, Marchesini et al., 27 Feb 2024).
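The DISC-style recursive splitting described above can be sketched as follows. The priority metric, the reuse of stubbed rollout rewards, and all names are simplifying assumptions for illustration, not the published algorithm.

```python
import statistics

def priority(rewards):
    """Hardness proxy: low mean reward (scaled by spread) => split first.
    A negative z-like score of the segment mean (hypothetical metric)."""
    mu = statistics.mean(rewards)
    sd = statistics.pstdev(rewards) or 1.0
    return -mu / sd

def refine(segments, rollouts, budget):
    """Recursively split the hardest segment, one budget unit per split.

    `segments` maps segment id -> list of step ids;
    `rollouts` maps segment id -> sampled rewards (re-sampling is stubbed).
    """
    while budget > 0:
        splittable = [s for s in segments if len(segments[s]) > 1]
        if not splittable:
            break
        hardest = max(splittable, key=lambda s: priority(rollouts[s]))
        steps = segments.pop(hardest)
        mid = len(steps) // 2
        segments[hardest + "L"] = steps[:mid]
        segments[hardest + "R"] = steps[mid:]
        rollouts[hardest + "L"] = rollouts[hardest]  # stub: reuse rewards
        rollouts[hardest + "R"] = rollouts[hardest]
        budget -= 1
    return segments

segs = {"proof": [1, 2, 3, 4]}
rolls = {"proof": [0.1, 0.2, 0.1]}
print(sorted(refine(segs, rolls, budget=1)))  # ['proofL', 'proofR']
```

In the actual method, each child segment would be re-rolled-out to obtain fresh rewards before the next split decision.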

3. Dynamic Decomposition: Evaluation and Empirical Results

Dynamic task decomposition yields measurable improvements across several benchmarks and modalities:

  • Multi-Step Agent Task Performance: In the TDAG evaluation on ItineraryBench (travel-planning), the dynamic approach delivers the highest average total score (49.08/100), outperforming ReAct (43.02), zero-shot P&S (43.68), static P&E (42.85), and ADAPT (44.74). Ablation confirms both dynamic decomposition and per-subtask agent generation as essential—removing either decreases mean performance by ≈2.5–3 points (Wang et al., 15 Feb 2024).
  • Sample Efficiency in Control: In TDMPC for MPC, initializing from decomposed task data achieves ≈30% faster first-lap performance and 50% fewer trials to reach local optima in both autonomous racing and robotic manipulation than non-decomposed approaches (Vallon et al., 2019, Vallon et al., 2020).
  • LLM Solution Quality: DISC achieves error reductions of 5% (APPS), 6.7% (MATH500), and 10.5% (LiveCodeBench) versus fixed or token-level splitting baselines, for equivalent sample budgets (Light et al., 23 Feb 2025).
  • Parallel and Sequential Agent Systems: Dynamic DAG construction for tool-augmented agents improves answer quality and tool-use precision, with structural metrics (SSI) most predictive in sequential tasks and Tool F1 in parallel decompositions; statistical correlation is confirmed at p < 0.001 (Gabriel et al., 29 Oct 2024).
  • Web Automation: Dynamic re-planning in WebDART lifts WebChoreArena success rates by 7.7 to 13.7 percentage points over static baselines, and reduces navigation actions by up to 14.7 steps. The adaptive pipeline preserves SOTA on simpler web benchmarks (Yang et al., 8 Oct 2025).

4. Communication Consistency and Decentralized Decomposition

Dynamic decomposition in multi-agent and STL contexts focuses on reconciling global task graphs with communication-topology constraints. For tasks defined over agent pairs lacking direct communication, the technique recursively decomposes predicates and temporal logic formulas onto 1-hop communication paths. The approach uses convex optimization to maximize the volume of predicates’ super-level sets while ensuring set inclusion (via Minkowski sums), conflict-avoidance, and communication consistency. Solutions are decentralizable via dual decomposition or ADMM over communication-edge variables, guaranteeing satisfaction of the original task if the subproblems are feasible (Marchesini et al., 16 Oct 2024, Marchesini et al., 27 Feb 2024).

This mechanism generalizes to dynamic adaptation under topology changes: as the communication graph changes, the system automatically re-decomposes the task, ensuring persistent realizability and distributed controllability.
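As a toy analogue of the edge-variable decomposition, the sketch below runs consensus ADMM for two agents that share a single scalar edge variable. The quadratic objectives stand in for the STL feasibility subproblems and are purely illustrative.

```python
def admm_consensus(a, b, rho=1.0, iters=50):
    """Toy ADMM consensus for two agents minimizing (x-a)^2 + (x-b)^2.

    Each agent keeps a local copy (x1, x2) of the shared edge variable and
    exchanges only that variable with its neighbor, mirroring dual
    decomposition / ADMM over communication-edge variables.
    """
    x1 = x2 = z = 0.0
    u1 = u2 = 0.0  # scaled dual variables
    for _ in range(iters):
        # local proximal updates (closed form for quadratic objectives)
        x1 = (2 * a + rho * (z - u1)) / (2 + rho)
        x2 = (2 * b + rho * (z - u2)) / (2 + rho)
        # consensus (edge) update and dual ascent
        z = (x1 + u1 + x2 + u2) / 2
        u1 += x1 - z
        u2 += x2 - z
    return z

print(round(admm_consensus(1.0, 3.0), 3))  # 2.0 (the consensus minimizer)
```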

5. Hierarchical and Multi-Agent Dynamic Decomposition in RL

In cooperative multi-agent reinforcement learning, dynamic task decomposition is realized via hierarchical policies: high-level agents dynamically assign subtasks per episode or execution interval, while low-level agents specialize to the dynamics of their assigned subtasks. Recent approaches (e.g., CD³T) learn a subtask space via a conditional diffusion model, clustering action embeddings into "effect classes," and use multi-head attention mixing networks for efficient joint value composition and credit assignment. These structures enable sample-efficient, robust coordination under partial observability, with empirically validated advantages over fixed subtasking and earlier MARL baselines (Zhu et al., 17 Nov 2025).
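A skeletal view of this hierarchical assignment loop is sketched below; the deterministic assigner and stub policies are illustrative placeholders, not the diffusion-based subtask discovery of the cited work.

```python
def high_level_assign(obs, n_agents, subtasks):
    """Hypothetical high-level policy: reassign subtasks each interval.

    A learned assigner would score (agent, subtask) pairs; here we rotate
    assignments deterministically from a cheap hash of the observation.
    """
    offset = sum(map(ord, obs)) % len(subtasks)
    return [subtasks[(i + offset) % len(subtasks)] for i in range(n_agents)]

def low_level_act(agent_id, subtask):
    """Stub low-level policy specialized per subtask."""
    return f"agent{agent_id}:{subtask}:act"

def run_interval(obs, n_agents=3, subtasks=("scout", "collect", "defend")):
    # high level: one assignment per execution interval
    assignment = high_level_assign(obs, n_agents, list(subtasks))
    # low level: each agent acts under its current subtask
    return [low_level_act(i, s) for i, s in enumerate(assignment)]

print(run_interval("t0", n_agents=3))  # one action per agent
```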

6. Adaptive Decomposition Selection and Cost-Performance Tradeoff

Dynamic task decomposition can include meta-level algorithms that select decomposition paradigms at runtime based on task properties, execution model strength, and cost constraints. The Select-Then-Decompose framework distinguishes selection, execution, and verification phases: a selector model chooses the decomposition method (implicit/explicit, first/interleaved, DAG/linear), the chosen method is executed, and a verifier checks confidence, iterating or falling back to another method if needed.

Across diverse benchmarks, Select-Then-Decompose lies consistently on the cost–performance Pareto frontier, achieving optimal or near-optimal trade-offs between token usage, number of API calls, and solution accuracy (Liu et al., 20 Oct 2025). This demonstrates that task-aware selection of the decomposition method, coupled with runtime verification and adaptive switching, resolves the performance–cost dilemma of any single fixed method and generalizes across problem types.
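One way to sketch the select-execute-verify loop is shown below; the method names, the confidence threshold, and the toy solvers are assumptions for illustration, not the framework's actual components.

```python
def select_then_decompose(task, methods, verify, max_rounds=3):
    """Hedged sketch of a select-execute-verify loop.

    `methods` is an ordered list of (name, solver) pairs, cheapest first;
    `verify` returns a confidence in [0, 1]. On low confidence we fall back
    to the next (typically more expensive) decomposition strategy.
    """
    for name, solver in methods[:max_rounds]:
        answer = solver(task)
        if verify(task, answer) >= 0.8:
            return name, answer
    return name, answer  # best effort after exhausting fallbacks

methods = [
    ("implicit", lambda t: t.upper()),              # no explicit decomposition
    ("linear", lambda t: " -> ".join(t.split())),   # stepwise plan
]
verify = lambda t, a: 1.0 if "->" in a else 0.0
print(select_then_decompose("plan trip", methods, verify))
# ('linear', 'plan -> trip')
```

In practice the selector and verifier would themselves be model calls; the ordered-fallback structure is what produces the cost–performance trade-off described above.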

7. Limitations and Open Directions

Current dynamic decomposition frameworks exhibit certain limitations:

  • Domain specificity: Benchmark and deployment contexts (e.g., ItineraryBench, AsyncHow) are often restricted to travel planning, web tasks, or specific MARL environments. Broad multi-domain evaluation remains underexplored.
  • Computational overhead: The dynamism of decomposition and agent/subpolicy instantiation introduces higher latency and computational cost than static pipelines (Wang et al., 15 Feb 2024).
  • Handling conflicting conjunctions: In STL settings, careful conflict exclusion is required to avoid unsatisfiable decompositions under communication and logic constraints (Marchesini et al., 16 Oct 2024, Marchesini et al., 27 Feb 2024).
  • Adaptivity granularity: Simple token-based splits may occasionally misalign with semantic task boundaries; integrating semantic or context-aware partitioning could further improve efficiency (Light et al., 23 Feb 2025).
  • Model selection policy: Though Select-Then-Decompose demonstrates strong empirical results, further research into learned adaptive selectors beyond prompt design is needed (Liu et al., 20 Oct 2025).

Future work directions include learning adaptive decomposition or splitting policies, merging with semantic-aware decomposition, integrating learned error-handling policies, extending to hierarchical meta-decomposition, and broadening large-scale, multi-domain evaluation. Open questions include theoretical analysis of the sample complexity and robustness properties of dynamic decomposition under various noise and feedback regimes.

