Hierarchical Prompt Expansion Agent
- Hierarchical Prompt Expansion Agent is a multi-layered system that decomposes complex tasks and refines LLM prompts using gradient-based updates.
- It leverages meta-prompt sharing and recursive communication between agent layers to achieve dynamic and efficient task optimization.
- Empirical results demonstrate significant improvements in complex task success rates, highlighting the critical role of hierarchical structure and prompt adaptation.
A Hierarchical Prompt Expansion Agent (HPEA) is a multi-layered artificial agent system that employs hierarchical decomposition and prompt optimization strategies to solve complex reasoning, planning, or perception tasks, typically leveraging LLMs as both decomposers and executors. The HPEA paradigm is characterized by explicit agent layering, dynamic prompt adaptation at both individual and collective levels, and integration with downstream classical planners, symbolic solvers, or perception modules. This architecture enables robust performance across applications such as multi-robot task planning, vision-language perception, multi-agent retrieval-augmented generation, and decision-making in strategic environments.
1. Hierarchical Structure and Decomposition
HPEA organizes agents in discrete layers, each responsible for increasingly granular task reasoning or execution. In the context of multi-robot task planning, the topmost "global" agent receives a user instruction and decomposes it into high-level subtasks. These subtasks are distributed to intermediate-layer agents, which further refine and allocate tasks, often based on robot capabilities or domain specificity. Leaf agents at the lowest layer translate subgoals into concrete Planning Domain Definition Language (PDDL) specifications, which are then executed by classical solvers (Kawabe et al., 25 Feb 2026).
This layered agent composition allows for recursive or parallelized task refinement—each agent operates on its own prompt, incorporates shared meta-prompts (representing collective domain knowledge), and communicates formatted subtasks or specifications to the next layer. The agent hierarchy is a direct realization of system-theoretic decomposition, facilitating both parallelizable handling of independent subtasks and sequential coordination for long-horizon, causally dependent objectives.
2. Prompt Optimization via Textual Gradients and Meta-Prompt Sharing
The agent's prompt text is conceptualized as an explicit parameter subject to gradient-based update, operationalized through the TextGrad framework. After executing a candidate plan, the agent receives failure signals as textual losses, prompting an update to its prompt via LLM-suggested edit operations such as insertion of missing conditions or clarifying constraints. The update follows:
where is the agent prompt, and specifies an LLM-guided textual gradient step (Kawabe et al., 25 Feb 2026).
To enhance efficiency and knowledge transfer, a shared meta-prompt is maintained at each layer. After agent-level updates, textual losses are aggregated, deduplicated, and normalized at the layer level, yielding a meta-gradient which is then used to update . This meta-prompt sharing allows individual agents to benefit from prompt refinements discovered by their layer peers, accelerating convergence and providing a mechanism for domain-wide prompt adaptation.
3. Formalization and Algorithmic Workflow
The HPEA framework is algorithmically structured as follows:
- User input is received by the top-layer agent(s), producing decomposed subgoals through LLM-driven analysis.
- Outputs flow downward: each agent receives a composite prompt—jointly the layer-shared meta-prompt, its individual prompt, and the task description.
- Each agent's output can be either a further refined subgoal or a directly executable specification (e.g., PDDL problem).
- After classical plan synthesis and validation, failures induce prompt updates at both the agent and meta-prompt levels via textual gradient optimization.
- This process iterates until all sub-plans succeed or a maximum retry threshold is reached.
Edges between agents strictly follow the hierarchical decomposition, maintaining a lightweight but flexible communication graph.
Empirical evaluation on the MAT-THOR benchmark demonstrates the HPEA's capability, with success rates of 0.95 for compound, 0.84 for complex, and 0.60 for vague tasks, consistently surpassing prior state-of-the-art planners such as LaMMA-P (Kawabe et al., 25 Feb 2026).
4. Ablation Studies and Component Contributions
Systematic ablation studies quantify the contribution of each architectural and optimization component to task success. Removing the hierarchical structure reduces complex-task success rate from 0.84 to 0.25, eliminating prompt optimization and meta-sharing drops it to 0.47, while removing only meta-sharing yields a more minor decrease to 0.80 (Kawabe et al., 25 Feb 2026).
| Variant | SR (Complex) | Δpp |
|---|---|---|
| Full (H+P+M) | 0.84 | — |
| –H | 0.25 | –59 |
| –(P, M) | 0.47 | –37 |
| –M | 0.80 | –4 |
This demonstrates that hierarchy is the critical determinant of system performance, while prompt optimization offers substantial, and meta-sharing modest, complementary gains.
5. Generalizations and Domain Adaptation
The HPEA paradigm generalizes to other domains where decomposable reasoning or perception is needed. In multi-agent retrieval augmented generation (RAG), a similar hierarchical arrangement is found in HERA, where the orchestrator adaptively samples multi-agent topologies and each agent prompt is refined via role-aware, dual-axis (operational and behavioral) adaptation. Network sparsity and compactness emerge naturally as the orchestrator accumulates experience and prunes redundant agents, with measured improvements in agent connectivity and solution efficiency (Li et al., 1 Apr 2026).
In vision-LLMs, hierarchical prompt strategies enable dynamic adjustment of prompt complexity depending on instance difficulty, using cross-level structured knowledge and gated attention (Wang et al., 2023). Hierarchical prompt structures also outperform flat or static prompt schemes in complex embodied domains (robotics, task and motion planning) (Źróbek et al., 8 May 2026), as well as in complex strategic games requiring tactical hierarchy (Li et al., 16 Feb 2025).
6. Empirical Impact and Benchmarks
On MAT-THOR, HPEA achieves best-in-class performance across compound, complex, and vague tasks, and ablation indicates the dominance of hierarchical decomposition and prompt optimization strategies (Kawabe et al., 25 Feb 2026). Across domains, such as multi-hop question answering, retrieval-augmented generation, and class-agnostic object detection, similar hierarchical prompt expansion strategies have yielded significant improvements—with, for example, HERA achieving a mean F1 improvement of +38.7% over SOTA RAG baselines on six benchmarks (Li et al., 1 Apr 2026).
Hierarchical prompt expansion continues to be a key design paradigm for advancing robust, interpretable, and scalable LLM-based agents in both reasoning and embodied domains.