
Multi-Plan Search Strategies

Updated 28 February 2026
  • Multi-plan search is a family of algorithmic methods that systematically construct and evaluate multiple candidate plans in uncertain, high-dimensional environments.
  • It leverages techniques like code grounding, hierarchical expansion, and Pareto front evaluation to balance trade-offs and enhance plan robustness.
  • In the cited studies, multi-plan search outperforms single-plan baselines on plan diversity and trade-off coverage, at an added computational cost that pruning and heuristic guidance keep manageable.

Multi-plan search encompasses a family of algorithmic approaches that, instead of seeking a single optimal or satisficing plan to solve a planning problem, systematically explore and construct a collection or set of candidate plans. These methods are fundamentally motivated by environments characterized by uncertainty, multiple objectives, and/or large combinatorial action spaces where single-path search is inadequate or inherently suboptimal. Multi-plan search naturally enables robustness, supports trade-off reasoning, and is critical in settings where plan evaluation requires sampling, ensemble construction, or context-sensitive adaptation.

1. Formal Problem Statements and Core Frameworks

The formalization of multi-plan search is context-dependent, encompassing diverse frameworks across code-use planning, decision-theoretic planning under uncertainty, and multi-objective path/search problems.

  • Repository-grounded Plan Search: MutaGReP conceptualizes the plan space $\mathcal{P}$ as sequences of natural-language steps, each step $x = (t, B)$ comprising an intent $t \in L$ and a code-symbol set $B \subseteq \mathcal{B}$ from a code repository. The successor function $s: P \rightarrow \mathcal{P}(P)$ generates plan mutations; the grounding function $g: L \rightarrow \mathcal{P}(\mathcal{B})$ anchors steps in relevant repository symbols. Optimization seeks the plan $p^* = \arg\max_{p \in \mathcal{C}} h(p)$, where $h$ captures fidelity and feasibility, and $\mathcal{C}$ is the set generated within expansion budget $B$ (Khan et al., 21 Feb 2025).
  • Planning under Uncertainty: U-Plan builds plans for each possible world (“P-state”) derived from a Dempster–Shafer interval representation of incomplete evidence. Plans are constructed over an abstraction hierarchy and scored via expected fulfillment: $\mathrm{EF}(a) = F(c) \cdot P(c \mid a, \mathrm{ps})$. Multiple plans are maintained and merged into a super-plan with knowledge acquisition for runtime branch selection (Mansell et al., 2013).
  • Multi-objective Path/Policy Search: In Pareto D* (D*-PO) and related algorithms, the state space is annotated with sets of cost-vectors. Multi-plan search requires computing and maintaining the Pareto front of non-dominated cost vectors at each node, representing the set of optimal paths w.r.t. all objectives (Lavin, 2015). In MOSSPs, the solution is the convex coverage set (CCS) of achievable cost vectors at the root, and multi-plan search seeks the set of policies realizing all points in CCS (Chen et al., 2023).
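
The repository-grounded formulation above can be made concrete with a small data-structure sketch. This is illustrative only and not MutaGReP's implementation: the `Step` class, the `symbol_coverage` scorer (a toy stand-in for $h$, which in the source captures fidelity and feasibility), and all symbol names such as `io.read_csv` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Step:
    """One plan step x = (t, B): an intent t and the code symbols B grounding it."""
    intent: str
    symbols: frozenset[str]


# A plan is a sequence of steps.
Plan = tuple[Step, ...]


def symbol_coverage(plan: Plan) -> int:
    """Toy stand-in for h (assumption): number of distinct symbols the plan uses."""
    covered: set[str] = set()
    for step in plan:
        covered |= step.symbols
    return len(covered)


def best_plan(candidates: Iterable[Plan], h: Callable[[Plan], float]) -> Plan:
    """p* = argmax of h(p) over the candidate set C."""
    return max(candidates, key=h)


# Hypothetical steps and repository symbols, for illustration only.
load = Step("load the dataset", frozenset({"io.read_csv"}))
clean = Step("drop missing rows", frozenset({"frame.dropna", "frame.filter"}))
plot = Step("plot the result", frozenset({"viz.line"}))

candidates = [(load,), (load, clean), (load, plot)]
chosen = best_plan(candidates, symbol_coverage)
```

Here `chosen` is the two-step plan `(load, clean)`, since it covers three distinct symbols. Any other scoring function with the same signature can be passed as `h`.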

2. Algorithms and Search Strategies

Algorithms for multi-plan search adapt both state expansion logic and candidate management to maintain plan diversity and optimize multiple criteria:

  • Neural Tree Search in Plan Space (MutaGReP): A best-first (or depth-first) tree search expands partial plans via LLM-based mutations and symbol grounding. The search frontier maintains a set of plan candidates with node-local scoring. Expansion alternates between monotonic (step-adding) and unconstrained (editing, rewriting) plan mutations (Khan et al., 21 Feb 2025).
  • Best-first Hierarchical Planning (U-Plan): U-Plan prioritizes expansion by expected fulfillment within a hierarchical AND/OR tree. Each abstraction level selects the operator maximizing EF and refines subgoals. Backups propagate EF between levels, supporting plan revision and re-evaluation in light of detailed tactical expansions (Mansell et al., 2013).
  • Pareto-optimal Path Search (D*-PO, MOSSP): The open set tracks labeled cost vectors; at each expansion, only non-dominated (Pareto-optimal) labels are propagated. Paths or policies are reconstructed from backpointer chains associated with these labels, yielding a set of candidate plans that realize all optimal trade-offs among objectives (Lavin, 2015, Chen et al., 2023).
  • Heuristic Multi-objective Methods (MOLAO*, MOLRTDP): These algorithms extend LAO* and LRTDP to maintain sets of value vectors via Bellman-style vector backups. Policy sets are updated with action support for all coverage-set points. Heuristics are constructed to be admissible w.r.t. vector domination (Chen et al., 2023).
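
As a rough sketch of the best-first strategy described above, the following loop keeps a scored frontier of plans, expands the most promising one, and stops after a fixed number of expansions. It is a generic skeleton under stated assumptions, not the code of any cited system: plans are plain tuples of step names, and the `successors` and `score` functions are toy placeholders for what would be LLM-based mutation and scoring in a planner such as MutaGReP.

```python
import heapq
from itertools import count
from typing import Callable, Iterable

Plan = tuple[str, ...]


def best_first_plan_search(
    root: Plan,
    successors: Callable[[Plan], Iterable[Plan]],
    score: Callable[[Plan], float],
    budget: int,
) -> list[Plan]:
    """Expand at most `budget` plans, best score first; return all plans seen, best first."""
    tie = count()  # tie-breaker so the heap never has to compare plans directly
    frontier = [(-score(root), next(tie), root)]  # max-heap via negated scores
    seen = {root}
    candidates = [root]
    expansions = 0
    while frontier and expansions < budget:
        _, _, plan = heapq.heappop(frontier)
        expansions += 1
        for child in successors(plan):
            if child in seen:
                continue
            seen.add(child)
            candidates.append(child)
            heapq.heappush(frontier, (-score(child), next(tie), child))
    return sorted(candidates, key=score, reverse=True)


# Toy problem (hypothetical): reward plans that contain the useful steps, penalize length.
STEP_POOL = ["parse", "validate", "train", "report"]
USEFUL = {"parse", "train"}


def toy_successors(plan: Plan) -> list[Plan]:
    """Monotonic expansion: append one step that is not yet in the plan."""
    return [plan + (s,) for s in STEP_POOL if s not in plan]


def toy_score(plan: Plan) -> float:
    return len(set(plan) & USEFUL) - 0.1 * len(plan)


ranked = best_first_plan_search((), toy_successors, toy_score, budget=3)
```

With a budget of three expansions the search visits ten plans in total and ranks `("parse", "train")` first. The full candidate list is returned rather than a single winner, which is what makes the procedure a multi-plan search.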

3. Plan Mutation, Diversity, and Scoring

Effective multi-plan search hinges on mechanisms for generating, scoring, and selecting among diverse candidate plans:

  • Mutation Operators: In MutaGReP, successors are generated either by monotonic extension (adding a step) or unconstrained mutation (insertion, deletion, reordering, rewriting of plan steps) using LLM prompt completions (Khan et al., 21 Feb 2025).
  • Grounding and Retrieval: Plan steps are grounded via retrieval of code symbols using embedding-based matching over synthetic intent sentences, ensuring compactness and relevance (Khan et al., 21 Feb 2025).
  • Scoring Functions: Scoring may target plan diversity (unique symbol coverage), LLM-based Likert judgments of plan fidelity and feasibility, or oracle overlap with reference solutions (Khan et al., 21 Feb 2025). In D*-PO and MOSSP, dominance relations and coverage-set representations (CCS or Pareto front) define solution quality, supporting explicit multi-objective trade-off extraction (Lavin, 2015, Chen et al., 2023).
  • Plan Reuse and Super-plan Construction: U-Plan attempts maximal plan reuse across P-states, merging valid/partial plans and introducing sensory knowledge acquisition only at divergence points to minimize redundancy and runtime ambiguity (Mansell et al., 2013).
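
The two mutation families can be illustrated with a deterministic sketch. In MutaGReP the successors come from LLM prompt completions; the enumeration below is only an assumption-laden stand-in that shows the shape of each edit on a plan represented as a tuple of step strings. The function names are hypothetical, and reordering is limited to adjacent swaps for brevity.

```python
Plan = tuple[str, ...]


def monotonic_extensions(plan: Plan, new_steps: list[str]) -> list[Plan]:
    """Monotonic mutation: only append a step; existing steps are left untouched."""
    return [plan + (step,) for step in new_steps]


def unconstrained_mutations(plan: Plan, new_step: str) -> list[Plan]:
    """Unconstrained mutation: insertion, deletion, reordering, and rewriting."""
    out: list[Plan] = []
    # Insertion: place the new step at every possible position.
    for i in range(len(plan) + 1):
        out.append(plan[:i] + (new_step,) + plan[i:])
    # Deletion: drop each step in turn.
    for i in range(len(plan)):
        out.append(plan[:i] + plan[i + 1:])
    # Reordering: swap each pair of adjacent steps.
    for i in range(len(plan) - 1):
        out.append(plan[:i] + (plan[i + 1], plan[i]) + plan[i + 2:])
    # Rewriting: replace each step with the new step.
    for i in range(len(plan)):
        out.append(plan[:i] + (new_step,) + plan[i + 1:])
    return out


base = ("a", "b")
extended = monotonic_extensions(base, ["c", "d"])
edited = unconstrained_mutations(base, "c")
```

For a two-step plan this yields two monotonic successors and eight unconstrained ones (three insertions, two deletions, one swap, two rewrites), which shows why unconstrained mutation widens the branching factor.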

4. Computational Complexity and Scaling

Multi-plan search introduces nontrivial computational and memory overhead due to candidate explosion and dominance processing:

  • Node and Candidate Management: In tree-based planners, total expansions are bounded by $B \cdot f$ (budget × branching factor), with the primary cost due to successor generation (LLM calls in code planning, Bellman backups in MOO planning) (Khan et al., 21 Feb 2025, Chen et al., 2023). Best-first policies exploit scoring to prioritize promising candidates.
  • Label and Coverage-set Complexity: For multi-objective search, the number of non-dominated labels (paths or policies) per state is combinatorial in the number of objectives $n$ but often remains tractable for modest $n$. Pruning dominated labels is critical to limit growth; worst-case memory is $O(|S| \cdot L_{\max})$ (Lavin, 2015).
  • Heuristic Guidance: High-quality heuristics that capture both probabilistic and multi-objective structure (e.g. MOSSP-PDB) are necessary to focus search and bound the size of intermediate coverage sets (Chen et al., 2023).
  • Context Efficiency: MutaGReP demonstrates that multi-plan search can achieve high code overlap with full-repo solutions using $<5\%$ of the context window tokens, due to selective retrieval and symbolic grounding (Khan et al., 21 Feb 2025).
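
Dominance pruning, which keeps per-state label sets small, can be sketched as an incremental update. This is a minimal illustration assuming every objective is a cost to be minimized. The function names are hypothetical and the sketch is not the D*-PO implementation.

```python
Cost = tuple[float, ...]


def dominates(a: Cost, b: Cost) -> bool:
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def insert_label(labels: list[Cost], new: Cost) -> list[Cost]:
    """Offer a new cost vector to a state's label set, keeping only non-dominated labels."""
    # Reject the newcomer if an existing label dominates it or duplicates it.
    if any(dominates(old, new) or old == new for old in labels):
        return labels
    # Otherwise keep it and discard every label it dominates.
    return [old for old in labels if not dominates(new, old)] + [new]


labels: list[Cost] = []
for cost in [(4, 4), (2, 6), (3, 3), (6, 1), (5, 5), (3, 3)]:
    labels = insert_label(labels, cost)
```

After the six offers, `labels` holds `(2, 6)`, `(3, 3)`, and `(6, 1)`: the vector `(4, 4)` is evicted when `(3, 3)` arrives, and `(5, 5)` and the duplicate `(3, 3)` are rejected on entry. The size of this list per state corresponds to the $L_{\max}$ term in the memory bound.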

5. Comparative Evaluation and Empirical Results

Multi-plan approaches have been empirically shown to provide substantial benefits over single-plan and baseline methods:

  • Plan Diversity and Robustness: Multi-plan search enables the discovery of diverse decompositions, escapes local optima, and robustly covers hard instances where single-path methods underperform (Khan et al., 21 Feb 2025, Lavin, 2015).
  • Performance Metrics:
    • On LongCodeArena, plan search (multi-plan) achieved best-of-5 code overlap of $53.9\%$ (average $48.0\%$) using $\sim 4.3\%$ of context, surpassing single-plan baselines and approaching the full-repo performance ($58.7\%$) (Khan et al., 21 Feb 2025).
    • D*-PO outperformed standard A* and D* on all cost metrics in multiobjective Mars rover simulations, and produced Pareto-optimal sets of alternative plans for operator selection (Lavin, 2015).
    • MOLRTDP and iMOLAO* solved more instances and produced larger, higher-quality coverage sets on stochastic multi-objective domains compared to previous heuristic approaches (Chen et al., 2023).
  • Super-plan Construction: U-Plan’s strategy of merging plans for all high-belief P-states and inserting knowledge acquisition for runtime branching guarantees coverage of major contingencies while minimizing duplication (Mansell et al., 2013).

6. Domain-specific Applications and Extensions

Multi-plan search underpins diverse advanced planning and reasoning applications:

  • Repository-grounded Code Generation: MutaGReP enables LLMs to synthesize compact, high-fidelity code plans by efficiently navigating large repositories. The approach works in a plug-and-play fashion with weaker or open-source code LLMs (Khan et al., 21 Feb 2025).
  • Robust Autonomous Navigation: D*-PO and related planners enable multi-criteria routing for planetary rovers or vehicles, supporting user-driven trade-off selection (“shorter vs. safer” path alternatives) (Lavin, 2015).
  • Probabilistic Multi-objective Decision Support: MOLAO* and MOLRTDP serve as foundational algorithms for policy synthesis in stochastic environments, facilitating approximate or explicit management of risk, cost, and competing objectives (Chen et al., 2023).
  • Planning under Environmental Uncertainty: U-Plan exemplifies multi-plan synthesis for domains with ambiguous or conflicting evidence, covering plausible world states and using knowledge acquisition at runtime to choose between plan branches (Mansell et al., 2013).

7. Limitations, Open Challenges, and Future Directions

While multi-plan search is powerful, it has notable limitations:

  • Computational Bottlenecks: Scaling to many objectives or large search depths can cause intractable candidate blowup, requiring pruning, compact coverage-set representation, and high-quality heuristic guidance (Lavin, 2015, Chen et al., 2023).
  • Lack of Execution Feedback in Some Domains: For execution-free formulations (e.g., MutaGReP), plan scoring may rely on weak proxies (symbol diversity, LLM judgment) rather than ground-truth execution, impacting plan reliability (Khan et al., 21 Feb 2025).
  • Test-time Compute Trade-offs: Increased budgets or branching factors reliably improve plan coverage but at additional runtime cost (e.g., more LLM calls or backups), introducing a practical ceiling on plan set size (Khan et al., 21 Feb 2025, Chen et al., 2023).
  • Extensions: Promising directions include approximate or ε-coverage-set search (bounding plan-set size), incorporation of richer constraints (e.g., LTL for temporal logic), and extension to partially observable domains (POMDPs), as well as interactive user-guided plan selection (Chen et al., 2023).

Multi-plan search remains essential in domains requiring robustness, trade-off analysis, uncertainty management, and compact high-precision context for large-scale LLM-augmented reasoning and planning systems (Khan et al., 21 Feb 2025, Mansell et al., 2013, Lavin, 2015, Chen et al., 2023).
