Feasibility-Guided Tree Search

Updated 10 December 2025

Feasibility-guided tree search is a family of algorithms that integrates explicit feasibility checks and learned guidance to navigate complex decision trees in constrained spaces.
It leverages domain knowledge and constraint oracles to prune infeasible branches while prioritizing promising candidates, enhancing efficiency in optimization tasks.
Empirical studies in beam orientation optimization and task-motion planning demonstrate significant gains over baseline methods in both solution quality and computational time.

Feasibility-guided tree search comprises a family of decision-making algorithms that integrate explicit feasibility checks and structured guidance into tree-based search frameworks to efficiently solve complex combinatorial optimization and planning problems. These algorithms leverage domain knowledge, learned priors, or constraint oracles to prune infeasible branches and prioritize promising candidates, yielding improved search efficiency and solution quality compared to uniform or naive sampling in high-dimensional and constrained spaces. Empirical studies in beam orientation optimization (BOO) for radiation therapy and in integrated task and motion planning (TAMP) for robotics have demonstrated the advantages of feasibility-guided tree search approaches over baseline heuristics and uniform tree search (Sadeghnejad-Barkousaraie et al., 2020, Ren et al., 2021).

1. Problem Domains and Formalization

Feasibility-guided tree search has been instantiated in multiple domains that share the challenges of evaluating large combinatorial spaces under stringent feasibility or optimality constraints.

Beam Orientation Optimization: In BOO, the problem is to select a set $B \subset \{1, \dots, M\}$ of $N$ beam angles from $M$ candidates to optimize dose distribution. The problem decomposes into beam-orientation selection and fluence map optimization (FMO). The FMO subproblem for a fixed $B$ is:

$\min_x V(B, x) = \frac{1}{2} \sum_{s \in S} w_s^2 \| D_s(B)x - p_s \|_2^2 \quad \text{subject to } x \geq 0,$

where $S$ is the set of structures, $w_s$ are planning weights, $D_s(B)$ is the dose-influence matrix, and $p_s$ are dose prescriptions. The outer problem seeks $B$ such that $V^*(B) = V(B, x^*(B))$ is minimized, where $x^*(B)$ solves the FMO (Sadeghnejad-Barkousaraie et al., 2020).
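To make the inner problem concrete, the sketch below solves a toy FMO instance by projected gradient descent: take a gradient step on $V(B, x)$, then clip negative fluence values to zero. The solver choice, the step size, and the two-beamlet toy data are illustrative assumptions, not the method used in the cited work.

```python
def fmo_objective(D, p, w, x):
    """V(B, x) = 1/2 * sum_s w_s^2 * ||D_s x - p_s||^2 for a fixed beam set B."""
    total = 0.0
    for s in D:
        for row, target in zip(D[s], p[s]):
            resid = sum(d * xi for d, xi in zip(row, x)) - target
            total += 0.5 * w[s] ** 2 * resid ** 2
    return total


def solve_fmo(D, p, w, n_beamlets, step=0.4, iters=500):
    """Projected gradient descent for min V(B, x) subject to x >= 0.

    The fixed step size is an assumption tuned to the toy data below;
    it must stay below 2 / L, where L is the largest Hessian eigenvalue.
    """
    x = [0.0] * n_beamlets
    for _ in range(iters):
        grad = [0.0] * n_beamlets
        for s in D:
            for row, target in zip(D[s], p[s]):
                resid = sum(d * xi for d, xi in zip(row, x)) - target
                for j, d in enumerate(row):
                    grad[j] += w[s] ** 2 * resid * d
        # Gradient step followed by projection onto the nonnegative orthant.
        x = [max(0.0, xi - step * g) for xi, g in zip(x, grad)]
    return x


# Hypothetical toy data: one target structure and one organ at risk.
D = {"ptv": [[1.0, 0.5], [0.5, 1.0]], "oar": [[0.2, 0.0]]}
p = {"ptv": [1.0, 1.0], "oar": [0.0]}
w = {"ptv": 1.0, "oar": 0.5}

x_star = solve_fmo(D, p, w, n_beamlets=2)
print(x_star, fmo_objective(D, p, w, x_star))
```

The value returned by `fmo_objective` at `x_star` plays the role of $V^*(B)$ in the outer search over beam sets.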

Task and Motion Planning for Robotics: In TAMP, an extended decision tree is constructed over symbolic skeletons (sequences of high-level task actions) and continuous variable bindings (motion primitives, grasps, object placements). A node encodes either a skeleton or a binding for a symbolic variable. Feasibility constraints (e.g., collision-free, inverse-kinematics feasible) are black-box streams $\alpha: u \mapsto y$ with constraints $f_j(u, y) \geq 0$ and $g_\ell(u, y) = 0$ (Ren et al., 2021).
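One way to picture a stream is as a sampler wrapped in constraint checks. The sketch below uses simple rejection sampling: it draws candidate outputs $y$ for an input $u$ and returns the first one satisfying every inequality and equality constraint, or `None` if the try budget runs out. The names `make_stream` and `sample_placement`, the tolerance, and the one-dimensional placement example are hypothetical and are not taken from the cited framework.

```python
import random


def make_stream(sampler, inequalities, equalities, tol=1e-6, max_tries=100, seed=0):
    """Wrap a sampler so it only returns outputs that pass all constraints.

    inequalities: callables f(u, y) that must satisfy f(u, y) >= 0
    equalities:   callables g(u, y) that must satisfy |g(u, y)| <= tol
    """
    rng = random.Random(seed)

    def stream(u):
        for _ in range(max_tries):
            y = sampler(u, rng)
            if all(f(u, y) >= 0 for f in inequalities) and all(
                abs(g(u, y)) <= tol for g in equalities
            ):
                return y
        return None  # treated by the caller as an infeasible binding

    return stream


# Toy example: place an object at coordinate y on a table spanning u = (lo, hi).
def sample_placement(u, rng):
    lo, hi = u
    return rng.uniform(lo - 1.0, hi + 1.0)  # deliberately over-samples


def left_ok(u, y):
    return y - u[0]


def right_ok(u, y):
    return u[1] - y


placement_stream = make_stream(sample_placement, [left_ok, right_ok], [])
print(placement_stream((0.0, 1.0)))
```

A `None` result is what a tree search would interpret as a failed binding for that node.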

2. Tree Search Architecture and Algorithmic Components

All feasibility-guided tree search approaches share four core phases: selection, expansion, simulation, and backpropagation.

Selection: From the root, recursively select child actions or bindings maximizing a PUCT- or UCT-style score to balance exploitation of high-value nodes and exploration of new branches. In BOO GTS, the selection score is:

$a^* = \arg\max_{a \not\in B_s} Q(s, a) + c \cdot p(s, a) \cdot \frac{\sqrt{N(s)}}{1 + N(s, a)},$

where $Q(s, a)$ is mean reward, $p(s, a)$ is the DNN prior, $N(s)$ is visit count, and $c$ is a tunable constant (Sadeghnejad-Barkousaraie et al., 2020).
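The selection rule can be sketched in a few lines. The function below evaluates the score for every candidate not already in the beam set and returns the maximizer. The dictionary-based storage and the default of zero for unvisited actions are assumptions made for illustration.

```python
import math


def puct_select(candidates, selected, Q, prior, n_state, n_action, c=1.0):
    """Return the action maximizing Q(s,a) + c * p(s,a) * sqrt(N(s)) / (1 + N(s,a)).

    candidates: iterable of all actions
    selected:   set of actions already in the current beam set B_s
    Q, prior, n_action: dicts keyed by action (missing Q or count means 0)
    n_state:    visit count N(s) of the current node
    """
    best_action, best_score = None, -math.inf
    for a in candidates:
        if a in selected:
            continue  # the arg max ranges only over beams not yet chosen
        exploration = c * prior[a] * math.sqrt(n_state) / (1 + n_action.get(a, 0))
        score = Q.get(a, 0.0) + exploration
        if score > best_score:
            best_action, best_score = a, score
    return best_action


# Hypothetical statistics for a node with four candidate beams.
Q = {1: 0.5, 2: 0.2, 3: 0.0}
prior = {0: 0.4, 1: 0.1, 2: 0.3, 3: 0.2}
counts = {1: 10, 2: 3, 3: 0}
print(puct_select(range(4), {0}, Q, prior, n_state=16, n_action=counts))
```

In this toy call the unvisited beam wins because its exploration bonus is not yet discounted by any visits; setting `c=0` reduces the rule to pure exploitation of `Q`.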

Expansion: Upon encountering an unvisited child, the node is expanded, with feasibility checks applied immediately for streams (in TAMP), or node statistics initialized (in BOO). The DNN prior or domain constraints are computed for the new node.
Simulation (Rollout): Extend from the new node by greedily or randomly selecting further actions or bindings, sampling until a terminal (complete plan) or an infeasibility is found. In TAMP, this involves repeated stream calls to satisfy geometric constraints.
Backpropagation: The reward (objective improvement, cost, or success signal) is propagated upwards to update node and edge statistics. In BOO, the cost $V(B)$ is mapped to a normalized reward and aggregated along the path (Sadeghnejad-Barkousaraie et al., 2020); in TAMP, reward is computed as a function of number of successful bindings, motion cost, and terminal success (Ren et al., 2021).
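The four phases can be combined into a compact loop. The following is a minimal sketch on a toy selection problem, not a reproduction of either cited system: the feasibility oracle `is_feasible`, the `reward` and `prior` callables, and the cost table are all hypothetical stand-ins, and states are ordered tuples of chosen actions.

```python
import math
import random


class Node:
    def __init__(self, state, parent=None):
        self.state = state      # tuple of actions chosen so far
        self.parent = parent
        self.children = {}      # action -> Node
        self.visits = 0
        self.value_sum = 0.0

    def mean(self):
        return self.value_sum / self.visits if self.visits else 0.0


def search(actions, depth, is_feasible, reward, prior, iters=500, c=1.0, seed=0):
    rng = random.Random(seed)
    root = Node(())
    best_state, best_reward = None, -math.inf

    def legal(state):
        # The feasibility oracle prunes extensions before they enter the tree.
        return [a for a in actions if a not in state and is_feasible(state + (a,))]

    for _ in range(iters):
        node = root
        # 1. Selection: descend while every feasible child is already expanded.
        while len(node.state) < depth:
            moves = legal(node.state)
            if not moves or any(a not in node.children for a in moves):
                break
            parent = node
            node = max(
                parent.children.values(),
                key=lambda ch: ch.mean()
                + c * prior(ch.state[-1]) * math.sqrt(parent.visits) / (1 + ch.visits),
            )
        # 2. Expansion: add the untried feasible child with the highest prior.
        if len(node.state) < depth:
            untried = [a for a in legal(node.state) if a not in node.children]
            if untried:
                a = max(untried, key=prior)
                node.children[a] = Node(node.state + (a,), parent=node)
                node = node.children[a]
        # 3. Simulation: random feasible rollout until terminal or dead end.
        state = node.state
        while len(state) < depth:
            moves = legal(state)
            if not moves:
                break
            state = state + (rng.choice(moves),)
        complete = len(state) == depth
        r = reward(state) if complete else 0.0
        if complete and r > best_reward:
            best_state, best_reward = state, r
        # 4. Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.value_sum += r
            node = node.parent
    return best_state, best_reward


# Toy problem: pick 2 of 5 items; items 1 and 3 may not be combined.
costs = {0: 3.0, 1: 1.0, 2: 2.0, 3: 1.5, 4: 4.0}
result = search(
    actions=list(costs),
    depth=2,
    is_feasible=lambda s: not {1, 3} <= set(s),
    reward=lambda s: 1.0 / (1.0 + sum(costs[a] for a in s)),
    prior=lambda a: 1.0 / costs[a],
)
print(result)
```

Here the cheapest pair is ruled out by the oracle, so the search should settle on the best feasible pair instead; an infeasible rollout contributes a reward of zero.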

3. Learning and Guidance Mechanisms

Domain knowledge is injected into tree search to guide exploration towards feasible or high-quality solutions.

Supervised DNN Priors for BOO: A deep neural network, pretrained on column-generation (CG) fitness targets, predicts beam fitness vectors for input patient geometry, current beam set, and weights. At CG iteration $k$, for candidate beam $i$, fitness $f_i = -\nabla_i L(B_k)$ quantifies the benefit of adding $i$. The DNN is trained to predict $f^{CG}(B_k) \in \mathbb{R}^M$ by minimizing mean-squared error, using 3D convolutional layers for anatomy, concatenation with beams and weights, and fully connected output for $M$ beams. The DNN achieves sub-2s inference per patient (Sadeghnejad-Barkousaraie et al., 2020).
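Written out, the mean-squared-error objective described above takes roughly the following form. The notation is assumed for illustration and does not appear in the source: $\theta$ denotes the network weights, $\hat{f}_\theta$ the network output, $z$ the patient inputs (anatomy and structure weights), and $\mathcal{D}$ the training set of patient and beam-set pairs.

$\min_\theta \frac{1}{|\mathcal{D}|} \sum_{(z, B_k) \in \mathcal{D}} \frac{1}{M} \| \hat{f}_\theta(z, B_k) - f^{CG}(B_k) \|_2^2$

The target $f^{CG}(B_k)$ is computed for the same patient $z$; the normalization constants do not change the minimizer.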

Skeleton and Binding Exploration in TAMP: The TAMP framework employs a top-$k$ skeleton generator to produce a candidate set of symbolic plans ("skeleton space"), each inducing a horizon $H$ of optimistic variables to bind. Progressive widening with UCT at each node adaptively balances exploration of new skeletons and exploitation of promising branches (Ren et al., 2021).
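A common way to implement progressive widening is to allow a new child only while the number of children stays below a slowly growing function of the parent's visit count, and otherwise to pick among existing children with UCT. The sketch below follows that generic recipe; the widening rule $|\text{children}| \leq c_{pw} \cdot N^{\alpha}$, the constants, and the function names are assumptions, since the source does not give the exact form used.

```python
import math


def may_widen(num_children, visits, c_pw=1.0, alpha=0.5):
    """Allow a new child while the child count is at most c_pw * visits**alpha."""
    return num_children <= c_pw * visits ** alpha


def uct_score(mean, child_visits, parent_visits, c=1.4):
    """Standard UCT score; unvisited children get infinite priority."""
    if child_visits == 0:
        return math.inf
    return mean + c * math.sqrt(math.log(parent_visits) / child_visits)


def choose(children, parent_visits):
    """children: list of (mean_reward, visits) pairs.

    Returns the string "widen" when a new child (for example a new skeleton
    or a new sampled binding) should be generated, otherwise the index of
    the UCT-best existing child.
    """
    if may_widen(len(children), parent_visits):
        return "widen"
    scores = [uct_score(m, n, parent_visits) for m, n in children]
    return scores.index(max(scores))


print(choose([], 0))
print(choose([(0.9, 2), (0.1, 1), (0.2, 1)], 4))
```

With $\alpha < 1$ the effective branching factor grows sublinearly in the visit count, which is what keeps continuous binding spaces tractable.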

Feasibility Checks: In both domains, explicit feasibility checks are embedded: in BOO, clinical acceptance criteria or cost thresholds; in TAMP, satisfaction of geometric constraints via black-box streams.

4. Feasibility Criteria and Performance Metrics

Feasibility is enforced and evaluated using domain-specific metrics.

BOO (GTS) Feasibility and Metrics:

A beam set $B$ is feasible if planning target volume (PTV) coverage is within 1% of prescription and all organ-at-risk (OAR) metrics satisfy clinical thresholds.
Metrics recorded include PTV $D_{98}$, $D_{99}$, $D_2$, Paddick Conformity Index, high-dose spillage $R_{50}$, and OAR mean/max doses normalized to prescription.
In practice, any plan with objective $V(B)$ lower than the CG baseline is accepted (Sadeghnejad-Barkousaraie et al., 2020).
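The two acceptance tests above can be expressed as small predicates. This is a sketch under stated assumptions: "within 1% of prescription" is read as a two-sided relative tolerance, OAR thresholds are treated as upper bounds, and the function names, argument layout, and numeric values are hypothetical.

```python
def is_clinically_feasible(ptv_coverage, prescription, oar_doses, oar_limits, tol=0.01):
    """Check PTV coverage against prescription and OAR doses against limits.

    ptv_coverage, prescription: doses in the same units
    oar_doses, oar_limits:      dicts keyed by organ name (limits are upper bounds)
    tol:                        relative tolerance on PTV coverage (assumed two-sided)
    """
    coverage_ok = abs(ptv_coverage - prescription) <= tol * prescription
    oars_ok = all(oar_doses[name] <= limit for name, limit in oar_limits.items())
    return coverage_ok and oars_ok


def beats_baseline(cost, cg_cost):
    """Practical acceptance rule: objective strictly lower than the CG baseline."""
    return cost < cg_cost


# Hypothetical values, normalized so that the prescription dose equals 1.0.
limits = {"rectum": 0.5, "bladder": 0.6}
print(is_clinically_feasible(0.995, 1.0, {"rectum": 0.4, "bladder": 0.55}, limits))
print(beats_baseline(cost=12.1, cg_cost=12.4))
```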

TAMP (eTAMP) Feasibility and Reward:

Feasibility at each node is determined by streams; infeasibility terminates the branch.
Final reward for a plan is a weighted sum of successful bindings, inverse motion cost, and terminal success (Ren et al., 2021).
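As an illustration of such a weighted sum, the sketch below combines the three terms into a single scalar. The weights, the normalization of bound variables by the horizon, and the $1/(1 + \text{cost})$ form of the inverse motion cost are assumptions chosen so that the reward lies in $[0, 1]$; the source does not specify them.

```python
def plan_reward(n_bound, horizon, motion_cost, success,
                w_bind=0.5, w_cost=0.2, w_success=0.3):
    """Weighted sum of binding progress, inverse motion cost, and terminal success.

    n_bound:     number of variables successfully bound along the branch
    horizon:     total number of variables the skeleton requires (H)
    motion_cost: nonnegative cost of the motions found so far
    success:     True if the plan reached a terminal, fully bound state
    """
    bind_term = n_bound / horizon
    cost_term = 1.0 / (1.0 + motion_cost)
    success_term = 1.0 if success else 0.0
    return w_bind * bind_term + w_cost * cost_term + w_success * success_term


# A branch that failed halfway versus a complete plan with some motion cost.
print(plan_reward(n_bound=2, horizon=4, motion_cost=1.0, success=False))
print(plan_reward(n_bound=4, horizon=4, motion_cost=1.0, success=True))
```

Giving partial credit for successful bindings lets backpropagation distinguish branches that fail late from branches that fail immediately.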

5. Empirical Performance and Comparisons

Reported results in both domains compare feasibility-guided tree search against established baselines.

Method	Success Rate (BOO)	Avg. Improvement ("Distance", BOO)	Time to Outperform Baseline (BOO)	Robotics Tasks (TAMP)
GTS (BOO)	79% (beats CG ≥1/5)	+2.48% ± 1.87%	195 s (median), 237 s (mean±σ)	N/A
Guided Search	not given	1.79% ± 1.64%	227/268±132 s	N/A
RTS/Random	not given	0.67–0.81% ± 1.05–1.15%	337–406/371–213/204 s	N/A
eTAMP (TAMP)	N/A	N/A	N/A	>80% success (large motion)
Adaptive/PDDLStream	N/A	N/A	N/A	<40% or ~90% (challenging)

In BOO, GTS outperforms CG and all baselines in both objective improvement and time to first improvement under a fixed 1000 s time budget. Notable results include a 4.9% reduction in mean dose to the rectum and 2.6% to the body relative to CG, with a slightly higher bladder mean dose but a lower max dose. In TAMP, eTAMP achieves >80% success on challenging high-DOF and infeasible-skeleton tasks, outperforming the state-of-the-art PDDLStream Adaptive baseline (often <40% success) by explicitly guiding skeleton exploration and binding selection (Sadeghnejad-Barkousaraie et al., 2020, Ren et al., 2021).

Statistical analyses confirm the superiority of guided tree search under objective and clinical metrics at $p < 0.01$ significance.

6. Theoretical Properties and Complexity

Feasibility-guided tree search architectures offer asymptotic completeness and optimality guarantees under assumptions on the candidate generator and the available search time.

Probabilistic completeness: If the underlying plan or skeleton generator is complete and the skeleton pool grows unbounded ($k \to \infty$), any feasible plan or skeleton will eventually be considered (Ren et al., 2021).
Consistency: With sufficient search time ($t_{ts} \to \infty$), value estimates at each node converge to the true subtree optimum under UCT and progressive widening.
Asymptotic optimality: Given infinite time and candidate generation, the algorithm will recover the optimal plan for the reward structure.
Complexity: For $k$ skeletons, horizon $H$, and effective branching factor $b$, time complexity is $O(t_{ts} \cdot \log(k\,b^H))$ with practical control of $b$ via progressive widening.
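To see how the complexity bound scales, the logarithm can be expanded:

$\log(k\,b^H) = \log k + H \log b$

The factor therefore grows linearly in the horizon $H$ and only logarithmically in $k$ and $b$. As an illustrative example with assumed values $k = 10$, $b = 5$, and $H = 8$, and using natural logarithms, $\ln 10 + 8 \ln 5 \approx 2.30 + 12.88 \approx 15.2$, even though the tree contains $k\,b^H \approx 3.9 \times 10^6$ leaves.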

7. Extensions and Cross-Domain Connections

Feasibility-guided tree search principles generalize across domains where low-probability feasible solutions must be efficiently isolated within high-dimensional, combinatorially structured search spaces subject to black-box constraints or complex optimality metrics.

The modular use of learned priors (as in BOO) and progressive skeleton exploration (as in TAMP) may hybridize in novel domains.
Contextual feasibility checks, constraint propagation, and reward structures can be adapted to domains including automated theorem proving, high-throughput material design, and integrated scheduling and control problems.
The integration of neural predictors, bandit-based skeleton selection, and domain-specific feasibility oracles under tree search meta-architectures defines a general methodology for efficiently solving constraint-rich, long-horizon optimization problems (Sadeghnejad-Barkousaraie et al., 2020, Ren et al., 2021).