
AdaHorizon: Uncertainty-Driven Adaptive Planning

Updated 4 December 2025
  • AdaHorizon is an uncertainty-driven adaptive planning algorithm that dynamically adjusts the execution horizon to balance computational efficiency and performance.
  • It uses predictive uncertainty metrics—such as ensemble variance and MAD between action predictions—to decide when to replan and avoid compounding errors.
  • Empirical results demonstrate up to 90% reductions in model calls and significant performance gains in both offline reinforcement learning and vision-language-action robotics.

Adaptive-Horizon Ensembler (AdaHorizon) is a family of uncertainty-driven adaptive planning algorithms designed to maximize both computational efficiency and task performance in sequential decision making. AdaHorizon dynamically selects the number of open-loop actions to execute before replanning, leveraging model and prediction uncertainty to minimize unnecessary computation and mitigate open-loop degradation. Its instantiations span offline reinforcement learning with generative models (Jutras-Dubé et al., 2 Aug 2024) and high-throughput vision-language-action robotics (Chopra et al., 7 Nov 2025), where it substantially reduces planning overhead without compromising outcome quality.

1. Core Principles and Problem Setting

AdaHorizon addresses the computational limitations inherent in planning with complex generative models or large transformer-based action models. Standard continuous replanning approaches offer strong correction capabilities but incur expensive model queries at every step, yielding high computational cost. Conversely, fixed-horizon open-loop execution achieves speed but suffers from compounding errors as sensory uncertainty accumulates.

The formal substrate is the Markov Decision Process (MDP) with state space $\mathcal{S}$, action space $\mathcal{A}$, and reward $R: \mathcal{S} \times \mathcal{A} \rightarrow \mathbb{R}$. In offline RL (Jutras-Dubé et al., 2 Aug 2024), a fixed dataset $\mathcal{D} = \{\tau^{i}\}_{i=1}^{N}$ is provided; no further environment interaction is permitted. The agent seeks a policy $\pi: \mathcal{S} \rightarrow \mathcal{A}$ that maximizes expected reward.

Within vision-language-action (VLA) planning (Chopra et al., 7 Nov 2025), the challenge is to robustly sequence action chunks in high-dimensional, multimodal state spaces, minimizing intervention frequency under nonstationary uncertainty.

2. Uncertainty Quantification and Adaptive Horizon Control

The distinguishing feature of AdaHorizon is the explicit, stepwise measurement of predictive uncertainty to trigger replanning. In generative RL frameworks (Jutras-Dubé et al., 2 Aug 2024), this uncertainty $u_t$ is estimated from a deep ensemble of $M$ inverse dynamics models $f_{\phi_m}$, trained on the same experience buffer with different random seeds. Each model returns a mean action prediction $\mu_{\phi_m}(x_t)$ and, if NLL-trained, its predictive variance $\sigma^2_{\phi_m}(x_t)$. Total predictive uncertainty is decomposed as:

$$u_t = \frac{1}{M} \sum_{m=1}^{M} \sigma^2_{\phi_m}(x_t) + \mathrm{Var}_m\big[\mu_{\phi_m}(x_t)\big]$$

where the first term is the mean aleatoric uncertainty and the second the epistemic ensemble variance. For MSE-only ensembles, the uncertainty simplifies to $\mathrm{Var}_m[f_{\phi_m}(x_t)]$.
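As a minimal sketch of the decomposition above (assuming NumPy arrays of stacked ensemble outputs; the reduction to a scalar by summing over action dimensions is an assumption, not specified in the source):

```python
import numpy as np

def ensemble_uncertainty(mus, sigma2s=None):
    """Total predictive uncertainty u_t from M ensemble members.

    mus: (M, A) array of mean action predictions mu_phi_m(x_t).
    sigma2s: optional (M, A) array of per-member predictive variances
             (available only for NLL-trained ensembles).
    Returns u_t summed over action dimensions (a modeling assumption).
    """
    epistemic = np.var(mus, axis=0)            # Var_m[mu_phi_m(x_t)]
    if sigma2s is not None:                    # NLL-trained: add aleatoric term
        aleatoric = np.mean(sigma2s, axis=0)   # (1/M) sum_m sigma^2_phi_m(x_t)
        return float(np.sum(epistemic + aleatoric))
    return float(np.sum(epistemic))            # MSE-only ensembles
```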

In robot VLA systems (Chopra et al., 7 Nov 2025), AdaHorizon fuses the outputs of continuous and discrete action prediction heads, computing a mean absolute difference (MAD) metric for each chunk index $t$:

$$\mathrm{mad}_t = \frac{1}{D} \sum_{d=1}^{D} \left| a^c_{t,d} - a^d_{t,d} \right|$$

where $a^c_{t,d}$ and $a^d_{t,d}$ denote the $d$-th dimensions of the continuous and discrete action predictions, respectively. This MAD is used as an actionable proxy for disagreement-induced uncertainty.
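The per-index MAD is a simple vectorized computation; a sketch assuming both heads emit $(K, D)$ action chunks:

```python
import numpy as np

def chunk_mad(a_cont, a_disc):
    """Mean absolute difference between continuous and discrete action heads.

    a_cont, a_disc: (K, D) arrays of predicted action chunks.
    Returns a (K,) array giving mad_t for each chunk index t.
    """
    return np.mean(np.abs(a_cont - a_disc), axis=1)
```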

3. Adaptive-Horizon Execution Logic

The core mechanism is a thresholding control law that adaptively shortens or extends the planning horizon based on moment-to-moment uncertainty estimates.

Offline RL / Generative Model Setting (Jutras-Dubé et al., 2 Aug 2024):

  • From the current state $s_t$, generate a long-horizon rollout $\hat{s}_{t+1:t+H} \sim p_{\theta}(\cdot \mid s_t)$.
  • At each step $i$ up to $H$, compute $(a_{t+i}, u_{t+i})$.
  • Execute $a_{t+i}$ as long as $u_{t+i} < \delta$ and $i < H$; if $u_{t+i} \geq \delta$, trigger replanning.
  • Empirically, the threshold $\delta$ is tuned to balance open-loop degradation against computational savings; typically only $\sim 10\%$ of steps trigger new rollouts.

Vision-Language-Action Setting (Chopra et al., 7 Nov 2025):

  • For each chunk of $K$ predicted actions, enforce a minimum open-loop segment $m_{\text{min}}$ (e.g., 4).
  • Replanning is requested if $\mathrm{mad}_t > \tau_{\text{replan}}$ for some $t \leq m_{\text{min}}$. Repeated short-horizon requests count toward an "abort-to-full-chunk" safeguard, activating a global reset if task ambiguity is high.
  • Beyond $m_{\text{min}}$, any chunk index $t$ where $\mathrm{mad}_t \geq \tau_{\text{trunc}}$ truncates the current chunk, adaptively setting the execution horizon $H$.

4. Stepwise Algorithmic Structure

The AdaHorizon policy can be summarized as follows:

Generative RL Implementation (Jutras-Dubé et al., 2 Aug 2024):

t = 0
s_t = observe()
while not done:
    # Plan a fresh horizon of H predicted future states
    s_pred = sample_rollout(p_theta, s_t, H)      # \hat{s}_{t+1:t+H}
    i = 0
    while i < H:
        x = (s_t, s_pred[i])                      # state pair for inverse dynamics
        preds = [f_phi[m](x) for m in range(M)]   # ensemble action predictions
        a_t = mean(preds)
        u_t = var(preds)                          # epistemic ensemble variance
        if nll_trained:                           # add mean aleatoric term
            u_t += mean([sigma2_phi[m](x) for m in range(M)])
        if u_t >= delta:
            break                                 # uncertainty too high: replan
        execute(a_t)
        s_t = observe()
        t += 1; i += 1

VLA Chunking Implementation (Chopra et al., 7 Nov 2025):

  1. For $t$ in $1 \dots K$, compute $\mathrm{mad}_t$.
  2. If $\exists\, t \leq m_{\text{min}}$ with $\mathrm{mad}_t > \tau_{\text{replan}}$, increment replan counters.
  3. If abort-to-full-chunk criteria are met, return the full chunk.
  4. Build truncation mask $\mathrm{mask}_t = \mathbb{I}(\mathrm{mad}_t < \tau_{\text{trunc}})$.
  5. Set $H$ as the largest $t$ such that $\mathrm{mask}_1 = \dots = \mathrm{mask}_H = 1$ and $H \geq m_{\text{min}}$.
  6. Return the first $H$ discrete actions for execution.
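The truncation and replan-request steps above can be sketched as follows (a simplified version: the abort-to-full-chunk counters $C_{\max}, C_{\text{task}}$ are omitted, and the function signature is illustrative, not from the source):

```python
import numpy as np

def select_horizon(mad, m_min, tau_replan, tau_trunc):
    """Adaptive execution horizon from per-index MAD values.

    mad: (K,) array of mad_t for one predicted chunk.
    Returns (horizon H, whether an early replan was requested).
    """
    K = len(mad)
    # Step 2: a high-MAD spike early in the chunk requests a replan.
    replan_requested = bool(np.any(mad[:m_min] > tau_replan))
    # Step 4: truncation mask over the whole chunk.
    mask = mad < tau_trunc
    # Step 5: largest H with mask_1 ... mask_H all true, floored at m_min.
    H = K
    for t in range(K):
        if not mask[t]:
            H = t
            break
    H = max(H, m_min)
    return H, replan_requested
```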

5. Hyperparameterization and Tuning

AdaHorizon’s effectiveness depends on judicious threshold setting:

Parameter roles and typical values:

  • $H, K$: horizon/chunk size per model call (e.g., $K = 8$)
  • $\delta$: uncertainty cutoff (RL setting, tuned per domain)
  • $\tau_{\text{replan}}$: high MAD threshold applied early in the chunk; set greater than $\tau_{\text{trunc}}$
  • $\tau_{\text{trunc}}$: MAD threshold for open-loop truncation
  • $m_{\text{min}}$: minimum open-loop segment (prevents overly small chunks)
  • $C_{\max}, C_{\text{task}}$: counters for the abort-to-full-chunk logic

In practice, $\tau_{\text{replan}} > \tau_{\text{trunc}}$ to avoid spurious early replans, and $m_{\text{min}}$ is fixed to balance latency and robustness. Thresholds are tuned on held-out validation domains.
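One way to keep these constraints honest in an implementation is to validate them at construction time; a sketch with illustrative default values (the class name and defaults are assumptions, only the ordering constraint $\tau_{\text{replan}} > \tau_{\text{trunc}}$ comes from the source):

```python
from dataclasses import dataclass

@dataclass
class AdaHorizonConfig:
    """Threshold bundle for AdaHorizon. Default values are illustrative."""
    K: int = 8                # chunk size per model call
    m_min: int = 4            # minimum open-loop segment
    tau_trunc: float = 0.05   # MAD threshold for truncation (assumed value)
    tau_replan: float = 0.15  # early-chunk replan threshold (assumed value)

    def __post_init__(self):
        # Tuning guidance from the text: tau_replan must exceed tau_trunc,
        # and the minimum segment must fit inside the chunk.
        assert self.tau_replan > self.tau_trunc, "tau_replan must exceed tau_trunc"
        assert 1 <= self.m_min <= self.K
```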

6. Empirical Performance and Computational Impact

AdaHorizon achieves its principal goal of vastly reducing expensive model queries with minimal or no fidelity loss.

  • On OpenAI Gym locomotion tasks (Hopper, Walker, etc.), AdaHorizon reduces model (DDPM) calls by $>90\%$, e.g., saving $91.1\%$ of neural forward evaluations on Hopper-Medium while improving normalized return from $49.9$ (baseline) to $62.1$.
  • Wall-clock speedup: $>130\times$ over continuous replanning baselines.
  • Return drop is typically $\leq 2\%$; in some cases, performance marginally exceeds stepwise replanning due to reduced compounding of model-induced drift.
  • On LIBERO Spatial, AdaHorizon attains $96.8\%$ success, a $+1.6\%$ absolute improvement over the strongest ensembler baseline.
  • On the full LIBERO suite, AdaHorizon yields a $+0.8\%$ uplift in average success rate.
  • Real-world pick-and-place: $+49\%$ in-distribution and $+34.9\%$ out-of-distribution improvement versus prior methods.
  • The ensembler's computational overhead is negligible: $<1$ ms per chunk, maintaining $>50$ Hz overall inference rates.

7. Limitations, Comparative Context, and Extensions

AdaHorizon is subject to several domain- and method-specific constraints:

  • No formal worst-case performance bounds; the threshold parameters must be tuned empirically.
  • Limitations in representational capacity for complex real-world dynamics or high-dimensional sensory streams; robustness to domain shift is not addressed.
  • The MAD metric is unnormalized and may be sensitive to action scaling across dimensions, requiring manual weighting or further refinement.

Potential extensions include:

  • Learned or Bayesian threshold selection to replace fixed cutoffs.
  • Incorporation of cost-to-go predictors or miniature MPCs for more globally optimal horizon selection.
  • Dimension-weighted disagreement metrics, particularly important when combining translational, rotational, and gripper actions in robotics.
  • Multi-scale chunking or hierarchical horizon adaptation for flexible control granularity.

Comparison with alternative replanning regimes confirms the efficacy of AdaHorizon: continuous replanning guarantees maximal responsiveness but minimum efficiency; static-horizon execution optimizes speed at the expense of potential catastrophic open-loop drift. AdaHorizon occupies an empirically validated intermediate regime, yielding up to 95% savings in model evaluation with performance competitive or superior to the strongest stepwise baselines (Jutras-Dubé et al., 2 Aug 2024, Chopra et al., 7 Nov 2025).
