Sampling-Based Predictive Control

Updated 18 January 2026
  • Sampling-Based Predictive Control is a real-time model-based approach that generates candidate control sequences by simulating trajectories under system dynamics.
  • It leverages methods like MPPI, CEM, and deterministic sampling to address nonlinear dynamics, non-smooth cost functions, and contact-rich events.
  • Recent advances focus on enhancing sample efficiency, reducing chattering, and enabling parallel execution for safety-critical and resource-constrained applications.

Sampling-based Predictive Control is a class of real-time model-based control methodologies that solve optimal control problems by simulating an ensemble of stochastic or deterministic action sequences, evaluating candidate trajectories under system dynamics and a user-specified objective, and employing a selection or aggregation step to produce the next control input to execute. These methods are characterized by their ability to accommodate highly nonlinear dynamics, non-smooth cost functions, contact and hybrid events, and are naturally suited to parallelization on contemporary computational hardware. Fundamental approaches include the Predictive Sampling algorithm, Model Predictive Path Integral (MPPI) control and the Cross-Entropy Method (CEM), each with specific variants and recent extensions to enable greater smoothness, sample efficiency, distributed optimization, and effective practical deployment in robotics and other resource-constrained, safety-critical domains.

1. Mathematical Foundations and Core Algorithms

Sampling-based predictive control formalizes the finite-horizon optimal control problem as

$$\min_{u_{0:T-1}} J(x_0, u_{0:T-1}) = \sum_{t=0}^{T-1} c(x_t, u_t)$$

subject to $x_{t+1} = f(x_t, u_t)$, where $c(\cdot)$ is a running cost, $f(\cdot)$ is the system dynamics (potentially nonlinear), and the horizon $T$ is typically short to promote real-time receding-horizon application (Howell et al., 2022).
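The objective above can be evaluated for any single candidate sequence by stepping the model forward and summing the running cost. The following is a minimal sketch; the function name `rollout_cost` and the toy integrator and quadratic cost are assumptions made for illustration, not taken from the cited work.

```python
import numpy as np

def rollout_cost(x0, controls, f, c):
    """Simulate x_{t+1} = f(x_t, u_t) and return J = sum_t c(x_t, u_t)."""
    x = np.asarray(x0, dtype=float)
    total = 0.0
    for u in controls:      # t = 0 .. T-1
        total += c(x, u)    # running cost at (x_t, u_t)
        x = f(x, u)         # advance the model by one step
    return total

# Hypothetical toy problem: scalar integrator with a quadratic cost.
f = lambda x, u: x + 0.1 * u
c = lambda x, u: float(x @ x + 0.01 * u * u)

J_idle = rollout_cost(np.array([1.0]), [0.0, 0.0, 0.0], f, c)
J_push = rollout_cost(np.array([1.0]), [-1.0, -1.0, -1.0], f, c)
```

In this toy case the sequence that pushes the state toward the origin has the lower cost, which is the only signal a sampling-based planner needs from the model.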

Rather than employing gradient-based optimization, sampling-based methods draw $N$ candidate control sequences from a proposal distribution. Candidates may be dense open-loop control sequences or, more commonly, lower-dimensional spline or knot parameters. For each candidate, the predicted trajectory is simulated using a system model (sometimes a physics engine), and the accumulated cost is recorded. A selection rule, such as elite selection, importance weighting, or simple minimization, is then used to update the controller's parameterization.

Canonical algorithm variants:

  • Predictive Sampling (PS): Zero-order, greedy selection. At each step, N–1 perturbed control policies (typically spline parameters) are sampled from an isotropic Gaussian centered at the current nominal. All candidates are simulated in parallel and the lowest-cost trajectory's parameterization replaces the nominal. Only one sampling iteration is performed per control step (Howell et al., 2022).
  • MPPI: N noise sequences are sampled and rolled out, and each trajectory receives a weight proportional to the exponential of its negative total cost, scaled by a temperature parameter. The nominal control sequence is updated with the weighted average of the perturbations, and its first element is applied (Walker et al., 7 Jan 2026, Sacks et al., 2022).
  • CEM: Elite selection—top-K sequences with lowest costs are used to refit the mean and covariance of the proposal Gaussian, and the process is iterated.
  • Deterministic Sampling (dsMPPI/dsCEM): Uses fixed, low-discrepancy sample sets rather than i.i.d. randomness to reduce control chattering and increase smoothness (Walker et al., 7 Jan 2026).
  • Spline/Basis Parameterizations: Control trajectories are encoded as low-dimensional parameter vectors (e.g. cubic or Hermite splines) instead of dense sequences (Schramm et al., 24 Nov 2025, Howell et al., 2022, Tao et al., 4 Jan 2026).

The general control update, abstracting over the choice of weighting or selection, is $\theta^{\text{new}} = \mathcal{A}\left(\theta^{\text{old}},\, \{\theta^{(i)}, J^{(i)}\}_{i=1}^N\right)$, where $\mathcal{A}$ denotes the update mechanism (greedy/min-cost, weighted averaging, etc.).
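To make the role of the update mechanism concrete, the sketch below writes three possible choices as functions that share one signature. The function names, the `temperature` and `num_elites` parameters, and the min-shift used for numerical stability are illustrative assumptions; the cited papers may differ in detail (for example, in how covariance is refit or smoothed).

```python
import numpy as np

def greedy_update(theta_old, thetas, costs):
    """Predictive Sampling style: keep the single lowest-cost candidate."""
    return thetas[np.argmin(costs)]

def mppi_update(theta_old, thetas, costs, temperature=1.0):
    """MPPI style: exponentially weighted average of the perturbations."""
    costs = np.asarray(costs, dtype=float)
    w = np.exp(-(costs - costs.min()) / temperature)  # shift for stability
    w /= w.sum()
    return theta_old + w @ (thetas - theta_old)

def cem_update(theta_old, thetas, costs, num_elites=2):
    """CEM style: refit the Gaussian mean and std to the elite set."""
    elites = thetas[np.argsort(costs)[:num_elites]]
    return elites.mean(axis=0), elites.std(axis=0)

# Three candidates with two parameters each (rows are candidates).
thetas = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
costs = [3.0, 1.0, 2.0]
theta_greedy = greedy_update(thetas[0], thetas, costs)
theta_mppi = mppi_update(thetas[0], thetas, costs, temperature=1e-3)
cem_mean, cem_std = cem_update(thetas[0], thetas, costs)
```

As the temperature approaches zero, the weighted average collapses onto the best sample, so the greedy rule can be read as a limiting case of exponential weighting.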

2. Algorithmic Structure and Implementation Practices

Sampling-based predictive controllers typically run the following loop at each control tick:

  1. State Estimation: Read the latest system state $x$ and, as needed, the timestamp $\tau$.
  2. Trajectory Sampling and Rollout:
    • Sample N candidate parameter vectors (either entire control sequences or spline parameters) from a noise process, often $\mathcal{N}(\theta, \sigma^2 I)$, or from a learned/generative proposal.
    • For each candidate, simulate the trajectory from the current state and accumulate cost over the horizon.
  3. Update/Selection:
    • Replace nominal with best-performing sample (Predictive Sampling) (Howell et al., 2022), or weighted update (MPPI/CEM variants) (Walker et al., 7 Jan 2026).
    • Advance the time window in the parameterization: discard the first element, warm-start by shifting.
  4. Apply First Control: Extract and send the first control from the new nominal policy to the plant or hardware.

Illustrative pseudocode (Predictive Sampling) (Howell et al., 2022):

For each control tick:
    1. Read current state x, current time τ
    2. For i = 0 .. N−1 in parallel:
        θ̃⁰ ← θ (unperturbed)
        θ̃ⁱ ← θ + εⁱ, εⁱ ~ 𝒩(0, σ²I), i = 1..N−1
        Simulate rollout with θ̃ⁱ, accumulate totalCostⁱ
    3. θ ← θ̃ⁱ* where i* = argminᵢ totalCostⁱ
    4. Advance spline window, warm start
    5. Execute first control

Parallelization: All rollouts are embarrassingly parallel, naturally mapped to multi-core CPUs or GPUs (Li et al., 20 Jun 2025, Howell et al., 2022, Pezzato et al., 2023).
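One way to exploit this independence on a single CPU is to advance all rollouts in lockstep with array operations, so the only explicit loop is over time. The sketch below assumes the same hypothetical scalar integrator and quadratic cost as a stand-in for a real simulator; `batched_rollout_costs`, `dt`, and `ctrl_weight` are illustrative names, and GPU frameworks would apply the same pattern with their own batching primitives.

```python
import numpy as np

def batched_rollout_costs(x0, candidates, dt=0.1, ctrl_weight=0.01):
    """Costs of N control sequences for a hypothetical scalar integrator.

    candidates has shape (N, T); all rollouts advance together, so each
    time step is a single vectorized operation over the N samples.
    """
    num_samples, horizon = candidates.shape
    x = np.full(num_samples, float(x0))   # one state per rollout
    costs = np.zeros(num_samples)
    for t in range(horizon):
        u = candidates[:, t]
        costs += x**2 + ctrl_weight * u**2
        x = x + dt * u
    return costs

candidates = np.array([[0.0, 0.0, 0.0],
                       [-1.0, -1.0, -1.0]])
costs = batched_rollout_costs(1.0, candidates)
```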

Parameterization: Use of splines (cubic, Hermite, Bézier, piecewise-linear) reduces optimization dimensionality, improving sample efficiency and temporal smoothness (Schramm et al., 24 Nov 2025, Howell et al., 2022, Tao et al., 4 Jan 2026).
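As an illustration of the dimensionality reduction, the sketch below expands a handful of knot values into a full control sequence using piecewise-linear interpolation, the simplest of the listed spline types. It handles a single control channel only, and the function name `knots_to_controls` and the uniform knot spacing are assumptions made for the example.

```python
import numpy as np

def knots_to_controls(knots, horizon):
    """Expand P knot values into T controls by piecewise-linear interpolation."""
    knots = np.asarray(knots, dtype=float)
    knot_times = np.linspace(0.0, horizon - 1, num=len(knots))
    step_times = np.arange(horizon)
    return np.interp(step_times, knot_times, knots)

# Three parameters describe a five-step control sequence.
controls = knots_to_controls([0.0, 1.0, 0.0], horizon=5)
```

Sampling noise on the three knots instead of the five controls shrinks the search space, and the interpolation makes neighboring controls vary smoothly by construction.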

3. Theoretical Properties: Expressivity, Guarantees, and Limitations

Sampling-based predictive control inherits several desirable theoretical and practical properties:

  • Expressivity: It is compatible with arbitrary nonlinear dynamics (even discontinuous/contact-dynamics), arbitrary cost functions, state/input constraints (enforced as hard constraints or cost penalties), and risk-sensitive costs (Howell et al., 2022, Schramm et al., 24 Nov 2025).
  • Parallelizability: Each candidate evaluation is independent, allowing wall-clock control frequency to scale with the number of cores (Bobiti et al., 2017, Li et al., 20 Jun 2025).
  • Recursive Feasibility: Provided the cost function and state/input constraints are properly encoded, and a stabilizing terminal set exists, suboptimal sampling-based MPC with greedy, feasible updates guarantees recursive feasibility and closed-loop stability under standard assumptions (Bobiti et al., 2017).
  • Adaptivity: The one-step update design ensures rapid adaptation to abrupt state changes. The planner "surfs" the receding-horizon cost landscape, favoring fast, greedy, state-locked control over heavy, convergent optimization (Howell et al., 2022).
  • Robustness to Discontinuities: Derivative-free methods are insensitive to non-smooth contacts; fine-grained shooting planners are empirically more reliable on tasks like dexterous or contact-rich manipulation (Hess et al., 2024, Howell et al., 2022).
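The two ways of handling constraints mentioned in the list above can be sketched briefly. Here input bounds are enforced as a hard constraint by clipping sampled controls, and a state bound is enforced softly through a quadratic penalty. The function names, the penalty weight, and the underlying quadratic cost are assumptions chosen for illustration.

```python
import numpy as np

def clip_controls(candidates, u_min, u_max):
    """Hard input constraint: project sampled controls onto [u_min, u_max]."""
    return np.clip(candidates, u_min, u_max)

def penalized_cost(x, u, x_max, penalty_weight=1e3):
    """Soft state constraint |x| <= x_max added to a quadratic running cost."""
    violation = max(0.0, abs(x) - x_max)
    return x * x + 0.01 * u * u + penalty_weight * violation**2

clipped = clip_controls(np.array([-2.0, 0.5, 3.0]), -1.0, 1.0)
inside = penalized_cost(0.5, 0.0, x_max=1.0)    # no violation
outside = penalized_cost(2.0, 0.0, x_max=1.0)   # violates the bound by 1.0
```

Neither function needs to be differentiable, since the planner only compares total costs across rollouts.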

Limitations:

4. Recent Advances and Variants

Several technical advances extend classical sampling-based predictive control:

| Variant | Key Feature | Representative Reference |
|---|---|---|
| Deterministic Sampling (dsMPPI, dsCEM) | Low-discrepancy deterministic sample sets for smooth control | (Walker et al., 7 Jan 2026) |
| Low-Pass Sampling (LP-MPPI) | Explicit filtering of injected noise for bandwidth control | (Kicki, 13 Mar 2025) |
| Learned Proposal Distributions | Normalizing flows and imitation or flow-matching training | (Sacks et al., 2022, Brudermüller et al., 16 Oct 2025) |
| Ancillary-Controller Fusion (Biased-MPPI) | Mixture-based sampling with explicit behavior primitives | (Trevisan et al., 2024) |
| Data-consistent stochastic predictive control | Guarantees under data-driven system identification | (Teutsch et al., 2024, Teutsch et al., 2024) |
| Multi-Agent Distributed Sampling MPC | Distributed ADMM for coordinated multi-agent systems | (Wang et al., 2022) |
| Ising-MPPI | Binary/discrete sampling mapped to Ising machines or FPGA | (Werthen-Brabants et al., 17 Dec 2025) |

Deterministic sampling reduces chattering by using precomputed low-discrepancy samples that approximate the Gaussian distribution, substantially improving smoothness while matching cost performance. Low-pass sampling directly controls exploration bandwidth by digitally filtering the injected noise, producing more actuator-friendly control trajectories with less chattering. Learned sampling distributions (using normalizing flows or amortized generative models) improve sample efficiency and horizon fidelity, enabling classical methods to operate with orders-of-magnitude fewer model samples (Brudermüller et al., 16 Oct 2025, Sacks et al., 2022). Fusing classical or learned controllers as mixture components in the sampling distribution (as in Biased-MPPI) regularizes behavior and helps the planner escape local minima, which is critical in dynamic or multi-modal domains (Trevisan et al., 2024).
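The first two ideas can be illustrated with generic stand-ins. The sketch below builds a fixed Gaussian-like sample set by pushing a Halton sequence through the normal inverse CDF, and smooths noise sequences with a first-order low-pass filter. Both are assumptions chosen for simplicity; the cited dsMPPI/dsCEM and LP-MPPI papers may use different sample constructions and filters.

```python
import numpy as np
from statistics import NormalDist

def van_der_corput(n, base):
    """n-th element (n >= 1) of the van der Corput sequence in the given base."""
    value, denom = 0.0, 1.0
    while n > 0:
        n, digit = divmod(n, base)
        denom *= base
        value += digit / denom
    return value

def halton_gaussian(num_samples, primes=(2, 3, 5)):
    """Fixed sample set: Halton points mapped through the normal inverse CDF."""
    inv_cdf = NormalDist().inv_cdf
    return np.array([[inv_cdf(van_der_corput(i, p)) for p in primes]
                     for i in range(1, num_samples + 1)])

def low_pass(noise, alpha):
    """First-order filter along time: y[t] = alpha*y[t-1] + (1-alpha)*noise[t]."""
    out = np.empty_like(noise, dtype=float)
    out[:, 0] = noise[:, 0]
    for t in range(1, noise.shape[1]):
        out[:, t] = alpha * out[:, t - 1] + (1 - alpha) * noise[:, t]
    return out

samples = halton_gaussian(8)                    # identical on every call
rough = np.array([[1.0, -1.0, 1.0, -1.0]])      # rows are noise sequences
smooth = low_pass(rough, alpha=0.5)
```

Because the sample set is fixed, two consecutive ticks perturb the nominal in the same directions, which removes one source of tick-to-tick jitter; the filter instead limits how quickly each perturbation can change along the horizon.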

5. Empirical Benchmarks and Practical Applications

Sampling-based predictive control is extensively deployed in simulated and real robotic platforms across manipulation, locomotion, vehicle racing, and visual servoing:

  • MJPC (MuJoCo MPC) (Howell et al., 2022): On in-hand manipulation, quadruped pose control, and humanoid stand-up tasks, Predictive Sampling with as few as N=10 rollouts per tick achieves cost and robustness on par with iLQG and gradient-based planners, with per-tick update times as low as 1–20 ms on modern CPUs.
  • Judo (Li et al., 20 Jun 2025): Standardized, benchmarked Python framework with multi-algorithm support; achieves real-time (≤10 ms) performance even for complex manipulation tasks on consumer hardware.
  • Reference-Free Locomotion (Schramm et al., 24 Nov 2025): Sampling-based joint-space Hermite spline parameterization yields emergent gaits, jumps, and handstands with 30–70 samples per tick and per-update times under 30 ms (CPU).
  • Smooth/Low-pass Sampling (Walker et al., 7 Jan 2026, Kicki, 13 Mar 2025): On standard control benchmarks (cart-pole, truck backing, halfcheetah, etc.), deterministic and low-pass sampling consistently yield higher trajectory smoothness for equal or lower cost.
  • Contact-rich and Real-World Manipulation (Hess et al., 2024, Pezzato et al., 2023, Brudermüller et al., 16 Oct 2025): These methods outperform RL baselines in manipulation without lengthy training and retain sim-to-real transfer robustness due to their model-based flexibility.

6. Best Practices, Open Tooling, and Future Directions

Best practices (Predictive Sampling and related methods) (Howell et al., 2022, Schramm et al., 24 Nov 2025):

  • Use a low-dimensional (P≪T) spline parameterization to boost sample efficiency.
  • Tune sample count N to balance parallel compute availability and exploration (~10–70 for PS, more for CEM/MPPI).
  • Tune perturbation scale σ to explore but not destabilize; avoid overshoot.
  • Warm-start every timestep and recede the nominal by one interval.
  • For basic PS, run a single sampling iteration per real-time tick; more iterations offer little benefit.

Interactive implementation frameworks (MJPC, Judo): Provide browser-based GUIs, asynchronous middleware for sim-to-real transfer, and rapid tuning cycles (Li et al., 20 Jun 2025, Howell et al., 2022).

Research frontiers:

Limitations and current gaps: Fundamental sample complexity barriers for high-dimensional or long-horizon tasks, challenge in bridging myopic receding-horizon plans and global optimality, and potential subpar performance under strong model mismatch in physical systems (Howell et al., 2022, Hess et al., 2024).

7. Summary Table: Core Algorithms and Characteristics

| Algorithm | Selection Mechanism | Parallelizable | Sample Smoothing | Hardware Demonstration | Sample Efficiency Enhancement |
|---|---|---|---|---|---|
| Predictive Sampling | Greedy minimum cost | Yes | No (vanilla PS) | MuJoCo (MJPC) (Howell et al., 2022) | Spline, learned proposals, deterministic |
| MPPI | Exponential weighting | Yes | DS/LP variants | IsaacGym, F1TENTH, Go2 (Schramm et al., 24 Nov 2025) | dsMPPI (Walker et al., 7 Jan 2026), LP-MPPI (Kicki, 13 Mar 2025) |
| CEM | Hard elite mean | Yes | No | Judo, IsaacGym | Generative proposals (Brudermüller et al., 16 Oct 2025) |
| dsMPPI/dsCEM | Exponential/hard, DS | Yes | Deterministic | Cart-pole, Truck-Backing | Reduced chattering, smoothness |
| Biased-MPPI | Weighted fusion | Yes | Arbitrary mixture | Jackal, multi-agent boats | Ancillary controller injection |
| NFMPC | Latent update, flows | Yes | Latent flows | Panda, Holonomic 2D (Sacks et al., 2022) | End-to-end learning |

Sampling-based predictive control encompasses a broad and rapidly evolving family of methods, exhibiting remarkable versatility and competitive performance in both simulated and real-world robotic domains. Its parallelizable architecture, robustness to non-smoothness, and compatibility with learned models and proposals position it as a cornerstone of modern MPC practice. Continued progress is expected in few-sample efficiency, control regularity, scalable deployment, and systematic integration with data-driven and learning-based methods.
