Sampling-Based Predictive Control

Updated 18 January 2026

Sampling-Based Predictive Control is a real-time model-based approach that generates candidate control sequences by simulating trajectories under system dynamics.
It leverages methods like MPPI, CEM, and deterministic sampling to address nonlinear dynamics, non-smooth cost functions, and contact-rich events.
Recent advances focus on enhancing sample efficiency, reducing chattering, and enabling parallel execution for safety-critical and resource-constrained applications.

Sampling-based Predictive Control is a class of real-time model-based control methods that solve optimal control problems by simulating an ensemble of stochastic or deterministic action sequences, evaluating the candidate trajectories under the system dynamics and a user-specified objective, and applying a selection or aggregation step to produce the next control input. These methods accommodate highly nonlinear dynamics, non-smooth cost functions, and contact and hybrid events, and they parallelize naturally on contemporary hardware. Fundamental approaches include the Predictive Sampling algorithm, Model Predictive Path Integral (MPPI) control, and the Cross-Entropy Method (CEM), each with variants and recent extensions that target smoothness, sample efficiency, distributed optimization, and practical deployment in robotics and other resource-constrained, safety-critical domains.

1. Mathematical Foundations and Core Algorithms

Sampling-based predictive control formalizes the finite-horizon optimal control problem as

$\min_{u_{0:T-1}} J(x_0,\, u_{0:T-1}) = \sum_{t=0}^{T-1} c(x_t,\,u_t)$

subject to $x_{t+1} = f(x_t,\, u_t)$, where $c(\cdot)$ is the running cost, $f(\cdot)$ is the (potentially nonlinear) system dynamics, and the horizon $T$ is typically kept short to permit real-time receding-horizon operation (Howell et al., 2022).
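As a minimal illustration of the objective above, the sketch below rolls an open-loop control sequence through a model and sums the running cost. The double-integrator dynamics `f`, the quadratic cost `c`, and the step size `DT` are assumptions chosen for illustration; they are not taken from the cited works.

```python
from typing import Callable, Sequence, Tuple

State = Tuple[float, float]  # hypothetical point mass: (position, velocity)
DT = 0.1                     # assumed integration step, not from the source


def f(x: State, u: float) -> State:
    """Toy double-integrator dynamics x_{t+1} = f(x_t, u_t)."""
    pos, vel = x
    return (pos + DT * vel, vel + DT * u)


def c(x: State, u: float) -> float:
    """Toy running cost: penalize distance from the origin and control effort."""
    pos, vel = x
    return pos * pos + 0.1 * vel * vel + 0.01 * u * u


def rollout_cost(
    x0: State,
    controls: Sequence[float],
    dynamics: Callable[[State, float], State] = f,
    cost: Callable[[State, float], float] = c,
) -> float:
    """Return J(x0, u_{0:T-1}) = sum_t c(x_t, u_t) for an open-loop sequence."""
    x, total = x0, 0.0
    for u in controls:
        total += cost(x, u)   # accumulate c(x_t, u_t)
        x = dynamics(x, u)    # advance to x_{t+1}
    return total
```

Because the evaluation only calls `dynamics` and `cost` as black boxes, neither needs to be differentiable, which is what makes the approach compatible with non-smooth models.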

Rather than employing gradient-based optimization, sampling-based methods draw $N$ candidate control sequences from a proposal distribution. The candidates are either full open-loop sequences or, more commonly, low-dimensional spline or knot-vector parameterizations. For each candidate, the predicted trajectory is simulated using a system model (sometimes a physics engine), and the accumulated cost is recorded. A selection rule, such as elite selection, importance weighting, or simple minimization, is then used to update the controller's parameterization.

Canonical algorithm variants:

Predictive Sampling (PS): Zero-order, greedy selection. At each step, $N-1$ perturbed candidates (typically spline parameters) are sampled from an isotropic Gaussian centered at the current nominal and evaluated alongside the nominal itself. All candidates are simulated in parallel and the lowest-cost trajectory's parameterization replaces the nominal. Only one sampling iteration is performed per control step (Howell et al., 2022).
MPPI: Noise perturbations are sampled around the nominal, rollouts are simulated, and each trajectory is weighted exponentially by its negative total cost, scaled by a temperature parameter. The nominal is updated with the weighted average of the perturbations (Walker et al., 7 Jan 2026, Sacks et al., 2022).
CEM: Elite selection. The top-$K$ lowest-cost sequences are used to refit the mean and covariance of the proposal Gaussian, and the process is iterated.
Deterministic Sampling (dsMPPI/dsCEM): Uses fixed, low-discrepancy sample sets rather than i.i.d. randomness to reduce control chattering and increase smoothness (Walker et al., 7 Jan 2026).
Spline/Basis Parameterizations: Control trajectories are encoded as low-dimensional parameter vectors (e.g. cubic or Hermite splines) instead of dense sequences (Schramm et al., 24 Nov 2025, Howell et al., 2022, Tao et al., 4 Jan 2026).

The general control update—abstracting over the choice of weighting/selection—is: $\theta^{\text{new}} = \mathcal{A}\left(\theta^{\text{old}},\, \{\theta^{(i)}, J^{(i)}\}_{i=1}^N\right)$ where $\mathcal{A}$ denotes the update mechanism (greedy/min-cost, weighted averaging, etc.).
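To make the abstract update $\mathcal{A}$ concrete, the following sketch shows three possible instances: greedy selection, exponential weighting, and elite refitting. The function names, the `temperature` parameter, and the diagonal (per-dimension) standard deviation used in place of a full covariance are simplifying assumptions for illustration, not the exact procedures of the cited papers.

```python
import math
from statistics import fmean, pstdev
from typing import List, Sequence, Tuple

Params = List[float]


def greedy_update(samples: Sequence[Params], costs: Sequence[float]) -> Params:
    """Predictive-Sampling-style rule: keep the single lowest-cost candidate.

    `samples` is assumed to include the current nominal, so the cost never gets worse.
    """
    best = min(range(len(costs)), key=costs.__getitem__)
    return list(samples[best])


def mppi_update(
    theta_old: Params,
    samples: Sequence[Params],
    costs: Sequence[float],
    temperature: float = 1.0,
) -> Params:
    """MPPI-style rule: shift the nominal by the exp(-J / temperature)-weighted mean perturbation."""
    j_min = min(costs)  # subtracting the minimum keeps at least one weight equal to 1
    weights = [math.exp(-(j - j_min) / temperature) for j in costs]
    total = sum(weights)
    return [
        th + sum(w * (s[k] - th) for w, s in zip(weights, samples)) / total
        for k, th in enumerate(theta_old)
    ]


def cem_update(
    samples: Sequence[Params], costs: Sequence[float], num_elite: int
) -> Tuple[Params, Params]:
    """CEM-style rule: refit a diagonal Gaussian (mean, std) to the elite set."""
    ranked = sorted(zip(costs, samples), key=lambda pair: pair[0])
    elite = [s for _, s in ranked[:num_elite]]
    columns = list(zip(*elite))  # one tuple of elite values per parameter dimension
    return [fmean(col) for col in columns], [pstdev(col) for col in columns]
```

As the temperature approaches zero, the exponential weights concentrate on the best sample and `mppi_update` approaches `greedy_update`; a large temperature approaches a plain average of the candidates.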

2. Algorithmic Structure and Implementation Practices

Sampling-based predictive controllers typically run the following loop at each control tick:

State Estimation: Read latest system state $x$ , timestamp $\tau$ as needed.
Trajectory Sampling and Rollout:
- Sample $N$ candidate parameter vectors (either entire control sequences or spline parameters) from a noise process, often an isotropic Gaussian $\mathcal{N}(\theta,\, \sigma^2 I)$ centered at the current nominal $\theta$, or from a learned/generative proposal.
- For each candidate, simulate the trajectory from the current state and accumulate cost over the horizon.
Update/Selection:
- Replace the nominal with the best-performing sample (Predictive Sampling) (Howell et al., 2022), or apply a weighted update (MPPI/CEM variants) (Walker et al., 7 Jan 2026).
- Advance the time window in the parameterization: discard the first element, warm-start by shifting.
Apply First Control: Extract and send the first control from the new nominal policy to the plant or hardware.

Illustrative pseudocode (Predictive Sampling) (Howell et al., 2022):

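Given state $x$, nominal parameters $\theta$, noise scale $\sigma$, and sample count $N$:
  1. Set $\theta^{(1)} = \theta$ and draw $\theta^{(i)} \sim \mathcal{N}(\theta,\, \sigma^2 I)$ for $i = 2, \dots, N$.
  2. For each $i$ (in parallel), roll out the model from $x$ under $\theta^{(i)}$ and record the total cost $J^{(i)}$.
  3. Set $\theta \leftarrow \theta^{(i^\star)}$ with $i^\star = \arg\min_i J^{(i)}$.
  4. Apply the first control of $\theta$ and shift $\theta$ forward by one interval.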
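A runnable sketch of the Predictive Sampling loop described in this section might look as follows. It is not the MJPC implementation: the point-mass model, the cost, and the settings `DT`, `HORIZON`, `NUM_SAMPLES`, and `SIGMA` are hypothetical, the plan is a dense control sequence rather than a spline, and rollouts run sequentially where a real controller would evaluate them in parallel.

```python
import random
from typing import List, Tuple

State = Tuple[float, float]  # hypothetical point mass: (position, velocity)
DT, HORIZON, NUM_SAMPLES, SIGMA = 0.1, 20, 16, 1.0  # assumed settings


def step(x: State, u: float) -> State:
    """Toy double-integrator model standing in for the simulator."""
    return (x[0] + DT * x[1], x[1] + DT * u)


def rollout_cost(x: State, controls: List[float]) -> float:
    """Simulate one candidate from x and accumulate a quadratic running cost."""
    total = 0.0
    for u in controls:
        total += x[0] ** 2 + 0.1 * x[1] ** 2 + 0.01 * u ** 2
        x = step(x, u)
    return total


def select_best(x: State, nominal: List[float], rng: random.Random) -> List[float]:
    """Sample N-1 Gaussian perturbations of the nominal and keep the cheapest of all N."""
    candidates = [list(nominal)]  # the unperturbed nominal is one of the N candidates
    for _ in range(NUM_SAMPLES - 1):
        candidates.append([u + rng.gauss(0.0, SIGMA) for u in nominal])
    return min(candidates, key=lambda seq: rollout_cost(x, seq))


def shift(plan: List[float]) -> List[float]:
    """Warm start: drop the applied control and repeat the last one."""
    return plan[1:] + plan[-1:]


def run(x0: State = (1.0, 0.0), ticks: int = 100, seed: int = 0) -> State:
    """Closed loop with a single sampling iteration per control tick."""
    rng = random.Random(seed)
    x, nominal = x0, [0.0] * HORIZON
    for _ in range(ticks):
        best = select_best(x, nominal, rng)  # sample, roll out, select
        x = step(x, best[0])                 # apply only the first control
        nominal = shift(best)                # recede the horizon by one interval
    return x
```

Because the nominal is always among the candidates, the selected plan can never cost more than the plan carried over from the previous tick, evaluated from the current state.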

Parallelization: All rollouts are embarrassingly parallel, naturally mapped to multi-core CPUs or GPUs (Li et al., 20 Jun 2025, Howell et al., 2022, Pezzato et al., 2023).

Parameterization: Use of splines (cubic, Hermite, Bézier, piecewise-linear) reduces optimization dimensionality, improving sample efficiency and temporal smoothness (Schramm et al., 24 Nov 2025, Howell et al., 2022, Tao et al., 4 Jan 2026).
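The dimensionality reduction can be illustrated with the simplest of the listed options, a piecewise-linear parameterization. In this sketch, the helper name `knots_to_controls` and the choice to space the knots evenly over the horizon are assumptions; cubic or Hermite splines would replace the linear blend with a higher-order one.

```python
from typing import List, Sequence


def knots_to_controls(knots: Sequence[float], horizon: int) -> List[float]:
    """Expand P evenly spaced knots into T controls by linear interpolation."""
    if len(knots) < 2 or horizon < 2:
        raise ValueError("need at least two knots and a horizon of two steps")
    controls = []
    for t in range(horizon):
        s = t * (len(knots) - 1) / (horizon - 1)  # position in knot-index units
        i = min(int(s), len(knots) - 2)           # left knot of the active segment
        frac = s - i                              # blend factor within the segment
        controls.append((1.0 - frac) * knots[i] + frac * knots[i + 1])
    return controls
```

For example, `knots_to_controls([0.0, 2.0, 0.0], 5)` yields `[0.0, 1.0, 2.0, 1.0, 0.0]`: the sampler perturbs three numbers instead of five, and the resulting control signal is continuous by construction.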

3. Theoretical Properties: Expressivity, Guarantees, and Limitations

Sampling-based predictive control inherits several desirable theoretical and practical properties:

Expressivity: It is compatible with arbitrary nonlinear dynamics (even discontinuous/contact-dynamics), arbitrary cost functions, state/input constraints (enforced as hard constraints or cost penalties), and risk-sensitive costs (Howell et al., 2022, Schramm et al., 24 Nov 2025).
Parallelizability: Each candidate evaluation is independent, allowing wall-clock control frequency to scale with the number of cores (Bobiti et al., 2017, Li et al., 20 Jun 2025).
Recursive Feasibility: Provided the cost function and state/input constraints are properly encoded, and a stabilizing terminal set exists, suboptimal sampling-based MPC with greedy, feasible updates guarantees recursive feasibility and closed-loop stability under standard assumptions (Bobiti et al., 2017).
Adaptivity: The one-step update design ensures rapid adaptation to abrupt state changes. The planner "surfs" the receding-horizon cost landscape, favoring fast, greedy, state-locked control over heavy, convergent optimization (Howell et al., 2022).
Robustness to Discontinuities: Derivative-free methods are insensitive to non-smooth contacts; fine-grained shooting planners are empirically more reliable on tasks like dexterous or contact-rich manipulation (Hess et al., 2024, Howell et al., 2022).

Limitations:

Sample Inefficiency: Basic variants require relatively large sample budgets to avoid degraded performance, especially in high-dimensional or long-horizon tasks (Howell et al., 2022, Schramm et al., 24 Nov 2025). Adaptive or learned proposals partially mitigate this (Sacks et al., 2022, Brudermüller et al., 16 Oct 2025).
Myopia: The receding horizon and single-iteration focus can be severely myopic, failing on tasks requiring long-horizon credit assignment (Howell et al., 2022).
Noise and Chattering: Random i.i.d. sampling can induce non-smooth, high-frequency control, necessitating low-pass filtering or structured sampling (Kicki, 13 Mar 2025, Walker et al., 7 Jan 2026, Tao et al., 4 Jan 2026).
Warm-Starting and Covariance Tuning: Performance is sensitive to noise scale, parameterization, and how effectively the nominal is warm-started each cycle (Howell et al., 2022, Schramm et al., 24 Nov 2025).
No Online Distribution Adaptation (Basic PS): Unlike CEM or MPPI, Predictive Sampling does not anneal noise (σ is fixed) and does not maintain full covariance adaptation or elite-set memory (Howell et al., 2022).
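The chattering limitation listed above is often addressed by filtering the sampled noise before it is added to the nominal. The sketch below uses a first-order IIR low-pass filter as a stand-in; the filter form, the coefficient `alpha`, and the `roughness` metric are assumptions for illustration and are not necessarily the filter design used in LP-MPPI.

```python
import random
from typing import List, Sequence


def low_pass(noise: Sequence[float], alpha: float) -> List[float]:
    """First-order IIR filter: y_t = alpha * y_{t-1} + (1 - alpha) * e_t."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    filtered, y = [], 0.0
    for e in noise:
        y = alpha * y + (1.0 - alpha) * e  # larger alpha means a lower cutoff
        filtered.append(y)
    return filtered


def roughness(seq: Sequence[float]) -> float:
    """Sum of squared first differences, a simple proxy for chattering."""
    return sum((b - a) ** 2 for a, b in zip(seq, seq[1:]))


# Usage: compare raw i.i.d. noise with its filtered version.
rng = random.Random(0)
raw = [rng.gauss(0.0, 1.0) for _ in range(200)]
smooth = low_pass(raw, alpha=0.8)
```

Note that this simple filter also shrinks the noise amplitude, so in practice the perturbation scale would need to be retuned after filtering.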

4. Recent Advances and Variants

Several technical advances extend classical sampling-based predictive control:

Variant	Key Feature	Representative Reference
Deterministic Sampling (dsMPPI, dsCEM)	Low-discrepancy deterministic sample sets for smooth control	(Walker et al., 7 Jan 2026)
Low-Pass Sampling (LP-MPPI)	Explicit filtering of injected noise for bandwidth control	(Kicki, 13 Mar 2025)
Learned Proposal Distributions	Normalizing flows and imitation or flow-matching training	(Sacks et al., 2022, Brudermüller et al., 16 Oct 2025)
Ancillary-Controller Fusion (Biased-MPPI)	Mixture-based sampling with explicit behavior primitives	(Trevisan et al., 2024)
Data-consistent stochastic predictive control	Guarantees under data-driven system identification	(Teutsch et al., 2024, Teutsch et al., 2024)
Multi-Agent Distributed Sampling MPC	Distributed ADMM for coordinated multi-agent systems	(Wang et al., 2022)
Ising-MPPI	Binary/discrete sampling mapped to Ising machines or FPGA	(Werthen-Brabants et al., 17 Dec 2025)

Deterministic sampling reduces chattering by using precomputed low-discrepancy samples that approximate the Gaussian distribution, substantially improving smoothness while matching cost performance. Low-pass sampling controls exploration bandwidth directly by digitally filtering the injected noise, yielding more actuator-friendly control trajectories. Learned sampling distributions, built with normalizing flows or amortized generative models, improve sample efficiency and horizon fidelity, enabling classical methods to operate with orders-of-magnitude fewer model samples (Brudermüller et al., 16 Oct 2025, Sacks et al., 2022). Fusing classical or learned controllers as mixture components in the sampling distribution (as in Biased-MPPI) regularizes behavior and helps escape local minima, which is critical in dynamic or multi-modal domains (Trevisan et al., 2024).
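One way to picture deterministic sampling is to push a low-discrepancy point set through the inverse Gaussian CDF, so that every control tick reuses the same well-spread perturbations. The sketch below uses a Halton sequence for this purpose. This is a swapped-in, textbook construction chosen for brevity, not the sample-set design of dsMPPI/dsCEM, and the function names are hypothetical.

```python
from statistics import NormalDist
from typing import List


def van_der_corput(index: int, base: int = 2) -> float:
    """Radical inverse of `index` in `base`; lies strictly in (0, 1) for index >= 1."""
    value, denom = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denom *= base
        value += digit / denom
    return value


def first_primes(count: int) -> List[int]:
    """Return the first `count` primes, used as Halton bases (one per dimension)."""
    primes: List[int] = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def deterministic_gaussian_samples(n: int, dim: int, sigma: float = 1.0) -> List[List[float]]:
    """Map n Halton points in (0, 1)^dim to zero-mean Gaussian perturbations."""
    normal = NormalDist(0.0, sigma)
    bases = first_primes(dim)
    # Indices start at 1 so that no coordinate equals 0, where inv_cdf is undefined.
    return [
        [normal.inv_cdf(van_der_corput(i, b)) for b in bases]
        for i in range(1, n + 1)
    ]
```

Calling the function twice returns identical samples, so any remaining tick-to-tick variation in the control comes from the state and the cost rather than from fresh random draws. Plain Halton sets are known to degrade in high dimensions, which is one reason to combine this idea with a low-dimensional parameterization.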

5. Empirical Benchmarks and Practical Applications

Sampling-based predictive control has been deployed on simulated and real robotic platforms across manipulation, locomotion, vehicle racing, and visual servoing:

MJPC (MuJoCo MPC) (Howell et al., 2022): In in-hand manipulation, quadruped pose control, and humanoid stand-up, Predictive Sampling with as few as N=10 rollouts per tick achieves cost and robustness on par with iLQG and gradient-based planners, with per-tick update times as low as 1–20 ms on modern CPUs.
Judo (Li et al., 20 Jun 2025): Standardized, benchmarked Python framework with multi-algorithm support; achieves real-time (≤10 ms) performance even for complex manipulation tasks on consumer hardware.
Reference-Free Locomotion (Schramm et al., 24 Nov 2025): Sampling-based joint-space Hermite spline parameterization yields emergent gaits, jumps, and handstands with 30–70 samples per tick and per-update times under 30 ms (CPU).
Smooth/Low-pass Sampling (Walker et al., 7 Jan 2026, Kicki, 13 Mar 2025): On standard control benchmarks (cart-pole, truck backing, halfcheetah, etc.), deterministic and low-pass sampling consistently yield higher trajectory smoothness for equal or lower cost.
Contact-rich and Real-World Manipulation (Hess et al., 2024, Pezzato et al., 2023, Brudermüller et al., 16 Oct 2025): These methods are reported to outperform RL baselines in manipulation without lengthy training, and their model-based flexibility helps retain robustness in sim-to-real transfer.

6. Best Practices, Open Tooling, and Future Directions

Best practices (Predictive Sampling and related methods) (Howell et al., 2022, Schramm et al., 24 Nov 2025):

Use a low-dimensional (P≪T) spline parameterization to boost sample efficiency.
Tune sample count N to balance parallel compute availability and exploration (~10–70 for PS, more for CEM/MPPI).
Tune perturbation scale σ to explore but not destabilize; avoid overshoot.
Warm-start every timestep and recede the nominal by one interval.
For basic PS, restrict to a single sampling iteration per control tick; more iterations offer little benefit.

Interactive implementation frameworks (MJPC, Judo): Provide browser-based GUIs, asynchronous middleware for sim-to-real transfer, and rapid tuning cycles (Li et al., 20 Jun 2025, Howell et al., 2022).

Research frontiers:

Amortized and generative proposal learning: Incorporation of learned normalizing flows and flow-matching proposals to improve sample efficiency and horizon fidelity (Sacks et al., 2022, Brudermüller et al., 16 Oct 2025).
Smooth, hardware-aware exploration: Deterministic or spectrally-constrained sampling for actuators sensitive to high-frequency inputs (Walker et al., 7 Jan 2026, Kicki, 13 Mar 2025).
Distributed/multi-agent control: Consensus-ADMM with sampling-based optimization for scalability (Wang et al., 2022).
Specialized hardware acceleration (Ising, FPGA): Binary-encoded control for hardware p-bit arrays, promising order-of-magnitude real-time gains (Werthen-Brabants et al., 17 Dec 2025).
Data-driven, adaptive constraint satisfaction: Sampling-based approaches for chance-constrained stochastic or data-driven MPC with online parameter set adaptation (Teutsch et al., 2024, Teutsch et al., 2024).

Limitations and current gaps: Fundamental sample-complexity barriers for high-dimensional or long-horizon tasks, the difficulty of bridging myopic receding-horizon plans and global optimality, and potentially degraded performance under strong model mismatch in physical systems (Howell et al., 2022, Hess et al., 2024).

7. Summary Table: Core Algorithms and Characteristics

Algorithm	Selection Mechanism	Parallelizable	Sample Smoothing	Demonstration Platform	Sample Efficiency Enhancement
Predictive Sampling	Greedy minimum cost	Yes	No (vanilla PS)	MuJoCo (MJPC) (Howell et al., 2022)	Spline, learned proposals, deterministic
MPPI	Exponential weighting	Yes	DS/LP variants	IsaacGym, F1TENTH, Go2 (Schramm et al., 24 Nov 2025)	dsMPPI (Walker et al., 7 Jan 2026), LP-MPPI (Kicki, 13 Mar 2025)
CEM	Hard elite mean	Yes	No	Judo, IsaacGym	Generative proposals (Brudermüller et al., 16 Oct 2025)
dsMPPI/dsCEM	Exponential/hard, DS	Yes	Deterministic	Cart-pole, Truck-Backing	Reduced chattering, smoothness
Biased-MPPI	Weighted fusion	Yes	Arbitrary mixture	Jackal, multi-agent boats	Ancillary controller injection
NFMPC	Latent update, flows	Yes	Latent flows	Panda, Holonomic 2D (Sacks et al., 2022)	End-to-end learning

Sampling-based predictive control encompasses a broad and rapidly evolving family of methods that has shown versatility and competitive performance in both simulated and real-world robotic domains. Its parallelizable architecture, robustness to non-smoothness, and compatibility with learned models and proposals make it a prominent part of modern MPC practice. Continued progress is expected in few-sample efficiency, control regularity, scalable deployment, and systematic integration with data-driven and learning-based methods.