Deterministic Sampling in MPC: dsMPPI/dsCEM

Updated 29 June 2026
  • Deterministic sampling is a method that replaces stochastic sampling with precomputed low-discrepancy samples to uniformly cover the proposal distribution.
  • It leverages offline LCD construction and runtime linear transformations to minimize variance and efficiently map samples into evolving control distributions.
  • These techniques produce smoother control trajectories and improved sample efficiency, making them viable for real-time and embedded MPC applications.

Deterministic sampling refers to a class of sampling-based optimal control and trajectory optimization methods in which the stochastic exploration step—typically involving Monte Carlo draws from a distribution—is replaced entirely by a fixed set of optimally chosen, low-discrepancy samples. In modern model predictive control (MPC) frameworks such as Model Predictive Path Integral (MPPI) and the Cross-Entropy Method for MPC (CEM-MPC), deterministic sampling gives rise to algorithms such as deterministic-sampling CEM (dsCEM) and deterministic-sampling MPPI (dsMPPI). These approaches achieve variance reduction, improved sample efficiency, and substantially smoother control policies compared to their randomly sampled counterparts, especially in nonlinear, time-correlated, or low-sample regimes (Walker et al., 7 Jan 2026, Walker et al., 7 Oct 2025).

1. Foundations and Motivation

Standard sampling-based MPC approaches such as MPPI and CEM-MPC estimate optimal control actions by simulating $N$ trajectories through the application of i.i.d. Gaussian noise to a nominal control sequence. However, random sampling exhibits several limitations:

  • Poor support coverage: Random samples tend to cluster and leave gaps, requiring large N for accurate expectation estimates ("low-discrepancy" is only achieved asymptotically).
  • Chattering and non-smoothness: Lack of temporal correlation between samples induces jerky control signals, which are undesirable in physical systems.
  • High computational burden: Large numbers of rollouts are necessary for reliable optimization, challenging real-time applications or embedded deployments (Walker et al., 7 Oct 2025).

Deterministic sampling addresses these limitations by precomputing a set of quadrature-like “optimal Dirac points”—called LCD samples (from Localized Cumulative Distributions)—which span the proposal distribution in a uniform, low-discrepancy manner. These samples are then systematically transformed at run-time into the current proposal distribution, enabling more reliable estimation and efficient use of computational resources (Walker et al., 7 Oct 2025, Walker et al., 7 Jan 2026).

2. Deterministic Sampling Construction and Transformation

The deterministic sample design process involves:

  • LCD Construction: For a reference density $f(x)$ (typically $\mathcal{N}(0, I)$), the localized cumulative distribution (LCD) is defined by:

$$F_f(m, b) = \int_{\mathbb{R}^d} K(x; m, b)\, f(x)\, dx, \qquad K(x; m, b) = \exp\left(-\frac{\|x-m\|^2}{2b^2}\right)$$

A finite set of $K$ sample points $\{\tilde u_i\}$ is chosen to minimize a Cramér–von Mises–type discrepancy between $f$ and its Dirac mixture approximation $\hat f(x) = \frac{1}{K}\sum_{i=1}^K \delta(x - \tilde u_i)$. Here $K$ denotes the sample count, distinct from the kernel $K(x; m, b)$.

  • Offline Optimization: These points are solved for by offline gradient-based minimization of the integrated squared difference between $F_f$ and $F_{\hat f}$, producing a deterministic, identity-covariance set with superior uniformity (Walker et al., 7 Oct 2025).
  • Runtime Linear Transformation: The sample set is mapped on-the-fly into the current proposal $\mathcal{N}(\mu, \Sigma)$ by $u_i = \mu + L\tilde u_i$. Here, $L$ is the Cholesky factor of $\Sigma$. This step ensures sample points populate the high-probability region of the current search space (Walker et al., 7 Oct 2025, Walker et al., 7 Jan 2026).
  • Temporal Smoothness: To induce smooth time evolution in control sequences, a colored-noise prior is used by embedding a Toeplitz correlation matrix derived from a power-law power spectrum; the sample transformation is then applied with colored base samples (Walker et al., 7 Jan 2026).
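The construction steps above can be illustrated with a minimal NumPy sketch. Several pieces are assumptions made for illustration only: the base set is a seeded, whitened Gaussian draw standing in for precomputed LCD samples (the offline LCD optimization is not reproduced), the power spectrum is taken as $1/f^{\beta}$ with a finite DC term, and the helper names (`lcd_std_normal`, `lcd_dirac`, `whiten`, `colored_correlation`, `transform`) are hypothetical. The closed form for the standard-normal LCD follows from integrating the Gaussian kernel against $\mathcal{N}(0, I)$.

```python
import numpy as np

def lcd_std_normal(m, b):
    """Closed-form LCD F_f(m, b) for f = N(0, I_d) and the Gaussian kernel K."""
    d = m.shape[-1]
    s = 1.0 + b ** 2
    return (b ** 2 / s) ** (d / 2) * np.exp(-np.sum(m ** 2, axis=-1) / (2.0 * s))

def lcd_dirac(points, m, b):
    """LCD of the Dirac mixture (1/K) * sum_i delta(x - points[i])."""
    sq = np.sum((points - m) ** 2, axis=-1)
    return np.mean(np.exp(-sq / (2.0 * b ** 2)))

def whiten(raw):
    """Center and whiten a point set: exact zero mean, identity covariance."""
    x = raw - raw.mean(axis=0)
    Lc = np.linalg.cholesky(x.T @ x / len(x))
    return np.linalg.solve(Lc, x.T).T

def colored_correlation(T, beta, n_fft=None):
    """Toeplitz correlation matrix from an assumed 1/f^beta power spectrum."""
    n_fft = n_fft or 4 * T
    freqs = np.fft.rfftfreq(n_fft)
    psd = np.empty_like(freqs)
    psd[1:] = freqs[1:] ** (-beta)
    psd[0] = psd[1]                      # keep the DC term finite and positive
    r = np.fft.irfft(psd, n=n_fft)       # autocorrelation (Wiener-Khinchin)
    r = r / r[0]                         # unit diagonal
    lags = np.abs(np.arange(T)[:, None] - np.arange(T)[None, :])
    return r[lags]

def transform(base, mu, Sigma):
    """Map identity-covariance base samples into N(mu, Sigma): u_i = mu + L u~_i."""
    L = np.linalg.cholesky(Sigma)
    return mu + base @ L.T

T, K = 8, 64                             # horizon and sample count (illustrative)
# Stand-in for precomputed LCD samples (NOT an actual LCD set).
base = whiten(np.random.default_rng(0).standard_normal((K, T)))
std = np.full(T, 0.5)                    # marginal standard deviations
Sigma = np.outer(std, std) * colored_correlation(T, beta=1.0)
mu = np.linspace(-1.0, 1.0, T)
U = transform(base, mu, Sigma)           # shape (K, T), one control sequence per row
```

Because the base set has exactly zero mean and identity covariance, the transformed set reproduces `mu` and `Sigma` exactly in its sample moments, which is the property the linear map is meant to preserve. With `beta=0.0` the correlation matrix reduces to the identity (white noise).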

3. Detailing dsCEM and dsMPPI Algorithms

Deterministic Cross-Entropy Method (dsCEM)

In dsCEM, the sample replacement occurs within the iterative CEM-MPC loop:

  • Deterministic sample generation: $u_i = \mu + L\tilde u_i$ for $i = 1, \dots, K$, with $L$ the Cholesky factor of the current covariance $\Sigma$
  • Cost evaluation: the trajectory cost of each sample $u_i$ is evaluated
  • Elite selection and update: the next mean and covariance are computed over the lowest-cost samples (elites). Weighted updates or uniform weights are standard.
  • Momentum/adaptive schemes: Optional momentum averaging and two alternatives for covariance adaptation: fixed temporal correlation with adaptive marginal variances, or full covariance adaptation (Walker et al., 7 Oct 2025).
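The loop above can be sketched as follows. This is an illustrative outline rather than the published algorithm: the base set is a seeded Gaussian draw standing in for precomputed LCD samples, the toy quadratic cost replaces a real rollout, elites use uniform weights with full covariance adaptation, and `momentum` is assumed to be the weight kept on the previous estimate. The function name `dscem_step` is hypothetical.

```python
import numpy as np

def dscem_step(mu, Sigma, base, cost_fn, n_elite, momentum=0.0):
    """One CEM iteration driven by a fixed base sample set."""
    d = mu.shape[0]
    L = np.linalg.cholesky(Sigma + 1e-9 * np.eye(d))   # jitter for stability
    U = mu + base @ L.T                                # deterministic sample generation
    J = np.array([cost_fn(u) for u in U])              # cost evaluation
    elite = U[np.argsort(J)[:n_elite]]                 # lowest-cost samples
    mu_e = elite.mean(axis=0)
    Sigma_e = np.atleast_2d(np.cov(elite, rowvar=False, bias=True))
    # Assumed convention: momentum is the weight on the previous estimate.
    mu_next = momentum * mu + (1.0 - momentum) * mu_e
    Sigma_next = momentum * Sigma + (1.0 - momentum) * Sigma_e
    return mu_next, Sigma_next

# Toy problem: quadratic cost with a known minimizer (stand-in for a rollout).
target = np.array([1.0, -2.0])
cost = lambda u: float(np.sum((u - target) ** 2))

# Stand-in for precomputed LCD samples (NOT an actual LCD set).
base = np.random.default_rng(0).standard_normal((64, 2))

mu, Sigma = np.zeros(2), np.eye(2)
J0 = cost(mu)
for _ in range(5):
    mu, Sigma = dscem_step(mu, Sigma, base, cost, n_elite=8, momentum=0.2)
J_final = cost(mu)
```

The same `base` array is reused in every iteration; only `mu` and `Sigma` change, which is what makes the sample generation step deterministic.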

Deterministic Sampling MPPI (dsMPPI)

dsMPPI integrates these deterministic samples into the path integral (MPPI) framework using exponential soft weights rather than the CEM hard-threshold elite set:

  • Sample transformation: the precomputed samples $\tilde u_i$, $i = 1, \dots, K$, are mapped into the current proposal by the runtime linear transformation
  • Trajectory simulation and cost: Each rollout is propagated and its cost is computed
  • Exponential weighting: each rollout receives a weight that decays exponentially with its cost, scaled by an inverse temperature
  • Soft averaging for parameter updates: The mean and diagonal variance are updated by exponentially weighted averages.
  • Momentum smoothing: Update the mean and variance via momentum-weighted smoothing to prevent premature collapse and improve exploration/exploitation trade-offs.
  • Adaptive temperature: The temperature parameter is adapted based on an effective sample size (ESS) metric so as to keep importance ratios well-behaved (Walker et al., 7 Jan 2026).
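A minimal sketch of the weighting and update steps is given below. The symbols and names are assumptions for illustration: `lam` denotes the inverse temperature, the weights are taken as proportional to $\exp(-\lambda J_i)$, the ESS is computed with the common definition $1/\sum_i w_i^2$ for normalized weights, and the temperature is found by bisection on a target ESS. The source does not specify the exact adaptation rule, so this is one plausible choice.

```python
import numpy as np

def softmin_weights(J, lam):
    """Normalized exponential weights; costs are shifted by the minimum for stability."""
    w = np.exp(-lam * (J - J.min()))
    return w / w.sum()

def effective_sample_size(w):
    """ESS of normalized weights: K for uniform weights, 1 for a single spike."""
    return 1.0 / np.sum(w ** 2)

def adapt_inverse_temperature(J, target_ess, lo=1e-6, hi=1e6, iters=60):
    """Bisection in log space; ESS is non-increasing in the inverse temperature."""
    for _ in range(iters):
        mid = np.sqrt(lo * hi)
        if effective_sample_size(softmin_weights(J, mid)) > target_ess:
            lo = mid          # weights too diffuse: sharpen
        else:
            hi = mid          # weights too peaked: soften
    return np.sqrt(lo * hi)

def dsmppi_update(mu, var, U, J, target_ess, momentum=0.0):
    """Soft-averaged mean and diagonal variance update with momentum smoothing."""
    lam = adapt_inverse_temperature(J, target_ess)
    w = softmin_weights(J, lam)
    mu_w = w @ U
    var_w = w @ (U - mu_w) ** 2
    # Assumed convention: momentum is the weight on the previous estimate.
    mu_next = momentum * mu + (1.0 - momentum) * mu_w
    var_next = momentum * var + (1.0 - momentum) * var_w
    return mu_next, var_next, lam, w

# Toy usage with a stand-in base set (NOT an actual LCD set) and a quadratic cost.
base = np.random.default_rng(0).standard_normal((100, 3))
mu, var = np.zeros(3), np.ones(3)
U = mu + np.sqrt(var) * base
J = np.sum((U - np.array([1.0, -2.0, 0.5])) ** 2, axis=1)
mu_next, var_next, lam, w = dsmppi_update(mu, var, U, J, target_ess=20.0, momentum=0.5)
```

Targeting a fixed ESS keeps the weights from becoming either nearly uniform or dominated by a single rollout, which is the behavior the adaptive temperature step is described as preventing.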

Both dsCEM and dsMPPI support alternative variation schemes, such as multi-iteration subsetting (drawing a subset from a larger pool across iterations) and coordinate permutation to increase effective exploration without reintroducing randomness (Walker et al., 7 Jan 2026).

4. Theoretical Insights and Computational Complexity

Deterministic sampling for MPC confers distinct theoretical benefits:

  • Variance reduction: Deterministic LCD samples uniformly tile the underlying distribution, minimizing quadrature error relative to i.i.d. sampling. For a fixed computational budget, deterministic methods achieve lower estimation variance and hence require fewer trajectories for equivalent accuracy (Walker et al., 7 Oct 2025, Walker et al., 7 Jan 2026).
  • Convergence and bias-variance tradeoff: Momentum smoothing and adaptive temperature tuning regulate the update magnitude and ensure stability. dsCEM provides exact moment matching for quadratic costs and linear dynamics (Walker et al., 7 Jan 2026).
  • Complexity: For both dsCEM and dsMPPI, the total runtime per control step is dominated by batch trajectory simulation. Deterministic sample generation and linear-algebraic updates are negligible relative to forward simulation costs (Walker et al., 7 Jan 2026, Walker et al., 7 Oct 2025).

5. Key Hyperparameters and Tuning Practices

Critical hyperparameters and empirically effective ranges for dsCEM and dsMPPI include (Walker et al., 7 Jan 2026):

| Parameter | Typical Range | Significance |
|---|---|---|
| Sample count | 50–300 | Number of deterministic samples per iteration |
| Horizon | 10–50 | Horizon length (problem dependent) |
| Inner iterations | 2–5 (usually 3) | CEM/MPPI inner iterations |
| Inverse temperature | Cost-dependent | Inverse temperature; adapt via effective sample size |
| Momentum | 0.9–0.99 | Momentum for mean/covariance updates |
| Colored-noise exponent | 0.5–1.5 | Colored-noise exponent for smoothness |
| Buffer size | 1–5 | Buffer size (tracking best rollouts) |

A frequent guideline is to start from values inside the typical ranges above and tune the inverse temperature so that the exponential weights are neither too diffuse nor collapsed, monitoring the effective sample size. For smoothing, choose the colored-noise exponent within its typical range (Walker et al., 7 Jan 2026).
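As a hedged illustration of how these ranges might be carried in code, the sketch below stores the hyperparameters in a small dataclass and flags values outside the typical ranges listed in the table. The class and field names are hypothetical, and only the inner-iteration default of 3 is taken from the table; all other values must be supplied by the user. The inverse temperature is omitted because the table gives no numeric range for it.

```python
from dataclasses import dataclass

# Typical ranges copied from the table above (inclusive bounds).
TYPICAL_RANGES = {
    "n_samples": (50, 300),
    "horizon": (10, 50),
    "n_iters": (2, 5),
    "momentum": (0.9, 0.99),
    "noise_exponent": (0.5, 1.5),
    "buffer_size": (1, 5),
}

@dataclass(frozen=True)
class DsMpcConfig:
    """Hypothetical container for dsCEM/dsMPPI hyperparameters."""
    n_samples: int          # deterministic samples per iteration
    horizon: int            # horizon length
    momentum: float         # momentum for mean/covariance updates
    noise_exponent: float   # colored-noise exponent
    buffer_size: int        # number of best rollouts tracked
    n_iters: int = 3        # inner iterations ("usually 3" per the table)

    def outside_typical_range(self):
        """Return the names of fields that fall outside the typical ranges."""
        return [name for name, (lo, hi) in TYPICAL_RANGES.items()
                if not lo <= getattr(self, name) <= hi]

cfg = DsMpcConfig(n_samples=100, horizon=20, momentum=0.95,
                  noise_exponent=1.0, buffer_size=2)
print(cfg.outside_typical_range())
```

A value outside a typical range is not necessarily wrong, since the ranges are described as empirical; the check is meant only as a prompt to double-check the setting.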

6. Empirical Performance and Comparison

Empirical evaluation on canonical nonlinear control tasks (e.g., cart-pole swing-up, truck backer-upper) demonstrates:

  • Smoother trajectories: dsMPPI yields ≈30% smoother controls (quantified by a cumulative smoothness metric, lower is better) compared to dsCEM and ≈60% compared to classic MPPI. On the cart-pole, smoothness improves from $1.12 \times 10^5$ (random MPPI) to $0.42 \times 10^5$ (dsMPPI) (Walker et al., 7 Jan 2026).
  • Sample efficiency: dsCEM achieves comparable or lower cumulative cost than iCEM at reduced sample counts; e.g., on the mountain car, dsCEM achieves 40% lower cost and 50% smoother inputs (Walker et al., 7 Oct 2025).
  • Computational parity: These deterministic-sampling methods do not incur extra asymptotic online costs relative to their random-sampling equivalents (Walker et al., 7 Jan 2026, Walker et al., 7 Oct 2025).

| Method | Cart-pole cost | Cart-pole smoothness | Truck cost | Truck smoothness |
|---|---|---|---|---|
| MPPI | 245.3 ± 12.1 | 1.12e5 ± 0.10e5 | 162.8 ± 8.4 | 0.48e5 ± 0.05e5 |
| Iterative MPPI | 198.7 ± 9.3 | 0.83e5 ± 0.08e5 | 140.2 ± 6.7 | 0.39e5 ± 0.04e5 |
| dsCEM | 185.5 ± 8.7 | 0.61e5 ± 0.05e5 | 131.9 ± 5.4 | 0.31e5 ± 0.03e5 |
| dsMPPI | 188.2 ± 9.0 | 0.42e5 ± 0.03e5 | 133.4 ± 5.9 | 0.28e5 ± 0.02e5 |

Key findings: dsMPPI and dsCEM match or outperform random-sampling variants both in control cost and input smoothness, with the largest gains in demanding low-sample regimes (Walker et al., 7 Jan 2026, Walker et al., 7 Oct 2025).
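The quoted smoothness improvements can be checked against the cart-pole column of the table with a few lines of arithmetic. The snippet below uses only the reported mean values (uncertainties are ignored), and the helper name `relative_reduction` is illustrative.

```python
# Mean cart-pole smoothness values from the table above (lower is smoother).
smoothness = {
    "MPPI": 1.12e5,
    "Iterative MPPI": 0.83e5,
    "dsCEM": 0.61e5,
    "dsMPPI": 0.42e5,
}

def relative_reduction(baseline, improved):
    """Fractional reduction of `improved` relative to `baseline`."""
    return 1.0 - improved / baseline

vs_mppi = relative_reduction(smoothness["MPPI"], smoothness["dsMPPI"])
vs_dscem = relative_reduction(smoothness["dsCEM"], smoothness["dsMPPI"])
print(f"{vs_mppi:.1%} {vs_dscem:.1%}")  # prints "62.5% 31.1%"
```

These values are consistent with the ≈60% and ≈30% figures stated for the cart-pole task; the truck columns give different ratios.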

7. Extensions, Generalizations, and Limitations

The deterministic-sampling principle is generic and can be transferred between MPPI and CEM approaches, as well as other stochastic sampling-based optimizers. LCD design can also incorporate task-specific priors (e.g. anisotropic variances, stronger colored-noise correlations) (Walker et al., 7 Oct 2025, Walker et al., 7 Jan 2026).

Known limitations include:

  • Scalability of LCD construction: The offline computation of the optimal Dirac mixture becomes challenging for high-dimensional spaces, and transforming LCD samples using non-isotropic covariances may degrade optimality.
  • Extension to other sampling schemes: In the context of direct policy optimization, deterministic sigma-point collocation achieves exact recovery of the LQR solution for linear–quadratic–Gaussian systems and reduces variance for mildly nonlinear cases (Howell et al., 2020).

A plausible implication is that direct deterministic-sampling frameworks may further benefit from adaptive Dirac-point generation and online adaptation when operating in high-dimensional or time-varying uncertainty regimes.

