Smoothed Proposals: Efficient Inference

Updated 16 April 2026
  • Smoothed proposals are techniques that integrate past and future observations using backward kernels to construct high-quality latent trajectory proposals for state-space and diffusion models.
  • They power methodologies like Metropolised FFBS, Particle Smoothing Variational Objectives, and guided proposals in continuous-discrete diffusion, each achieving improved precision and computational efficiency.
  • Empirical studies show that these methods reduce Monte Carlo variance, mitigate path degeneracy, and yield tighter variational bounds in complex, nonlinear systems.

Smoothed proposals are central to modern Monte Carlo-based smoothing methods for state-space and diffusion models. The aim in smoothing is to construct high-quality proposals for latent trajectories efficiently, using all available observations, both past and future. Smoothed proposals improve on purely filtered proposals by incorporating backward information through backward kernels, guided backward simulation, or joint proposal-target construction. The effect is lower Monte Carlo variance, less path degeneracy, and higher precision per unit of computation. Major algorithmic instances include the Metropolised Forward Filtering Backward Sampling (FFBS) framework, Particle Smoothing Variational Objectives (SVO), and guided proposal schemes in continuous-discrete diffusion smoothing, as well as their Rao–Blackwellised and hybrid extensions (Olsson et al., 2010, Moretti et al., 2019, Mider et al., 2017).

1. Theoretical Foundations of Smoothed Proposals

In hidden Markov models and diffusion processes, the smoothing distribution $\pi(x_{0:T}\mid y_{1:T})$ conditions a latent trajectory on all observations. Computation of this distribution is generally intractable outside restricted models (e.g., linear-Gaussian). Monte Carlo techniques, particularly particle methods, approximate the score or finite-dimensional marginals of this smoothing law.

Smoothed proposals are distinguished by their use of smoothing kernels, which, in contrast to filtered (forward) proposals, integrate future as well as past observations. In particle methods, this is achieved via backward kernels derived from the filtering distributions,

$$L_t(x_{t-1} \mid x_t) = \frac{f(x_t\mid x_{t-1})\;\pi_{t-1}(x_{t-1}\mid y_{1:t-1})}{\int f(x_t\mid u)\;\pi_{t-1}(u\mid y_{1:t-1})\,du}$$

where $f$ is the transition density and $\pi_{t-1}$ the filtered density up to time $t-1$ (Olsson et al., 2010). These kernels permit the construction of backward trajectories, augmenting the proposal with future-informed dependencies and thus enabling genuine smoothing.

2. Metropolised Particle Smoothing and Rao–Blackwellisation

The Metropolised FFBS scheme constructs a Markov chain Monte Carlo (MCMC) kernel using two stages:

  • Forward particle filtering: simulate and weight $N$ particles via a user-specified proposal $q_t$, with ancestors selected according to the normalized filtering weights;
  • Smoothed backward proposal: simulate backward trajectories via the empirical backward kernel

$$\ell_t^{(j)}(x_t^{(i)}) = \frac{W_{t-1}^{(j)} f(x_t^{(i)} \mid x_{t-1}^{(j)})}{\sum_{k=1}^N W_{t-1}^{(k)} f(x_t^{(i)} \mid x_{t-1}^{(k)})}$$

sampling at each $t$ from the set of particle ancestors (Olsson et al., 2010).
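The two stages above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes a hypothetical scalar model x_t = phi x_{t-1} + noise, y_t = x_t + noise, a bootstrap proposal (q_t equal to the transition), multinomial resampling at every step, and 0-based time indexing. The function names and parameter values are made up for the example.

```python
import numpy as np

def bootstrap_filter(y, n_particles, phi, sigma_x, sigma_y, rng):
    """Bootstrap particle filter for the assumed toy AR(1)-plus-noise model."""
    T = len(y)
    particles = np.empty((T, n_particles))
    weights = np.empty((T, n_particles))
    # Initial particles from the stationary law of the AR(1) state.
    x = rng.normal(0.0, sigma_x / np.sqrt(1.0 - phi**2), n_particles)
    for t in range(T):
        if t > 0:
            # Select ancestors by the normalized filtering weights, then propagate.
            anc = rng.choice(n_particles, size=n_particles, p=weights[t - 1])
            x = phi * particles[t - 1, anc] + rng.normal(0.0, sigma_x, n_particles)
        logw = -0.5 * ((y[t] - x) / sigma_y) ** 2
        w = np.exp(logw - logw.max())
        particles[t], weights[t] = x, w / w.sum()
    return particles, weights

def backward_sample(particles, weights, phi, sigma_x, rng):
    """Draw one trajectory backward using the empirical backward kernel."""
    T, n_particles = particles.shape
    path = np.empty(T)
    path[-1] = particles[-1, rng.choice(n_particles, p=weights[-1])]
    for t in range(T - 2, -1, -1):
        # Probability of ancestor j is proportional to W_t^(j) f(x_{t+1} | x_t^(j)).
        log_f = -0.5 * ((path[t + 1] - phi * particles[t]) / sigma_x) ** 2
        p = weights[t] * np.exp(log_f - log_f.max())
        path[t] = particles[t, rng.choice(n_particles, p=p / p.sum())]
    return path

rng = np.random.default_rng(0)
T, N = 25, 200
phi, sigma_x, sigma_y = 0.9, 1.0, 0.5  # hypothetical parameter values
# Simulate synthetic data from the toy model.
x_true = np.empty(T)
x_true[0] = rng.normal(0.0, sigma_x / np.sqrt(1.0 - phi**2))
for t in range(1, T):
    x_true[t] = phi * x_true[t - 1] + rng.normal(0.0, sigma_x)
y = x_true + rng.normal(0.0, sigma_y, T)

particles, weights = bootstrap_filter(y, N, phi, sigma_x, sigma_y, rng)
path = backward_sample(particles, weights, phi, sigma_x, rng)
```

Each backward step here costs O(N) per sampled trajectory, and the sampled path may combine particles that never shared a forward lineage, which is what lets it escape the collapsed genealogy of the filter.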

This proposal, embedded within a Metropolis–Hastings framework, yields an independence sampler whose acceptance ratio depends only on the marginal likelihood estimates produced by the forward filter,

$$\alpha = 1 \wedge \frac{\hat Z_T^*}{\hat Z_T}$$

where $\hat Z_T^*$ and $\hat Z_T$ are the estimates from the proposed and current filter runs, respectively. This eliminates the need to evaluate full path densities along the backward pass.
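A hedged sketch of the resulting accept/reject loop follows. It assumes the same kind of hypothetical scalar AR(1)-plus-noise model with a bootstrap filter; `propose_path` and `metropolised_smoother` are illustrative names, and the comparison is done on the log scale for numerical stability.

```python
import numpy as np

PHI, SIG_X, SIG_Y = 0.9, 1.0, 0.5  # hypothetical toy model parameters

def propose_path(y, n, rng):
    """One forward filter plus one backward-simulated path; returns (path, log Z-hat)."""
    T = len(y)
    xs, ws = np.empty((T, n)), np.empty((T, n))
    log_z = 0.0
    x = rng.normal(0.0, SIG_X / np.sqrt(1.0 - PHI**2), n)
    for t in range(T):
        if t > 0:
            anc = rng.choice(n, size=n, p=ws[t - 1])
            x = PHI * xs[t - 1, anc] + rng.normal(0.0, SIG_X, n)
        logw = -0.5 * ((y[t] - x) / SIG_Y) ** 2 - np.log(SIG_Y * np.sqrt(2.0 * np.pi))
        m = logw.max()
        w = np.exp(logw - m)
        log_z += m + np.log(w.mean())  # log of the average unnormalized weight
        xs[t], ws[t] = x, w / w.sum()
    path = np.empty(T)
    path[-1] = xs[-1, rng.choice(n, p=ws[-1])]
    for t in range(T - 2, -1, -1):
        log_f = -0.5 * ((path[t + 1] - PHI * xs[t]) / SIG_X) ** 2
        p = ws[t] * np.exp(log_f - log_f.max())
        path[t] = xs[t, rng.choice(n, p=p / p.sum())]
    return path, log_z

def metropolised_smoother(y, n, n_iter, rng):
    """Independence sampler: accept with probability min(1, Z*_hat / Z_hat)."""
    path, log_z = propose_path(y, n, rng)
    chain, accepted = [path], 0
    for _ in range(n_iter):
        new_path, new_log_z = propose_path(y, n, rng)
        if np.log(rng.uniform()) < new_log_z - log_z:
            path, log_z = new_path, new_log_z
            accepted += 1
        chain.append(path)
    return np.array(chain), accepted / n_iter

rng = np.random.default_rng(1)
T = 20
x_true = np.empty(T)
x_true[0] = rng.normal(0.0, SIG_X / np.sqrt(1.0 - PHI**2))
for t in range(1, T):
    x_true[t] = PHI * x_true[t - 1] + rng.normal(0.0, SIG_X)
y = x_true + rng.normal(0.0, SIG_Y, T)

chain, acc_rate = metropolised_smoother(y, n=50, n_iter=200, rng=rng)
smoothed_mean = chain.mean(axis=0)  # Monte Carlo estimate of E[x_t | y] for each t
```

Because every proposal comes from a fresh, independent filter run, the acceptance rate rises as the variance of the likelihood estimate falls, for instance when the number of particles is increased.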

Rao–Blackwellisation in this context refers to averaging over all possible backward trajectories (“backward smoothing”) rather than sampling a single path, yielding reduced Monte Carlo variance for estimators at a computational cost linear in […] (Olsson et al., 2010).
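For marginal estimators, one standard way to average over backward trajectories is the backward smoothing recursion on the particle weights. The sketch below is an assumption-laden illustration rather than the scheme of the cited paper: the particles and filtering weights are synthetic, `smoothing_weights` is a hypothetical name, and this naive version forms an N × N transition matrix at each time step.

```python
import numpy as np

def smoothing_weights(particles, weights, log_f):
    """Backward smoothing recursion for marginal smoothing weights.

    particles, weights: arrays of shape (T, N); weights[t] sums to one.
    log_f(x_next, x_prev): log transition density, up to an additive constant.
    """
    T, _ = particles.shape
    smooth = np.empty_like(weights)
    smooth[-1] = weights[-1]  # at the final time, smoothing equals filtering
    for t in range(T - 2, -1, -1):
        # trans[j, i] = f(x_{t+1}^(i) | x_t^(j)), unnormalized
        trans = np.exp(log_f(particles[t + 1][None, :], particles[t][:, None]))
        # denom[i] = sum_k W_t^(k) f(x_{t+1}^(i) | x_t^(k))
        denom = weights[t] @ trans
        # Average the backward kernel over all time-(t+1) particles.
        smooth[t] = weights[t] * (trans @ (smooth[t + 1] / denom))
    return smooth

rng = np.random.default_rng(2)
T, N = 10, 50
particles = rng.normal(size=(T, N))            # synthetic stand-in for filter output
weights = rng.dirichlet(np.ones(N), size=T)    # synthetic normalized filtering weights
log_f = lambda x_next, x_prev: -0.5 * (x_next - 0.9 * x_prev) ** 2  # assumed AR(1) transition

smooth = smoothing_weights(particles, weights, log_f)
smoothed_means = (smooth * particles).sum(axis=1)  # Rao-Blackwellised estimate of E[x_t | y]
```

The normalizing constant of the transition density cancels between numerator and denominator, so an unnormalized `log_f` is sufficient here.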

3. Smoothed Proposals in Variational Inference: Particle Smoothing Variational Objectives

Particle Smoothing Variational Objectives (SVO) extend the variational family to smoothed posteriors by augmenting the standard SMC proposal. SVO runs a forward SMC pass to generate a set of trajectories, then in the backward pass generates several subparticles per backward time step per trajectory from a continuous proposal, with categorical subsampling to select the backward trajectory. This constructs a smoothed proposal with augmented support (via the multiple subparticles per trajectory) (Moretti et al., 2019).

The SVO framework further introduces unbiased and biased gradient estimators with respect to inference-network parameters. It demonstrates that so-called “drop-the-resampling-gradients” (biased) estimators, or relaxed Concrete/Gumbel-Softmax surrogates, possess a signal-to-noise ratio (SNR) scaling of […] in the number of samples, in contrast to the […] scaling of unbiased estimators (Moretti et al., 2019).
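To make the relaxed surrogate concrete, here is a minimal sketch of a Gumbel-Softmax (Concrete) relaxation of a categorical resampling step. It is a generic illustration, not code from the SVO paper: the weights, particle values, and temperature are made up, and NumPy is used only to show the forward computation (an autodiff framework would be needed to obtain gradients).

```python
import numpy as np

def gumbel_softmax_sample(log_weights, temperature, rng):
    """Relaxed one-hot sample: softmax((log w + Gumbel noise) / temperature)."""
    gumbel = rng.gumbel(size=log_weights.shape)
    z = (log_weights + gumbel) / temperature
    z = z - z.max()  # stabilize the exponentials
    soft = np.exp(z)
    return soft / soft.sum()

rng = np.random.default_rng(3)
particles = np.array([-1.0, 0.5, 2.0])           # hypothetical particle values
log_weights = np.log(np.array([0.1, 0.2, 0.7]))  # hypothetical normalized weights

soft = gumbel_softmax_sample(log_weights, temperature=0.5, rng=rng)
relaxed_particle = soft @ particles  # convex combination instead of a hard pick

# The hard (non-differentiable) counterpart is the Gumbel-max trick,
# which draws an exact categorical sample with probabilities exp(log_weights).
hard_index = int(np.argmax(log_weights + rng.gumbel(size=3)))
```

As the temperature approaches zero the relaxed vector concentrates on a single index and recovers the hard sample, at the price of noisier gradients; larger temperatures give smoother but more biased surrogates.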

Empirical results demonstrate that SVO proposals, by informing the proposal with all data, produce more accurate ELBOs and tighter variational bounds in nonlinear latent-dynamics models at fixed computational cost, relative to filtered (forward-only) objectives.

4. Guided Smoothing Proposals in Continuous-Discrete Diffusion Models

In continuous-discrete SDE smoothing, smoothed proposals can be constructed via guided bridges. The Backward Filtering Forward Guiding (BFFG) algorithm uses an auxiliary linear Gaussian process to perform backward filtering, yielding tractable time-varying information functions which parameterize a “guided” forward proposal:

$$dX_t^\circ = \big[\,b(t, X_t^\circ) + a(t, X_t^\circ)\,\tilde r(t, X_t^\circ)\,\big]\,dt + \sigma(t, X_t^\circ)\,dW_t$$

where $b$ and $\sigma$ are the drift and diffusion coefficient of the target SDE, $a = \sigma\sigma^\top$, and $\tilde r(t,x) = \nabla_x \log \tilde h(t,x)$ is the backward information from the auxiliary model (Mider et al., 2017).

This forward-guided process serves as a proposal for the true smoothing law, with exactness maintained via Radon–Nikodym derivatives in an MCMC framework. The proposal integrates the backward information from all observations, resulting in efficient sampling and reduced path degeneracy in high-dimensional or non-linear settings (Mider et al., 2017).
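The simplest instance of such a guided forward proposal can be simulated with Euler–Maruyama. The sketch below makes strong simplifying assumptions that are not taken from the cited paper: a scalar SDE with constant diffusion coefficient, a single exact observation v at the terminal time T, and scaled Brownian motion as the auxiliary process, for which the guiding term reduces to (v - x)/(T - t). The drift function and all numerical values are hypothetical.

```python
import numpy as np

def simulate_guided_path(x0, v, T, n_steps, drift, sigma, rng):
    """Euler-Maruyama for dX = [drift(X) + (v - X)/(T - t)] dt + sigma dW."""
    dt = T / n_steps
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        t = k * dt
        # Guiding term from the Brownian auxiliary process: pulls the path toward v.
        guide = (v - x[k]) / (T - t)
        x[k + 1] = x[k] + (drift(x[k]) + guide) * dt + sigma * np.sqrt(dt) * rng.normal()
    return x

rng = np.random.default_rng(4)
drift = lambda x: np.sin(x)  # hypothetical nonlinear drift of the target SDE
guided = simulate_guided_path(x0=0.0, v=2.0, T=1.0, n_steps=1000,
                              drift=drift, sigma=0.5, rng=rng)
```

The simulated path lands close to the observation at the terminal time, but it is only a proposal: to target the true smoothing law it still has to be corrected by the Radon–Nikodym derivative inside an MCMC step, which this sketch does not compute.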

5. Comparative Analysis and Computational Considerations

Smoothed proposals, implemented through FFBS-based or guided approaches, systematically outperform proposal mechanisms that use only forward filtering. The advantage is largest for tasks requiring statistics of the full path, or marginal posteriors at times well before the final observation.

Algorithmic comparison:

Method | Backward Proposal Type | Complexity per Iteration | Rao–Blackwellisation
Metropolised FFBS (Olsson et al., 2010) | Empirical kernel from SMC | […] | Yes
SVO (Moretti et al., 2019) | Backward continuous + subsampling | […] | Not direct (averaging over sampled subparticles)
BFFG (Mider et al., 2017) | Guided diffusion bridge | […] | Not direct

Both Metropolised particle smoothers and SVO exhibit increased precision per CPU second as the proposal becomes more smoothed (i.e., as more backward information or smoothing trajectories are included). Backward smoothing mitigates particle path degeneracy; full Rao–Blackwellisation further reduces Monte Carlo variance at no asymptotic extra cost (Olsson et al., 2010).
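The path degeneracy that backward smoothing is meant to mitigate is easy to reproduce. The following sketch is a simplified illustration under stated assumptions: multinomial resampling at every step with equal weights (so no model is involved), and a hypothetical helper `surviving_ancestors` that traces the genealogy of the final particles back in time and counts how many distinct ancestors remain.

```python
import numpy as np

def surviving_ancestors(n_particles, n_steps, rng):
    """Distinct time-t ancestors of the final particles, for t = 0..n_steps."""
    # ancestors[t][i] is the time-t parent index of particle i at time t + 1.
    ancestors = rng.integers(0, n_particles, size=(n_steps, n_particles))
    lineage = np.arange(n_particles)
    counts = np.empty(n_steps + 1, dtype=int)
    counts[n_steps] = n_particles
    for t in range(n_steps - 1, -1, -1):
        lineage = ancestors[t][lineage]  # step every lineage one generation back
        counts[t] = np.unique(lineage).size
    return counts

rng = np.random.default_rng(5)
counts = surviving_ancestors(n_particles=100, n_steps=200, rng=rng)
```

The count can only shrink going back in time, so genealogical tracing represents early states by very few distinct particles; backward simulation or backward smoothing re-weights all N particles at every time step instead.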

In continuous-discrete diffusion, the BFFG algorithm’s use of smoothed guided proposals achieves exactness (in the continuous-time limit), and demonstrates computational efficiency over local-imputation or disconnected-bridge alternatives (Mider et al., 2017).

6. Empirical Performance, Applications, and Limitations

Empirical studies in nonlinear latent-dynamics models, including FitzHugh–Nagumo, Lorenz attractor, and neuronal voltage trace modeling, demonstrate that smoothed objectives (SVO) achieve tighter predictive […] and faster ELBO convergence relative to filtered bounds at smaller particle budgets (Moretti et al., 2019). In state-space MCMC, FFBS-based proposals produce a higher effective sample size per unit of computation, especially when genealogical degeneracy undermines filtered-only backward simulations (Olsson et al., 2010). For continuous-discrete diffusion smoothing, guided smoothed proposals enable efficient path and parameter inference in chaotic, high-dimensional, and partially observable domains (Mider et al., 2017).

The principal limitation of smoothed proposals is computational, as more sophisticated backward passes and smoothing kernels generally scale linearly with both the number of particles $N$ and the number of time steps $T$ (i.e., $O(NT)$), or, for guided bridges in SDEs, with the number of ODE steps per backward pass.

Smoothed proposals unify and extend a lineage of particle Markov chain Monte Carlo (PMCMC), SMC, and variational inference approaches. The Andrieu–Doucet–Holenstein (ADH) “Particle Independent Metropolis-Hastings” (PIMH) sampler draws its trajectory by tracing the filter's genealogy backward, and is therefore more susceptible to path degeneracy than FFBS-based smoothers. SVO builds on FIVO and AESMC, extending the variational proposal’s support by smoothing, which is fundamentally distinct from the time-factorized forward-only construction (Olsson et al., 2010, Moretti et al., 2019).

In continuous-time models, guided proposals link to Doob-transformed diffusions and Kalman-based information filtering, extending prior work to fully non-linear, non-Gaussian, and high-dimensional cases (Mider et al., 2017).

Smoothed proposals thus form a central methodological pillar in inference for state-space and diffusion models, combining theoretical rigor, practical variance reduction, and adaptability to challenging real-world systems.
