Smoothed Proposals: Efficient Inference
- Smoothed proposals are techniques that integrate past and future observations using backward kernels to construct high-quality latent trajectory proposals for state-space and diffusion models.
- They power methodologies like Metropolised FFBS, Particle Smoothing Variational Objectives, and guided proposals in continuous-discrete diffusion, each achieving improved precision and computational efficiency.
- Empirical studies show that these methods reduce Monte Carlo variance, mitigate path degeneracy, and yield tighter variational bounds in complex, nonlinear systems.
Smoothed proposals are central to modern Monte Carlo-based smoothing methods for state-space and diffusion models. In smoothing, the aim is to construct high-quality proposals for latent trajectory inference efficiently, using all available observations, both past and future. Smoothed proposals improve on purely filtered proposals by incorporating backward information through backward kernels, guided backward simulation, or joint proposal-target construction. The effect is to reduce Monte Carlo variance, mitigate path degeneracy, and improve precision per unit of computation. Major algorithmic instances include the Metropolised Forward Filtering Backward Sampling (FFBS) framework, Particle Smoothing Variational Objectives (SVO), and guided proposal schemes in continuous-discrete diffusion smoothing, as well as their Rao–Blackwellised and hybrid extensions (Olsson et al., 2010, Moretti et al., 2019, Mider et al., 2017).
1. Theoretical Foundations of Smoothed Proposals
In hidden Markov models and diffusion processes, the smoothing distribution conditions a latent trajectory on all observations. Computation of this distribution is generally intractable outside restricted models (e.g., linear-Gaussian). Monte Carlo techniques, particularly particle methods, approximate the score or finite-dimensional marginals of this smoothing law.
Smoothed proposals are distinguished by their use of smoothing kernels, which, in contrast to filtered (forward) proposals, integrate future as well as past observations. In particle methods, this is achieved via backward kernels derived from the filtering distributions,

$$
\overleftarrow{q}_t(x_t \mid x_{t+1}) = \frac{\phi_t(x_t)\, q(x_t, x_{t+1})}{\int \phi_t(x')\, q(x', x_{t+1})\, \mathrm{d}x'},
$$

where $q$ is the transition density and $\phi_t$ the filtered density up to time $t$ (Olsson et al., 2010). These kernels permit the construction of backward trajectories, augmenting the proposal with future-informed dependencies and thus enabling genuine smoothing.
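To make the backward kernel concrete, the following minimal sketch computes its particle (empirical) version: given weighted filter particles at time t and a state at time t+1, it returns the normalized probabilities of selecting each particle as the predecessor. The AR(1) transition, the uniform filter weights, and the names `backward_weights` and `log_transition` are illustrative assumptions, not part of the cited work.

```python
import numpy as np

def backward_weights(particles, log_weights, x_next, log_transition):
    """Normalized empirical backward-kernel weights over the time-t particles."""
    # Unnormalized log-weights: log w_t^i + log q(xi_t^i, x_{t+1})
    log_w = log_weights + log_transition(particles, x_next)
    log_w -= log_w.max()          # stabilize before exponentiating
    w = np.exp(log_w)
    return w / w.sum()

# Hypothetical toy transition: x_{t+1} = 0.9 x_t + N(0, 1), up to a constant
def log_transition(x, x_next):
    return -0.5 * (x_next - 0.9 * x) ** 2

rng = np.random.default_rng(0)
particles = rng.normal(size=5)
log_weights = np.full(5, -np.log(5))      # uniform filter weights (assumed)
probs = backward_weights(particles, log_weights, 1.0, log_transition)
ancestor = rng.choice(5, p=probs)         # one backward-simulation draw
```

Each call costs time linear in the number of particles, so drawing a whole backward trajectory this way is linear in both particles and time steps.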
2. Metropolised Particle Smoothing and Rao–Blackwellisation
The Metropolised FFBS scheme constructs a Markov chain Monte Carlo (MCMC) kernel using two stages:
- Forward particle filtering: simulate and weight particles via a user-specified proposal and ancestors selected by normalized filtering weights;
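- Smoothed backward proposal: simulate backward trajectories via the empirical backward kernel
  $$
  \Lambda^N_t(i, j) = \frac{\omega_t^j\, q(\xi_t^j, \xi_{t+1}^i)}{\sum_{\ell=1}^{N} \omega_t^\ell\, q(\xi_t^\ell, \xi_{t+1}^i)},
  $$
  where $q$ is the transition density, sampling at each time $t$ from the set of particle ancestors $\{\xi_t^j\}_{j=1}^{N}$ with filtering weights $\{\omega_t^j\}_{j=1}^{N}$ (Olsson et al., 2010).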
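This proposal, embedded within a Metropolis–Hastings framework, yields an independence sampler whose acceptance ratio depends only on the marginal likelihood estimates produced by the forward filter,

$$
\alpha = 1 \wedge \frac{\hat{Z}'}{\hat{Z}},
$$

where $\hat{Z}$ and $\hat{Z}'$ denote the likelihood estimates from the current and the proposed filter runs. This eliminates the need to evaluate full path densities along the backward pass.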
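The two stages and the acceptance step can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the linear-Gaussian AR(1) model, its parameters `A`, `SX`, `SY`, the bootstrap proposal, multinomial resampling, and all function names are hypothetical choices made for the example. The acceptance rule is the ratio of likelihood estimates described above.

```python
import numpy as np

A, SX, SY = 0.9, 1.0, 0.5   # hypothetical AR(1) model parameters

def log_q(x, x_next):
    """Log transition density of x_{t+1} given x_t (up to a constant)."""
    return -0.5 * ((x_next - A * x) / SX) ** 2

def log_g(x, y):
    """Log observation density of y_t given x_t (up to a constant)."""
    return -0.5 * ((y - x) / SY) ** 2

def particle_filter(y, n, rng):
    """Bootstrap filter: particles, normalized weights, log-likelihood estimate."""
    T = len(y)
    xs = np.empty((T, n))
    ws = np.empty((T, n))
    log_z = 0.0
    x = rng.normal(size=n)                          # initial law N(0, 1)
    for t in range(T):
        if t > 0:
            idx = rng.choice(n, size=n, p=ws[t - 1])    # multinomial resampling
            x = A * xs[t - 1, idx] + SX * rng.normal(size=n)
        lw = log_g(x, y[t])
        m = lw.max()
        w = np.exp(lw - m)
        log_z += m + np.log(w.mean())   # constants cancel in the MH ratio
        xs[t], ws[t] = x, w / w.sum()
    return xs, ws, log_z

def backward_sample(xs, ws, rng):
    """Draw one trajectory by simulating backward over the particle trellis."""
    T, n = xs.shape
    path = np.empty(T)
    path[-1] = xs[-1, rng.choice(n, p=ws[-1])]
    for t in range(T - 2, -1, -1):
        with np.errstate(divide="ignore"):      # zero weights map to -inf
            lw = np.log(ws[t]) + log_q(xs[t], path[t + 1])
        w = np.exp(lw - lw.max())
        path[t] = xs[t, rng.choice(n, p=w / w.sum())]
    return path

def metropolised_ffbs(y, n, n_iter, rng):
    """Independence sampler: fresh filter + backward draw, accepted via Z'/Z."""
    xs, ws, log_z = particle_filter(y, n, rng)
    path = backward_sample(xs, ws, rng)
    chain, accepted = [path], 0
    for _ in range(n_iter):
        xs, ws, log_z_new = particle_filter(y, n, rng)
        proposal = backward_sample(xs, ws, rng)
        if np.log(rng.uniform()) < log_z_new - log_z:   # accept w.p. min(1, Z'/Z)
            path, log_z = proposal, log_z_new
            accepted += 1
        chain.append(path)
    return np.array(chain), accepted / n_iter

# Simulate data from the same toy model and run the sampler
rng = np.random.default_rng(1)
T = 20
x_true = np.empty(T)
x_true[0] = rng.normal()
for t in range(1, T):
    x_true[t] = A * x_true[t - 1] + SX * rng.normal()
y = x_true + SY * rng.normal(size=T)
chain, acc_rate = metropolised_ffbs(y, n=100, n_iter=200, rng=rng)
smoothed_mean = chain.mean(axis=0)
```

Averaging the chain over iterations gives an estimate of the smoothed mean trajectory. Because the observation density is evaluated only up to a constant, `log_z` is an unnormalized log-likelihood estimate, which is sufficient here since the constant cancels in the ratio.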
Rao–Blackwellisation in this context refers to averaging over all possible backward trajectories (“backward smoothing”) rather than sampling a single path. This reduces the Monte Carlo variance of estimators at a computational cost that scales linearly with the number of particles and time steps (Olsson et al., 2010).
3. Smoothed Proposals in Variational Inference: Particle Smoothing Variational Objectives
Particle Smoothing Variational Objectives (SVO) extend the variational family to smoothed posteriors by augmenting the standard SMC proposal. SVO runs the forward SMC to generate a set of particle trajectories. In the backward pass, it then generates several subparticles per backward time step per trajectory from a continuous proposal, with categorical subsampling to select the backward trajectory. This constructs a smoothed proposal whose support is augmented by the subparticles drawn for each trajectory (Moretti et al., 2019).
The SVO framework further introduces unbiased and biased gradient estimators with respect to inference-network parameters. It shows that the biased estimators, obtained either by dropping the resampling gradients or by using relaxed Concrete/Gumbel-Softmax surrogates, have a signal-to-noise ratio (SNR) that scales more favorably with the number of samples than that of the unbiased estimators (Moretti et al., 2019).
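As an illustration of the relaxed surrogate mentioned above, the sketch below draws a Gumbel-Softmax (Concrete) relaxation of a categorical resampling step from log-weights. It is a generic NumPy illustration of the relaxation itself, not the SVO estimator; the function name, the temperature values, and the example weights are assumptions. In practice the same computation would be written in an automatic-differentiation framework so that gradients can flow through the relaxed sample.

```python
import numpy as np

def gumbel_softmax(log_weights, temperature, rng):
    """Relaxed one-hot sample from Categorical(softmax(log_weights))."""
    u = rng.uniform(low=1e-12, high=1.0, size=log_weights.shape)
    gumbel = -np.log(-np.log(u))              # standard Gumbel(0, 1) noise
    z = (log_weights + gumbel) / temperature
    z -= z.max()                              # stabilize the softmax
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
log_w = np.log(np.array([0.1, 0.2, 0.7]))     # hypothetical particle weights
soft = gumbel_softmax(log_w, temperature=0.5, rng=rng)    # smooth point on the simplex
hard = gumbel_softmax(log_w, temperature=1e-3, rng=rng)   # typically near one-hot
```

Lower temperatures bring the sample closer to a discrete ancestor index, while higher temperatures give smoother samples. Choosing the temperature is a tuning decision that this sketch does not address.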
Empirical results indicate that SVO proposals, by conditioning on all of the data, yield tighter variational bounds (ELBOs) in nonlinear latent-dynamics models at fixed computational cost, relative to filtered (forward-only) objectives.
4. Guided Smoothing Proposals in Continuous-Discrete Diffusion Models
In continuous-discrete SDE smoothing, smoothed proposals can be constructed via guided bridges. The Backward Filtering Forward Guiding (BFFG) algorithm uses an auxiliary linear Gaussian process to perform backward filtering, yielding tractable time-varying information functions which parameterize a “guided” forward proposal:
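$$
\mathrm{d}X^\circ_t = \left[ b(t, X^\circ_t) + a(t, X^\circ_t)\, \tilde{r}(t, X^\circ_t) \right] \mathrm{d}t + \sigma(t, X^\circ_t)\, \mathrm{d}W_t,
$$

where $b$ and $\sigma$ are the drift and diffusion coefficients of the target SDE, $a = \sigma\sigma^{\top}$, and $\tilde{r}(t, x) = \nabla_x \log \tilde{\rho}(t, x)$ is the backward information from the auxiliary model (Mider et al., 2017).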
This forward-guided process serves as a proposal for the true smoothing law, with exactness maintained via Radon–Nikodym derivatives in an MCMC framework. The proposal integrates the backward information from all observations, resulting in efficient sampling and reduced path degeneracy in high-dimensional or non-linear settings (Mider et al., 2017).
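A minimal one-dimensional sketch of such a guided proposal follows, using Euler–Maruyama discretization. It makes strong simplifying assumptions that are not taken from the cited work: a single Gaussian-noise observation `v` at the final time `T`, a constant diffusion coefficient `sigma`, and a driftless auxiliary process with the same diffusion coefficient, for which the guiding term has the closed form $\tilde{r}(t, x) = (v - x) / (\sigma^2 (T - t) + s^2)$ with observation standard deviation $s$. The Radon–Nikodym weight needed for exactness within MCMC is omitted.

```python
import numpy as np

def simulate_guided(x0, v, b, sigma, obs_sd, T, n_steps, rng):
    """Euler-Maruyama path of dX = [b(X) + a * r(t, X)] dt + sigma dW."""
    dt = T / n_steps
    a = sigma ** 2
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        t = k * dt
        # Guiding term from the auxiliary process dX~ = sigma dW, observed at
        # time T with Gaussian noise: r = (v - x) / (a (T - t) + obs_sd^2)
        r = (v - x[k]) / (a * (T - t) + obs_sd ** 2)
        drift = b(x[k]) + a * r
        x[k + 1] = x[k] + drift * dt + sigma * np.sqrt(dt) * rng.normal()
    return x

# Hypothetical nonlinear drift b(x) = -sin(x), with observation v = 2 at T = 1
rng = np.random.default_rng(0)
paths = np.array([
    simulate_guided(x0=0.0, v=2.0, b=lambda x: -np.sin(x), sigma=0.5,
                    obs_sd=0.1, T=1.0, n_steps=500, rng=rng)
    for _ in range(100)
])
endpoint_mean = paths[:, -1].mean()
```

The guiding term pulls every path toward the observation as the final time approaches, which an unguided forward simulation would rarely achieve by chance. With several observations, the closed-form term would be replaced by the output of a backward filtering pass over the auxiliary model.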
5. Comparative Analysis and Computational Considerations
Smoothed proposals, implemented through FFBS-based or guided approaches, systematically outperform proposal mechanisms that use only forward filtering. The advantage is largest for tasks that require statistics of the full path or marginal posteriors at times well before the final observation.
Algorithmic comparison:
| Method | Backward Proposal Type | Complexity per Iteration | Rao–Blackwellisation |
|---|---|---|---|
| Metropolised FFBS (Olsson et al., 2010) | Empirical kernel from SMC | Linear in particles and time steps | Yes |
| SVO (Moretti et al., 2019) | Backward continuous + subsampling | Linear in particles, subparticles, and time steps | Not direct (averaging over sampled subparticles) |
| BFFG (Mider et al., 2017) | Guided diffusion bridge | Linear in ODE steps per backward pass | Not direct |
Both Metropolised particle smoothers and SVO exhibit increased precision per CPU second as the proposal becomes more smoothed (i.e., as more backward information or smoothing trajectories are included). Backward smoothing mitigates particle path degeneracy; full Rao–Blackwellisation further reduces Monte Carlo variance at no asymptotic extra cost (Olsson et al., 2010).
In continuous-discrete diffusion, the BFFG algorithm’s use of smoothed guided proposals achieves exactness (in the continuous-time limit), and demonstrates computational efficiency over local-imputation or disconnected-bridge alternatives (Mider et al., 2017).
6. Empirical Performance, Applications, and Limitations
Empirical studies in nonlinear latent-dynamics models, including FitzHugh–Nagumo, the Lorenz attractor, and neuronal voltage trace modeling, demonstrate that smoothed objectives (SVO) achieve better predictive performance and faster ELBO convergence than filtered bounds at smaller particle budgets (Moretti et al., 2019). In state-space MCMC, FFBS-based proposals produce a higher effective sample size per unit of computation, especially when genealogical degeneracy undermines filtered-only backward simulations (Olsson et al., 2010). For continuous-discrete diffusion smoothing, guided smoothed proposals enable efficient path and parameter inference in chaotic, high-dimensional, and partially observed settings (Mider et al., 2017).
The principal limitation of smoothed proposals is computational: more sophisticated backward passes and smoothing kernels generally scale linearly with both the number of particles and the number of time steps ($O(NT)$ for $N$ particles and $T$ time steps), or, for guided bridges in SDEs, with the number of ODE steps per backward pass.
7. Connections to Related Work
Smoothed proposals unify and extend a lineage of particle Markov chain Monte Carlo (PMCMC), SMC, and variational inference approaches. The Andrieu–Doucet–Holenstein (ADH) “Particle Independent Metropolis-Hastings” (PIMH) sampler draws trajectories by tracing ancestral lineages back through the filter's genealogy, and is therefore more susceptible to path degeneracy than FFBS-based smoothers. SVO builds on FIVO and AESMC, extending the support of the variational proposal through smoothing, which is fundamentally distinct from the time-factorized, forward-only construction (Olsson et al., 2010, Moretti et al., 2019).
In continuous-time models, guided proposals link to Doob-transformed diffusions and Kalman-based information filtering, extending prior work to fully non-linear, non-Gaussian, and high-dimensional cases (Mider et al., 2017).
Smoothed proposals thus form a central methodological pillar in inference for state-space and diffusion models, combining theoretical rigor, practical variance reduction, and adaptability to challenging real-world systems.