Sequential Markov Chain Monte Carlo
- Sequential MCMC is a family of Bayesian sampling methods that propagates unweighted samples through evolving posterior distributions using adaptive MCMC kernels.
- It addresses challenges in high-dimensional and online inference by avoiding weight degeneracy inherent in particle-based methods.
- The approach supports parallel implementations and data partitioning, making it effective for real-time state-space filtering and complex dynamical models.
Sequential Markov Chain Monte Carlo (Sequential MCMC, SMCMC) encompasses a family of Bayesian computational methods for sampling from sequences of probability distributions, typically arising in time-evolving models, online Bayesian inference, high-dimensional filtering, or problems where the state/parameter space grows or data arrive sequentially. The unifying principle is the propagation of a population of samples through a series of target distributions, using MCMC transitions that adapt as new data become available. SMCMC methods are an alternative to particle-based Sequential Monte Carlo (SMC) and particle MCMC, with distinctive theoretical and computational properties, particularly in high dimensions and for online or streaming applications.
1. Sequential MCMC Fundamentals and Algorithmic Framework
Sequential MCMC methods address the task of sampling from a sequence of posterior distributions over time as new data arrive. At each step, the target posterior may increase in dimension (e.g., when latent variables or parameters are augmented), and direct sampling is often infeasible due to high dimensionality or complex dependencies.
The standard construction is as follows (Yang et al., 2013):
- State augmentation: At time $t$, represent the parameter vector as $\theta_t = (\theta_{t-1}, \eta_t)$, where $\eta_t$ may include new latent variables or data-augmentation terms.
- Jumping (birth) kernel $J_t$: Propose $\eta_t$ given $\theta_{t-1}$, typically from the full conditional under $\pi_t$, or from a well-chosen proposal.
- MCMC kernel $K_t$: Apply a Markov transition kernel leaving $\pi_t$ invariant (Metropolis–Hastings, Gibbs, etc.), potentially executed $m_t$ times to ensure mixing.
- Adaptive inner loop: The number of within-time MCMC iterations $m_t$ is adjusted based on empirical mixing, usually via the maximum autocorrelation across chains or components, ensuring that samples are approximately independent draws from $\pi_t$ before advancing.
At each time point, $N$ parallel chains may be run to increase sample diversity and computational throughput; a minimal sketch of this loop follows below. The final empirical sample at time $t$ approximates $\pi_t$ and is passed forward for the next round, ensuring online adaptation as data accumulate.
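To make the loop concrete, the following is a minimal Python sketch of one SMCMC time step, assuming a generic log-target for $\pi_t$, a user-supplied jumping kernel, a Gaussian random-walk Metropolis within-time kernel, and a lag-1 autocorrelation stopping rule; the function names (`smcmc_step`, `lag1_autocorr`), the step size, and the mixing heuristic are illustrative choices rather than the construction of any particular reference.

```python
import numpy as np

def lag1_autocorr(trace):
    """Lag-1 autocorrelation of a 1-D trace; returns 1.0 for short or constant traces."""
    x = np.asarray(trace, dtype=float)
    if len(x) < 3 or np.var(x) == 0.0:
        return 1.0
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))

def smcmc_step(chains, log_target_t, jump, rng, step=0.2, max_inner=500, acf_tol=0.1):
    """One SMCMC time step: (1) augment every chain via the jumping (birth) kernel,
    (2) apply random-walk Metropolis moves leaving pi_t invariant, (3) stop the inner
    loop once the worst lag-1 autocorrelation of the per-chain log-target traces
    drops below acf_tol (a simple mixing heuristic)."""
    chains = np.array([jump(theta, rng) for theta in chains])      # jumping kernel
    logp = np.array([log_target_t(theta) for theta in chains])
    traces = [[lp] for lp in logp]                                 # per-chain mixing diagnostics
    for _ in range(max_inner):
        prop = chains + step * rng.standard_normal(chains.shape)   # symmetric RW proposal
        logp_prop = np.array([log_target_t(theta) for theta in prop])
        accept = np.log(rng.uniform(size=len(chains))) < logp_prop - logp
        chains[accept] = prop[accept]
        logp[accept] = logp_prop[accept]
        for tr, lp in zip(traces, logp):
            tr.append(lp)
        if max(lag1_autocorr(tr) for tr in traces) < acf_tol:      # adaptive inner loop
            break
    return chains

# Toy usage: a Gaussian target whose mean drifts as new data arrive.
rng = np.random.default_rng(0)
chains = rng.standard_normal((64, 1))                              # 64 parallel chains, d = 1
for t, mean_t in enumerate([0.0, 1.0, 2.0]):
    log_target_t = lambda theta, m=mean_t: -0.5 * float(np.sum((theta - m) ** 2))
    identity_jump = lambda theta, rng: theta                       # no dimension growth in this toy
    chains = smcmc_step(chains, log_target_t, identity_jump, rng)
    print(t, round(float(chains.mean()), 2))
```

In practice the within-time kernel, the jumping kernel, and the mixing diagnostic (e.g., ESS or cross-chain statistics) would be tailored to the model; the sketch only fixes the overall propagate-then-mix structure.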
2. Theoretical Properties and Convergence Guarantees
Sequential MCMC admits rigorous convergence analysis under relatively mild conditions on the mixing of the within-time kernels and the gradual evolution of the stationary distributions. Key results include (Yang et al., 2013, Brockwell et al., 2012, Caprio et al., 2023):
- Marginal convergence: The error in approximating $\pi_t$ by the empirical SMCMC ensemble decays to zero as the number of chains and/or within-time iterations increases, provided the MCMC kernels are (at least uniformly or geometrically) ergodic and the change in stationary distribution is controlled.
- Growing dimension: When the parameter dimension grows (nonparametric models, latent variable augmentation), convergence holds assuming both the jumping kernel and the MCMC kernel are suitably mixing and the proposals for newly introduced components match their conditional posteriors well.
- Central Limit Theorems: For bounded test functions $\varphi$, the scaled error $\sqrt{N}\,\big(\hat{\pi}_t^N(\varphi) - \pi_t(\varphi)\big)$ converges in law to a normal distribution as $N \to \infty$, with variance contributions from all past steps (Caprio et al., 2023); a schematic statement is displayed below the list.
- Iterative improvement: In methods such as SIMCMC (Brockwell et al., 2012), repeated updates allow refinement of estimates over time, and error bounds for empirical expectations that decay in the number of iterations can be established.
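Schematically, writing $\hat{\pi}_t^N(\varphi) = N^{-1}\sum_{i=1}^{N}\varphi(\theta_t^{(i)})$ for the ensemble average over the $N$ chains, the CLT can be summarized as follows; the additive decomposition of the asymptotic variance into per-step terms $v_s(\varphi)$ is indicative notation here, with the precise expression given in (Caprio et al., 2023).

```latex
\sqrt{N}\,\Big(\hat{\pi}_t^{N}(\varphi) - \pi_t(\varphi)\Big)
  \;\xrightarrow[N \to \infty]{d}\;
  \mathcal{N}\!\big(0,\,\sigma_t^{2}(\varphi)\big),
  \qquad
  \sigma_t^{2}(\varphi) \;=\; \sum_{s=0}^{t} v_s(\varphi),
```

where $v_s(\varphi)$ stands for the variance contribution from the jump and MCMC moves performed at step $s$.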
3. Comparison with SMC, Particle Methods, and Parallel Implementations
Key distinguishing features of Sequential MCMC methods include:
- Avoidance of weight degeneracy: Unlike SMC, where a resample-move scheme is needed to avoid vanishing particle weights, SMCMC only propagates populations of unweighted samples, which bypasses the curse of dimensionality in weight-based approaches (Yang et al., 2013).
- Parallelizability: The population-based structure and independence of chains in SMCMC make it amenable to distributed or embarrassingly parallel computation, leveraged in EP-SMCMC for massive time-series panels or those with block structure (Casarin et al., 2015, Freitas et al., 2015).
- Divide-and-conquer strategies: For large datasets, or in high dimensions, SMCMC can be combined with data partitioning across subsets (nodes or blocks), coupled with expectation-propagation (EP) style moment matching to reconstruct the global posterior (Freitas et al., 2015); a minimal moment-matching sketch follows the comparison table below.
- Use of surrogate models and multifidelity filtering: For extremely costly forward models, SMCMC can be built atop a fidelity hierarchy, using coarse approximations early and finer models as necessary, guided by information-theoretic criteria (Catanach et al., 2020).
A summary contrast:
| Feature | SMC | SMCMC | EP-SMCMC/Parallel SMCMC |
|---|---|---|---|
| Particle degeneracy | Present | Avoided | Avoided/mitigated |
| Scaling in high dimension | Poor (weights) | Better (population) | Near-linear (w. independence) |
| Data partitions possible | Indirect/rare | Natural (and efficient) | Built-in |
| Adaptive inner-loop | Not applicable | Mixes per ESS/autocorr. | Parallel and adaptive |
| Incorporation of surrogate/fidelity | Limited | Hierarchical possible | Multi-fidelity supported |
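As a minimal illustration of the divide-and-conquer combination step, the sketch below fuses per-block Gaussian approximations of subposteriors by precision-weighted moment matching, assuming the prior has been split fractionally across blocks so that the full posterior is proportional to the product of the subposteriors; the function name and the Gaussian assumption are illustrative simplifications of the EP-style schemes cited above.

```python
import numpy as np

def combine_gaussian_subposteriors(means, covs):
    """Moment-matched combination of per-block Gaussian subposterior approximations
    N(mu_j, Sigma_j) into one global Gaussian, assuming the full posterior is
    proportional to the product of the subposteriors."""
    precisions = [np.linalg.inv(S) for S in covs]
    lam = sum(precisions)                                  # global precision
    eta = sum(P @ m for P, m in zip(precisions, means))    # global precision-mean
    cov = np.linalg.inv(lam)
    return cov @ eta, cov

# Toy usage: two data blocks, each summarized by the empirical moments of its
# (hypothetical) SMCMC sample.
rng = np.random.default_rng(1)
block1 = rng.multivariate_normal([0.8, -0.2], 0.05 * np.eye(2), size=2000)
block2 = rng.multivariate_normal([1.2, 0.1], 0.04 * np.eye(2), size=2000)
means = [b.mean(axis=0) for b in (block1, block2)]
covs = [np.cov(b, rowvar=False) for b in (block1, block2)]
mu, Sigma = combine_gaussian_subposteriors(means, covs)
print(mu)      # fused posterior mean
print(Sigma)   # fused posterior covariance
```

Practical EP-SMCMC schemes iterate such moment-matching updates and may use non-Gaussian site approximations; the Gaussian product rule is only the simplest instance of the idea.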
4. Algorithmic Extensions and Practical Implementations
Significant variants and extensions of Sequential MCMC include:
- Manifold SMCMC for degenerate/low-noise filters: Filtering with low or zero observation noise is handled via restriction to (possibly nonlinear) submanifolds, with manifold MCMC kernels on the constrained support (Zhumekenov et al., 7 Nov 2025).
- Sequential MCMC for high-dimensional spatial filtering: For data assimilation with unknown or moving observation locations, SMCMC maintains joint distributions over system state and data location, preserving consistency and improving over ensemble Kalman filters for non-Gaussian and high-dimensional regimes (Ruzayqat et al., 2023).
- Subsampling and data-parallel variants: For massive datasets, AS-SMCMC employs adaptive likelihood subsampling with probabilistic bounds on errors, while divide-and-conquer schemes use blockwise Markov chains, later merged using expectation propagation (Freitas et al., 2015).
- Hybrid/extended space constructions: Model-based variants (e.g., in state-space models) integrate particle/SMC steps with within-step MCMC moves, or extended state spaces with local refinements, yielding hybrid PMCMC-SMCMC samplers (Carter et al., 2014).
- Sequential-proposal and delayed-rejection SMCMC: Advanced proposal mechanisms, including Hamiltonian/NUTS-inspired and delayed-rejection kernels, fit naturally within sequential MCMC principles (Park et al., 2019); a minimal delayed-rejection kernel is sketched below.
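The following is a minimal sketch of a classical two-stage delayed-rejection Metropolis–Hastings kernel (in the Tierney–Mira sense) of the kind that can serve as a within-time kernel: a bold first-stage proposal followed, upon rejection, by a more conservative second stage with the correction factor that preserves reversibility. The proposal scales, the state-independent symmetric second-stage proposal, and the function names are illustrative assumptions, not the specific kernels of (Park et al., 2019).

```python
import numpy as np

def log_q(a, b, sigma):
    """Log-density of the symmetric Gaussian proposal N(b; a, sigma^2 I)."""
    d = b - a
    return -0.5 * np.dot(d, d) / sigma**2 - 0.5 * len(d) * np.log(2 * np.pi * sigma**2)

def dr_mh_step(x, log_target, rng, sigma1=1.0, sigma2=0.2):
    """One two-stage delayed-rejection MH move with symmetric Gaussian proposals."""
    lp_x = log_target(x)

    # Stage 1: bold proposal.
    y1 = x + sigma1 * rng.standard_normal(x.shape)
    lp_y1 = log_target(y1)
    a1 = np.exp(min(0.0, lp_y1 - lp_x))                  # alpha_1(x, y1)
    if rng.uniform() < a1:
        return y1

    # Stage 2: conservative proposal, with the delayed-rejection correction
    # that keeps the overall kernel reversible with respect to the target.
    y2 = x + sigma2 * rng.standard_normal(x.shape)
    lp_y2 = log_target(y2)
    a1_rev = np.exp(min(0.0, lp_y1 - lp_y2))             # alpha_1(y2, y1)
    if a1_rev >= 1.0:
        return x                                         # second-stage acceptance prob. is zero
    log_num = lp_y2 + log_q(y2, y1, sigma1) + np.log1p(-a1_rev)
    log_den = lp_x + log_q(x, y1, sigma1) + np.log1p(-a1)
    if np.log(rng.uniform()) < log_num - log_den:
        return y2
    return x

# Toy usage: as a within-time kernel for a standard normal target in 3 dimensions.
rng = np.random.default_rng(2)
x = np.zeros(3)
log_target = lambda theta: -0.5 * float(np.sum(theta**2))
for _ in range(1000):
    x = dr_mh_step(x, log_target, rng)
print(x)
```

The second-stage proposal here depends only on the current state and is symmetric, so its forward and reverse densities cancel in the acceptance ratio; state-dependent second stages require the additional proposal terms of the general delayed-rejection formula.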
5. Empirical Performance and Scaling Characteristics
SMCMC methods have demonstrated strong empirical performance, particularly in scenarios where SMC degenerates or where online recomputation is essential:
- High-dimensional linear and nonlinear state-space models (Ruzayqat et al., 2023, Zhumekenov et al., 7 Nov 2025): SMCMC attains accuracy comparable to, or exceeding, ensemble Kalman filters (ETKF, ESTKF) and SMC, at 2–8× lower CPU time.
- Massive panel time series (Casarin et al., 2015): Embarrassingly parallel SMCMC achieves MSE reductions (e.g., 2–4× lower for key parameters) and up to 80× wall-clock speedups over serial MCMC or particle filtering on 40-core machines.
- Manifold and degenerate-noise filtering (Zhumekenov et al., 7 Nov 2025): SMCMC constructions on Riemannian submanifolds yield acceptance rates of 20–23% even in 100-dimensional SPDEs, with robust effective sample sizes.
The computational complexity is dominated by the cost of the local MCMC kernels and any required evaluations of forward models or observation operators; communication cost is minimal in parallel/EP settings. For manifold-based SMCMC in filtering, the computational cost per step is $O(N\,m_t\,c_{\text{move}})$, where $c_{\text{move}}$ is the cost of one manifold-projected move.
6. Connections to Related Methodologies
SMCMC occupies a central position within Bayesian computation for sequential/posterior updating:
- Relation to SMC and Particle MCMC: SMCMC differs from SMC by maintaining only unweighted samples and applying full mixing at each time, thereby greatly reducing the risk of path degeneracy. Particle MCMC (and variants such as iPMCMC) combine SMC with MCMC moves, but SMCMC is typically more scalable when sufficient mixing is achievable.
- Interaction with EP, distributed, and multi-fidelity models: EP-SMCMC leverages independence across data blocks and reconstructs the global posterior by moment matching and subposterior product approximations; multifidelity SMCMC (Catanach et al., 2020) extends these ideas to model hierarchies, governed by information-theoretic control of surrogate error.
- Delayed-acceptance and sequential-proposal MCMC: Multi-stage or multi-proposal schemes (e.g., DR-MH, XCGHMC, sequential-proposal BPS) generalize to sequential contexts, improving mixing and acceptance for challenging posteriors (Park et al., 2019).
7. Outlook and Future Directions
Current research in Sequential MCMC is focused on:
- Theory of adaptive sequential kernels: Recent advances in "MCMC Calculus" (Caprio et al., 2023) provide tools for rigorously studying the sensitivity of invariant measures, delivering quantitative mean-value inequalities and CLT results under inhomogeneous, sequential MCMC schemes.
- SMCMC for nonparametric and infinite-dimensional settings: SMCMC accommodates growing parameter dimensions, under mild regularity, enabling scalable Bayesian nonparametrics (Yang et al., 2013).
- Scalability and parallelism: Embarrassingly parallel algorithms and block-wise distributed implementations align with current hardware trends, with continued development in efficient communication, merging of subposterior samples, and online diagnostics for tuning.
- Manifold and geometry-aware variants: SMCMC extensions to filtering on non-Euclidean subspaces, as required in degenerate-noise observation models, offer robust alternatives to conventional methods in state estimation, control, and complex dynamical systems (Zhumekenov et al., 7 Nov 2025).
Sequential MCMC thus provides a versatile, theoretically sound, and computationally scalable framework for time-evolving Bayesian inference, with broad applicability across high-dimensional statistics, data assimilation, systems biology, and online machine learning (Yang et al., 2013, Zhumekenov et al., 7 Nov 2025, Ruzayqat et al., 2023, Casarin et al., 2015, Catanach et al., 2020).