
Particle-MALA and Particle-mGRAD: Gradient-based MCMC methods for high-dimensional state-space models (2401.14868v1)

Published 26 Jan 2024 in stat.CO and stat.ML

Abstract: State-of-the-art methods for Bayesian inference in state-space models are (a) conditional sequential Monte Carlo (CSMC) algorithms; (b) sophisticated 'classical' MCMC algorithms like MALA, or mGRAD from Titsias and Papaspiliopoulos (2018, arXiv:1610.09641v3 [stat.ML]). The former propose $N$ particles at each time step to exploit the model's 'decorrelation-over-time' property and thus scale favourably with the time horizon, $T$ , but break down if the dimension of the latent states, $D$, is large. The latter leverage gradient-/prior-informed local proposals to scale favourably with $D$ but exhibit sub-optimal scalability with $T$ due to a lack of model-structure exploitation. We introduce methods which combine the strengths of both approaches. The first, Particle-MALA, spreads $N$ particles locally around the current state using gradient information, thus extending MALA to $T > 1$ time steps and $N > 1$ proposals. The second, Particle-mGRAD, additionally incorporates (conditionally) Gaussian prior dynamics into the proposal, thus extending the mGRAD algorithm to $T > 1$ time steps and $N > 1$ proposals. We prove that Particle-mGRAD interpolates between CSMC and Particle-MALA, resolving the 'tuning problem' of choosing between CSMC (superior for highly informative prior dynamics) and Particle-MALA (superior for weakly informative prior dynamics). We similarly extend other 'classical' MCMC approaches like auxiliary MALA, aGRAD, and preconditioned Crank-Nicolson-Langevin (PCNL) to $T > 1$ time steps and $N > 1$ proposals. In experiments, for both highly and weakly informative prior dynamics, our methods substantially improve upon both CSMC and sophisticated 'classical' MCMC approaches.

Summary

  • The paper integrates CSMC with gradient-informed MCMC proposals to enhance scalability across large time horizons and high-dimensional latent states.
  • It introduces Particle-MALA and Particle-mGRAD methods, including twisted variants that dynamically adjust proposal distributions for improved inference.
  • Numerical experiments on stochastic volatility models demonstrate higher effective sample sizes and computational efficiency, paving the way for future innovations.

Summary of Particle-MALA and Particle-mGRAD: Gradient-based MCMC methods for high-dimensional state-space models

The paper introduces a family of novel methods for Bayesian inference in high-dimensional state-space models, addressing limitations of existing Markov chain Monte Carlo (MCMC) methods. It builds on two prominent approaches: conditional sequential Monte Carlo (CSMC) and 'classical' MCMC algorithms such as the Metropolis-adjusted Langevin algorithm (MALA) and the marginal gradient-based sampler (mGRAD) of Titsias and Papaspiliopoulos (2018). The proposed methodology combines the strengths of both to handle large time horizons and high-dimensional latent states efficiently.
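To fix ideas, the inference target is the smoothing distribution $p(x_{1:T} \mid y_{1:T})$ of a state-space model. The sketch below sets up a toy linear-Gaussian model; the model, dimensions, and parameter values are illustrative assumptions, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
T, D = 50, 10            # time horizon and latent-state dimension (illustrative)
A = 0.9 * np.eye(D)      # hypothetical linear transition matrix
sigma_x, sigma_y = 1.0, 0.5

# Simulate latent states x_{1:T} and noisy observations y_{1:T}.
x = np.zeros((T, D))
x[0] = sigma_x * rng.normal(size=D)
for t in range(1, T):
    x[t] = A @ x[t - 1] + sigma_x * rng.normal(size=D)
y = x + sigma_y * rng.normal(size=(T, D))

def log_target(z):
    """Unnormalised log-posterior log p(x_{1:T} | y_{1:T}) of this toy model."""
    lp = -0.5 * np.sum(z[0] ** 2) / sigma_x ** 2
    diffs = z[1:] - z[:-1] @ A.T          # prior transition residuals
    lp -= 0.5 * np.sum(diffs ** 2) / sigma_x ** 2
    lp -= 0.5 * np.sum((y - z) ** 2) / sigma_y ** 2
    return lp
```

The 'decorrelation-over-time' structure that CSMC exploits is visible in `log_target`: each `z[t]` interacts only with its neighbours `z[t-1]` and `z[t+1]` through the transition residuals.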

Key Contributions

  1. Combination of CSMC and MCMC: The research leverages the decorrelation-over-time property exploited by CSMC together with gradient-informed proposals from MCMC approaches like MALA and mGRAD. This combination allows the proposed approaches (Particle-MALA, Particle-mGRAD, and their variants) to scale favourably with both the number of time steps, $T$, and the dimension of the latent states, $D$.
  2. Introduction of Novel Algorithms: Several algorithms are introduced:
    • Particle-MALA: Extends MALA to $T > 1$ time steps and multiple proposals, $N > 1$.
    • Particle-mGRAD: Leverages conditionally Gaussian dynamics and gradient information to interpolate between CSMC and Particle-MALA.
    • Twisted Variants: The twisted versions use auxiliary variables to incorporate information from future time steps, adjusting the proposal distributions dynamically.
  3. Interpolation and Flexibility: One of the strong points of these methods is their interpolation capability. Particle-mGRAD is theoretically shown to dynamically adjust between the extremes of CSMC and Particle-MALA based on the informativeness of the model's prior dynamics.
  4. Algorithm Validity & Efficiency: The paper carefully establishes the validity of the proposed methods via auxiliary and marginal algorithms, ensuring that any Markov kernel crafted using these methods maintains the required invariance properties.
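The core multi-proposal mechanism can be caricatured at a single step: generate $N$ candidates from a Langevin (gradient-informed) proposal centred near the current state, then select among candidates and current state by weight. This is a heavily simplified toy sketch of that mechanism only; the actual Particle-MALA algorithm uses an auxiliary-variable construction and CSMC-style coupling across the $T$ time steps to obtain an exactly invariant kernel, which this illustration does not reproduce:

```python
import numpy as np

def multi_proposal_mala_step(x, log_target, grad_log_target, step, N, rng):
    """One toy multi-proposal Langevin step (illustrative sketch only)."""
    D = x.shape[0]
    # MALA-style proposal mean: drift half a step along the gradient.
    mean = x + 0.5 * step * grad_log_target(x)
    # Spread N candidates locally around the current state; keep x itself.
    cands = np.vstack([x, mean + np.sqrt(step) * rng.normal(size=(N, D))])
    # Weight candidates by target density over proposal density.
    log_q = -0.5 * np.sum((cands - mean) ** 2, axis=1) / step
    log_w = np.array([log_target(c) for c in cands]) - log_q
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    return cands[rng.choice(len(cands), p=w)]
```

The gradient-informed mean is what lets such proposals remain local (and hence accepted) as $D$ grows, in contrast to the 'blind' prior proposals of plain CSMC.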

Numerical Validation

Experiments on a multivariate stochastic volatility model benchmark the proposed methods, demonstrating notable improvements over traditional CSMC and classical MCMC approaches in terms of effective sample size (ESS). In particular, the twisted Particle-mGRAD variant shows promising performance, effectively balancing computational cost and sampling efficiency.
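ESS quantifies how many independent draws an autocorrelated chain is worth. A rough per-coordinate estimator, using a simplified initial-positive-sequence truncation (an illustrative choice, not the estimator used in the paper), can be sketched as:

```python
import numpy as np

def ess(chain):
    """Crude effective-sample-size estimate for a 1-D MCMC trace.

    Sums sample autocorrelations until the first negative lag
    (simplified initial-positive-sequence truncation)."""
    x = np.asarray(chain, dtype=float)
    n = len(x)
    x = x - x.mean()
    # acf[k]: sample autocorrelation at lag k, normalised so acf[0] == 1.
    acf = np.correlate(x, x, mode="full")[n - 1:] / (np.arange(n, 0, -1) * x.var())
    tau = 1.0  # integrated autocorrelation time
    for k in range(1, n):
        if acf[k] < 0:
            break
        tau += 2.0 * acf[k]
    return n / tau
```

A near-independent chain yields an ESS close to its length, while a strongly autocorrelated chain (the failure mode the paper's methods mitigate) yields a much smaller value.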

Implications for Future Research

The intersection of CSMC with advanced MCMC proposals opens avenues for further exploration into high-dimensional inference problems. By weaving together proposal efficiency from MCMC with the scalability of CSMC, the paper marks a significant step forward in complex state-space modeling tasks. This bridges a gap in existing methodologies, offering increased robustness and adaptability in diverse statistical modeling scenarios.

Theoretical implications suggest that future studies could further investigate optimal scaling rules for these hybrid methods, especially concerning proposal distribution parameters. Additionally, further adaptation and extension could involve hybrid preconditioned techniques and methods tailored for non-standard and constrained spaces.

In essence, the introduced Particle-MALA and Particle-mGRAD methodologies underscore an evolution in MCMC techniques, emphasizing the importance of flexibility and robustness in modern computational inference. This research sets the groundwork for further breakthroughs in the intersection of Bayesian methodology and computational scalability.
