Perturbation Analysis of Markov Chain Monte Carlo for Graphical Models (2312.14246v1)

Published 21 Dec 2023 in math.PR

Abstract: The basic question in perturbation analysis of Markov chains is: how do small changes in the transition kernels of Markov chains translate to changes in their stationary distributions? Many papers on the subject have shown, roughly, that the change in stationary distribution is small as long as the change in the kernel is much less than some measure of the convergence rate. This result is essentially sharp for generic Markov chains. In this paper we show that much larger errors, up to size roughly the square root of the convergence rate, are permissible for many target distributions associated with graphical models. The main motivation for this work comes from computational statistics, where there is often a tradeoff between the per-step error and per-step cost of approximate MCMC algorithms. Our results show that larger perturbations (and thus less-expensive chains) still give results with small error.
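To make the classical bound referenced in the abstract concrete, the following is a minimal numerical sketch: for a generic chain, a per-step kernel error of size roughly ε yields a stationary-distribution error on the order of ε divided by the spectral gap. The two-state kernel, the specific transition probabilities, and the helper `stationary_distribution` are illustrative assumptions for this sketch and are not taken from the paper.

```python
import numpy as np

def stationary_distribution(P):
    """Stationary distribution of a row-stochastic matrix P,
    taken as the (normalized) left eigenvector for eigenvalue 1."""
    vals, vecs = np.linalg.eig(P.T)
    pi = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    return pi / pi.sum()

# Two-state kernel; its spectral gap is gamma = a + b.
a, b = 0.2, 0.3
P = np.array([[1 - a, a],
              [b, 1 - b]])

# Perturbed kernel with per-step error eps in total variation.
eps = 0.01
P_tilde = np.array([[1 - a - eps, a + eps],
                    [b, 1 - b]])

pi = stationary_distribution(P)
pi_tilde = stationary_distribution(P_tilde)

tv_error = 0.5 * np.abs(pi - pi_tilde).sum()
gamma = a + b
print(f"kernel perturbation eps    = {eps}")
print(f"stationary TV error        = {tv_error:.4f}")
print(f"classical bound ~ eps/gamma = {eps / gamma:.4f}")
```

The sketch only illustrates the generic baseline in which the stationary error scales like the kernel error over the convergence rate; the paper's point is that for many graphical-model targets, perturbations up to roughly the square root of the convergence rate still give small stationary error.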
