Bayesian sampling using interacting particles (2401.13100v3)
Abstract: Bayesian sampling is an important task in statistics and machine learning. Over the past decade, many ensemble-type sampling methods have been proposed. In contrast to classical Markov chain Monte Carlo methods, these methods deploy a large number of interacting samples, and the communication between the samples is crucial for speeding up convergence. To justify the validity of these sampling strategies, the interacting-particle viewpoint naturally calls for mean-field theory. The theory establishes a correspondence between the particle interactions, encoded in a set of coupled ODEs/SDEs, and a PDE that characterizes the evolution of the underlying distribution. This bridges the numerical algorithms with the PDE theory used to show convergence in time. A variety of mathematical machinery has been developed for mean-field analysis, and we showcase two examples: the coupling method and the compactness argument built on the martingale strategy. The former has been deployed to show the convergence of the ensemble Kalman sampler and ensemble Kalman inversion, and the latter will be shown to be immensely powerful in proving the validity of the Vlasov-Boltzmann simulator.
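To make the interacting-particle picture concrete, the sketch below simulates one such ensemble method in the spirit of the ensemble Kalman sampler: each particle follows Langevin-type dynamics preconditioned by the empirical covariance of the whole ensemble, so the particles communicate only through a shared statistic. As the ensemble size grows, the empirical covariance is replaced by the covariance of the limiting law, and the system formally corresponds to a covariance-preconditioned Fokker-Planck equation. This is a minimal illustration rather than the paper's implementation; the toy Gaussian target, step size, ensemble size, and the omission of finite-ensemble correction terms are all simplifying choices made here.

```python
# Minimal sketch (assumptions: toy Gaussian target, Euler-Maruyama discretization,
# no finite-ensemble correction terms) of an ensemble-Kalman-sampler-style
# interacting particle system targeting pi(x) proportional to exp(-Phi(x)).
import numpy as np


def eks_step(X, grad_phi, dt, rng):
    """Advance an ensemble X of shape (J, d) by one step of
    dX^j = -C(X) grad Phi(X^j) dt + sqrt(2 C(X)) dW^j,
    where C(X) is the empirical covariance of the ensemble."""
    J, d = X.shape
    centered = X - X.mean(axis=0)
    C = centered.T @ centered / J                      # (d, d) empirical covariance
    drift = -grad_phi(X) @ C                           # row j is -C grad Phi(X^j)
    L = np.linalg.cholesky(C + 1e-10 * np.eye(d))      # C = L L^T (small jitter for safety)
    noise = np.sqrt(2.0 * dt) * rng.standard_normal((J, d)) @ L.T
    return X + dt * drift + noise


if __name__ == "__main__":
    # Toy target: zero-mean Gaussian with covariance Sigma, so Phi(x) = x^T Sigma^{-1} x / 2.
    rng = np.random.default_rng(0)
    Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
    Sigma_inv = np.linalg.inv(Sigma)
    grad_phi = lambda X: X @ Sigma_inv                 # row j is grad Phi(X^j) = Sigma^{-1} X^j
    X = rng.standard_normal((200, 2))                  # J = 200 interacting particles
    for _ in range(2000):
        X = eks_step(X, grad_phi, dt=0.01, rng=rng)
    print("ensemble covariance (should be close to Sigma):")
    print(np.cov(X.T))
```

The only coupling between particles is the shared covariance C; a mean-field argument such as the coupling method mentioned in the abstract compares each particle with an independent copy driven by the covariance of the limiting distribution.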