Convergence of kinetic Langevin samplers for non-convex potentials (2405.09992v1)
Abstract: We study three kinetic Langevin samplers: the Euler discretization and the BU and UBU splitting schemes. We provide contraction results in $L^1$-Wasserstein distance for non-convex potentials. These results are based on a carefully tailored distance function and an appropriate coupling construction. Additionally, we bound the error in $L^1$-Wasserstein distance between the true target measure and the invariant measure of the discretization scheme. To achieve $\varepsilon$-accuracy in $L^1$-Wasserstein distance, we show complexity guarantees of order $\mathcal{O}(\sqrt{d}/\varepsilon)$ for the Euler scheme and $\mathcal{O}(d^{1/4}/\sqrt{\varepsilon})$ for the UBU scheme under appropriate regularity assumptions on the target measure. The results are applicable to interacting particle systems and provide bounds for sampling probability measures of mean-field type.
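To give a feel for the gap between the two complexity orders stated in the abstract, the following is an illustrative comparison only. The values $d = 10^{4}$ and $\varepsilon = 10^{-2}$ are assumptions chosen to make the arithmetic easy, and the constants hidden in $\mathcal{O}(\cdot)$ are ignored, so the numbers indicate scaling rather than actual step counts.

```latex
\begin{align*}
\text{Euler:} \quad \frac{\sqrt{d}}{\varepsilon}
  &= \frac{10^{2}}{10^{-2}} = 10^{4}, \\
\text{UBU:} \quad \frac{d^{1/4}}{\sqrt{\varepsilon}}
  &= \frac{10}{10^{-1}} = 10^{2}
   = \sqrt{\frac{\sqrt{d}}{\varepsilon}}.
\end{align*}
```

Up to constants, the UBU order is the square root of the Euler order, so the gap widens as $d$ grows or $\varepsilon$ shrinks.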