Ergodicity of the underdamped mean-field Langevin dynamics (2007.14660v3)

Published 29 Jul 2020 in math.PR, cs.LG, math.OC, and stat.ML

Abstract: We study the long-time behavior of an underdamped mean-field Langevin (MFL) equation and provide a general convergence result as well as an exponential convergence rate under different conditions. The results on the MFL equation can be applied to study the convergence of the Hamiltonian gradient descent algorithm for overparametrized optimization. We then provide a numerical example of the algorithm used to train a generative adversarial network (GAN).
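As a rough illustration of the kind of dynamics the abstract refers to, the sketch below simulates a generic underdamped Langevin particle system with a mean-field interaction, using a simple Euler-type time discretization. It is not the paper's algorithm or its GAN experiment: the function names, the toy gradient (a quadratic confinement plus attraction to the empirical mean), and all parameter values are assumptions made only for this example.

```python
import numpy as np


def toy_gradient(x, pull=0.5):
    """Hypothetical mean-field gradient: quadratic confinement plus
    attraction of each particle toward the empirical mean of the cloud."""
    return x + pull * (x - x.mean(axis=0, keepdims=True))


def simulate_underdamped_mfl(grad_potential, n_particles=256, dim=1,
                             gamma=1.0, sigma=1.0, step=0.01,
                             n_steps=2000, seed=0):
    """Euler-type scheme for the particle system
        dX = V dt,
        dV = -(grad_potential(X) + gamma * V) dt + sigma dW,
    where grad_potential may depend on the whole particle cloud
    (this dependence stands in for the mean-field term)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n_particles, dim))   # initial positions
    v = np.zeros((n_particles, dim))          # initial velocities
    for _ in range(n_steps):
        noise = rng.normal(size=(n_particles, dim))
        # Velocity update: force, friction, and Brownian increment.
        v = v + step * (-grad_potential(x) - gamma * v) \
              + sigma * np.sqrt(step) * noise
        # Position update uses the freshly updated velocity.
        x = x + step * v
    return x, v


if __name__ == "__main__":
    x, v = simulate_underdamped_mfl(toy_gradient, n_particles=512, dim=2)
    print("empirical mean of positions:", x.mean(axis=0))
    print("empirical variance of positions:", x.var(axis=0))
```

With `sigma=0.0` the friction term alone drives the particles of this toy system toward the origin, which gives a quick sanity check of the discretization; with noise, the cloud settles into a stationary spread around the origin.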
