
Error bounds for particle gradient descent, and extensions of the log-Sobolev and Talagrand inequalities (2403.02004v2)

Published 4 Mar 2024 in cs.LG, math.FA, math.OC, stat.CO, and stat.ML

Abstract: We prove non-asymptotic error bounds for particle gradient descent (PGD) (Kuntz et al., 2023), a recently introduced algorithm, obtained by discretizing a gradient flow of the free energy, for maximum likelihood estimation of large latent variable models. We begin by showing that, for models satisfying a condition generalizing both the log-Sobolev and the Polyak–Łojasiewicz inequalities (LSI and PŁI, respectively), the flow converges exponentially fast to the set of minimizers of the free energy. We achieve this by extending a result well known in the optimal transport literature (that the LSI implies the Talagrand inequality) and its counterpart in the optimization literature (that the PŁI implies the so-called quadratic growth condition), and applying it to our new setting. We also generalize the Bakry–Émery Theorem and show that the LSI/PŁI generalization holds for models with strongly concave log-likelihoods. For such models, we further control PGD's discretization error, obtaining non-asymptotic error bounds. While we are motivated by the study of PGD, we believe that the inequalities and results we extend may be of independent interest.
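To make the object of study concrete: PGD alternates a gradient-ascent step on the parameter (averaged over a particle cloud) with an unadjusted Langevin step on each particle, discretizing the free-energy gradient flow. The following is a minimal sketch under stated assumptions, not the paper's implementation; the one-dimensional Gaussian model (x ~ N(θ, 1), y | x ~ N(x, 1), so the MLE given one observation is θ = y) and all variable names are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy model: x ~ N(theta, 1), y | x ~ N(x, 1).
# Marginally y ~ N(theta, 2), so the MLE of theta given y is theta = y.
y = 1.5

def grad_theta(theta, x):
    # d/dtheta log p_theta(x, y) = x - theta
    return x - theta

def grad_x(theta, x):
    # d/dx log p_theta(x, y) = (theta - x) + (y - x)
    return theta + y - 2.0 * x

N = 1000                      # number of particles
h = 0.05                      # step size (discretization of the flow)
theta = 0.0
x = rng.standard_normal(N)    # particle cloud approximating the posterior

for _ in range(2000):
    # Parameter step: ascend the log-likelihood averaged over particles.
    theta += h * grad_theta(theta, x).mean()
    # Particle step: one unadjusted Langevin move per particle.
    x += h * grad_x(theta, x) + np.sqrt(2.0 * h) * rng.standard_normal(N)

# theta ends up close to the MLE, theta = y, up to discretization
# and finite-particle error of the kind the paper's bounds quantify.
```

The gap between the final `theta` and the exact maximizer reflects the step-size and particle-number errors that the paper's non-asymptotic bounds control for strongly log-concave models.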

References (38)
  1. “Interacting particle Langevin algorithm for maximum marginal likelihood estimation”, 2023
  2. Luigi Ambrosio, Nicola Gigli and Giuseppe Savaré “Gradient Flows: In Metric Spaces and in the Space of Probability Measures” Springer Science & Business Media, 2005
  3. Mihai Anitescu “Degenerate nonlinear programming with a quadratic growth condition” In SIAM Journal on Optimization 10, 2000, pp. 1116–1135
  4. Dominique Bakry and Michel Émery “Diffusions hypercontractives” In Séminaire de Probabilités XIX 1983/84, Lecture Notes in Mathematics, Springer, 1985
  5. Dominique Bakry, Ivan Gentil and Michel Ledoux “Analysis and Geometry of Markov Diffusion Operators” Springer, 2014
  6. Dmitri Burago, Yuri Burago and Sergei Ivanov “A Course in Metric Geometry” American Mathematical Society, 2001
  7. René Carmona “Lectures on BSDEs, Stochastic Control, and Stochastic Differential Games with Financial Applications” SIAM, 2016
  8. “Propagation of chaos: A review of models, methods and applications. I. Models and methods” In Kinetic and Related Models 15.6, 2022
  9. Rujian Chen “Approximate Bayesian Modeling with Embedded Gaussian Processes”, 2023
  10. Xiang Cheng and Peter Bartlett “Convergence of Langevin MCMC in KL-divergence” In Proceedings of Algorithmic Learning Theory 83, 2018, pp. 186–211
  11. Sinho Chewi “Log-concave Sampling” Book draft, 2014 URL: https://chewisinho.github.io
  12. Arnak S. Dalalyan “Theoretical Guarantees for Approximate Sampling from Smooth and Log-Concave Densities” In Journal of the Royal Statistical Society Series B: Statistical Methodology 79.3, 2016, pp. 651–676
  13. Arthur P. Dempster, Nan M. Laird and Donald B. Rubin “Maximum likelihood from incomplete data via the EM Algorithm” In Journal of the Royal Statistical Society, Series B 39, 1977, pp. 2–38
  14. Steffen Dereich, Michael Scheutzow and Reik Schottstedt “Constructive quantization: Approximation by empirical measures” In Annales de l’Institut Henri Poincaré: Probabilités et statistiques 49, 2013, pp. 1183–1203
  15. Randal Douc, Éric Moulines and David Stoffer “Nonlinear Time Series: Theory, Methods and Applications with R Examples” CRC press, 2014
  16. Alain Durmus and Éric Moulines “High-dimensional Bayesian inference via the unadjusted Langevin algorithm” In Bernoulli 25.4A, 2019, pp. 2854–2882
  17. Alain Durmus and Éric Moulines “Nonasymptotic convergence analysis for the unadjusted Langevin algorithm” In Annals of Applied Probability 27, 2017, pp. 1551–1587
  18. “Gradient flows for empirical Bayes in high-dimensional linear models”, 2023
  19. Nicolas Fournier “Convergence of the empirical measure in expected Wasserstein distance: non-asymptotic explicit bounds in ℝ^d” In ESAIM: Probability and Statistics 27, 2023, pp. 749–775
  20. Richard Jordan, David Kinderlehrer and Felix Otto “The variational formulation of the Fokker–Planck equation” In SIAM Journal on Mathematical Analysis 29, 1998, pp. 1–17
  21. Hamed Karimi, Julie Nutini and Mark Schmidt “Linear convergence of gradient and proximal-gradient methods under the Polyak-Łojasiewicz condition” In Machine Learning and Knowledge Discovery in Databases, 2016, pp. 795–811
  22. Diederik P. Kingma and Max Welling “An Introduction to Variational Autoencoders” In Foundations and Trends® in Machine Learning 12, 2019, pp. 307–392
  23. Juan Kuntz, Jen Ning Lim and Adam M. Johansen “Particle algorithms for maximum likelihood training of latent variable models” In Proceedings of The 26th International Conference on Artificial Intelligence and Statistics 206, 2023, pp. 5134–5180
  24. “Momentum particle maximum likelihood”, 2023
  25. Stanislaw Łojasiewicz “Une propriété topologique des sous-ensembles analytiques réels” In Les équations aux dérivées partielles 117, 1963, pp. 87–89
  26. “Is there an analog of Nesterov acceleration for gradient-based MCMC?” In Bernoulli 27.3, Bernoulli Society for Mathematical Statistics and Probability, 2021, pp. 1942–1992
  27. Radford M. Neal and Geoffrey E. Hinton “A View of the EM Algorithm that Justifies Incremental, Sparse, and other Variants” In Learning in Graphical Models Springer Netherlands, 1998, pp. 355–368
  28. Yurii Nesterov “A method of solving a convex programming problem with convergence rate O(1/k²)” In Doklady Akademii Nauk 269.3, Russian Academy of Sciences, 1983, pp. 543–547
  29. Yurii Nesterov “Introductory Lectures on Convex Optimization: A Basic Course” Springer Science & Business Media, 2003
  30. Bernt Øksendal “Stochastic Differential Equations: An Introduction with Applications” Springer Science & Business Media, 2013
  31. Felix Otto and Cédric Villani “Generalization of an Inequality by Talagrand and Links with the Logarithmic Sobolev Inequality” In Journal of Functional Analysis 173, 2000, pp. 361–400
  32. Boris T. Polyak “Gradient methods for the minimisation of functionals (in Russian)” In Zhurnal Vychislitel’noi Matematiki i Matematicheskoi Fiziki 3, 1963, pp. 643–653
  33. Herbert Robbins “An empirical Bayes approach to statistics” In Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability 3.1, 1956, pp. 157–164
  34. Filippo Santambrogio “Optimal Transport for Applied Mathematicians: Calculus of Variations, PDEs, and Modeling” Birkhäuser/Springer, 2015
  35. Louis Sharrock, Daniel Dodd and Christopher Nemeth “CoinEM: Tuning-free particle-based variational inference for latent variable models”, 2023
  36. Michel Talagrand “Transportation cost for Gaussian and other product measures” In Geometric & Functional Analysis 6 Springer, 1996, pp. 587–600
  37. Nicolás García Trillos, Bamdad Hosseini and Daniel Sanz-Alonso “From Optimization to Sampling Through Gradient Flows” In Notices of the American Mathematical Society 70.6, 2023
  38. Cédric Villani “Optimal Transport: Old and New” Springer Science & Business Media, 2009
Citations (6)
