Semi-Implicit Functional Gradient Flow for Efficient Sampling (2410.17935v2)

Published 23 Oct 2024 in stat.ML and cs.LG

Abstract: Particle-based variational inference methods (ParVIs) use nonparametric variational families represented by particles to approximate the target distribution according to the kernelized Wasserstein gradient flow for the Kullback-Leibler (KL) divergence. Although functional gradient flows have been introduced to expand the kernel space for better flexibility, the deterministic updating mechanism may limit exploration and require expensive repetitive runs for new samples. In this paper, we propose Semi-Implicit Functional Gradient flow (SIFG), a functional gradient ParVI method that uses perturbed particles with Gaussian noise as the approximation family. We show that the corresponding functional gradient flow, which can be estimated via denoising score matching with neural networks, exhibits strong theoretical convergence guarantees due to a higher-order smoothness brought to the approximation family via Gaussian perturbation. In addition, we present an adaptive version of our method that automatically selects the appropriate noise magnitude during sampling, striking a good balance between exploration efficiency and approximation accuracy. Extensive experiments on both simulated and real-world datasets demonstrate the effectiveness and efficiency of the proposed framework.
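
The abstract sketches the algorithmic core: particles are perturbed with Gaussian noise, the score of the resulting smoothed particle distribution is estimated by a neural network trained with denoising score matching, and the perturbed particles are then moved along the estimated functional gradient of the KL divergence. The sketch below is a minimal, hypothetical PyTorch illustration of those two steps, not the authors' implementation; the names `score_net`, `target_log_prob`, `sigma`, and `step_size` are assumptions made for the example.

```python
import torch

def dsm_loss(score_net, particles, sigma):
    """Denoising score matching loss (Vincent, 2011): train score_net so that,
    at a Gaussian-perturbed particle x + sigma*eps, it predicts -eps/sigma,
    i.e. the score of the smoothed particle distribution."""
    eps = torch.randn_like(particles)
    noisy = particles + sigma * eps
    return ((sigma * score_net(noisy) + eps) ** 2).sum(dim=1).mean()

def perturbed_particle_step(particles, score_net, target_log_prob, sigma, step_size):
    """One illustrative update in the spirit of SIFG: perturb the particles,
    then move the noisy particles along the estimated Wasserstein gradient of
    KL(q_sigma || pi), namely v(x) = grad log pi(x) - score_net(x)."""
    noisy = (particles + sigma * torch.randn_like(particles)).requires_grad_(True)
    grad_log_p = torch.autograd.grad(target_log_prob(noisy).sum(), noisy)[0]
    with torch.no_grad():
        velocity = grad_log_p - score_net(noisy)
        return noisy.detach() + step_size * velocity
```

In practice one would alternate a few optimizer steps on `dsm_loss` with calls to `perturbed_particle_step`; the adaptive variant described in the abstract would additionally adjust `sigma` during sampling, which is omitted here.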

