GAD-PVI: A General Accelerated Dynamic-Weight Particle-Based Variational Inference Framework (2312.16429v1)

Published 27 Dec 2023 in cs.LG and cs.AI

Abstract: Particle-based Variational Inference (ParVI) methods approximate the target distribution by iteratively evolving finite weighted particle systems. Recent advances in ParVI methods reveal the benefits of accelerated position update strategies and dynamic weight adjustment approaches. In this paper, we propose the first ParVI framework that simultaneously possesses both an accelerated position update and dynamic weight adjustment, named the General Accelerated Dynamic-Weight Particle-based Variational Inference (GAD-PVI) framework. GAD-PVI simulates the semi-Hamiltonian gradient flow on a novel Information-Fisher-Rao space, which yields an additional decrease in the local functional dissipation. GAD-PVI is compatible with different dissimilarity functionals and associated smoothing approaches under three information metrics. Experiments on both synthetic and real-world data demonstrate the faster convergence and reduced approximation error of GAD-PVI methods over the state of the art.
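
To make the two ingredients mentioned in the abstract concrete, below is a minimal illustrative sketch in Python of a weighted particle system that combines (i) a momentum-accelerated, SVGD-style position update and (ii) a simple dynamic reweighting step. This is not the GAD-PVI algorithm from the paper; the kernel choice, the reweighting rule, and all function names and parameters are assumptions made purely for illustration.

```python
# Illustrative sketch only: a momentum-accelerated SVGD-style position update
# combined with a simple dynamic particle reweighting step. NOT the GAD-PVI
# algorithm; all names, the kernel, and the reweighting rule are assumptions.
import numpy as np

def grad_log_target(x):
    # Example target: standard Gaussian, so grad log p(x) = -x.
    return -x

def rbf_kernel(x, h=1.0):
    # Pairwise RBF kernel values k[i, j] = k(x_i, x_j) and the gradient of
    # k(x_i, x_j) with respect to x_j (the repulsive term in SVGD).
    diff = x[:, None, :] - x[None, :, :]          # (n, n, d), diff[i, j] = x_i - x_j
    sq = np.sum(diff ** 2, axis=-1)               # (n, n)
    k = np.exp(-sq / (2 * h ** 2))                # (n, n)
    grad_k = diff / (h ** 2) * k[..., None]       # (n, n, d)
    return k, grad_k

def accelerated_dynamic_weight_step(x, v, w, step=0.1, momentum=0.9, reweight=0.05):
    k, grad_k = rbf_kernel(x)
    # Weighted SVGD-style driving force on each particle:
    # force_i = sum_j w_j [ k(x_i, x_j) grad log p(x_j) + grad_{x_j} k(x_i, x_j) ]
    force = (k[..., None] * grad_log_target(x)[None, :, :] + grad_k) * w[None, :, None]
    force = force.sum(axis=1)
    # Accelerated (momentum) position update.
    v = momentum * v + step * force
    x = x + v
    # Simple dynamic weight adjustment: shift mass toward particles whose
    # kernel-density estimate underestimates the target (birth-death flavour).
    log_p = -0.5 * np.sum(x ** 2, axis=-1)        # log target, up to a constant
    log_q = np.log(k @ w + 1e-12)                 # kernel-density estimate
    w = w * np.exp(reweight * (log_p - log_q))
    w = w / w.sum()                               # keep weights normalised
    return x, v, w

# Usage: evolve 100 weighted particles toward a standard 2-D Gaussian.
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 2)) * 3 + 5             # start far from the target
v = np.zeros_like(x)
w = np.full(100, 1.0 / 100)
for _ in range(500):
    x, v, w = accelerated_dynamic_weight_step(x, v, w)
print(np.average(x, axis=0, weights=w))            # should move toward [0, 0]
```

In this sketch the momentum term plays the role of an accelerated position update, while the multiplicative reweighting plays the role of dynamic weight adjustment; the paper's actual framework derives both from a semi-Hamiltonian gradient flow on an Information-Fisher-Rao space.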
