Sequential Monte Carlo for Inclusive KL Minimization in Amortized Variational Inference (2403.10610v1)

Published 15 Mar 2024 in cs.LG and stat.ML

Abstract: For training an encoder network to perform amortized variational inference, the Kullback-Leibler (KL) divergence from the exact posterior to its approximation, known as the inclusive or forward KL, is an increasingly popular choice of variational objective due to the mass-covering property of its minimizer. However, minimizing this objective is challenging. A popular existing approach, Reweighted Wake-Sleep (RWS), suffers from heavily biased gradients and a circular pathology that results in highly concentrated variational distributions. As an alternative, we propose SMC-Wake, a procedure for fitting an amortized variational approximation that uses likelihood-tempered sequential Monte Carlo samplers to estimate the gradient of the inclusive KL divergence. We propose three gradient estimators, all of which are asymptotically unbiased in the number of iterations and two of which are strongly consistent. Our method interleaves stochastic gradient updates, SMC samplers, and iterative improvement to an estimate of the normalizing constant to reduce bias from self-normalization. In experiments with both simulated and real datasets, SMC-Wake fits variational distributions that approximate the posterior more accurately than existing methods.
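The core idea in the abstract, using a likelihood-tempered SMC sampler to produce approximate posterior draws and then averaging the score of the variational distribution over them, can be illustrated on a toy problem. The sketch below is not the paper's SMC-Wake procedure: it is non-amortized (a single observation and a two-parameter Gaussian q instead of an encoder network), it resamples at every temperature, and it omits the normalizing-constant refinement and the three estimators the authors propose. The conjugate model (prior z ~ N(0, 1), likelihood x | z ~ N(z, 1)), the function names, and all tuning constants are assumptions chosen only so that the exact posterior N(x/2, 1/2) is available for comparison.

```python
import numpy as np

def log_prior(z):
    # log N(z | 0, 1), up to an additive constant (assumed toy prior)
    return -0.5 * z ** 2

def log_lik(z, x):
    # log N(x | z, 1), up to an additive constant (assumed toy likelihood)
    return -0.5 * (x - z) ** 2

def tempered_smc(x, rng, n_particles=1000, n_temps=10, n_mcmc=3, step=0.8):
    """Likelihood-tempered SMC: targets prior(z) * lik(x | z)^tau, tau from 0 to 1."""
    taus = np.linspace(0.0, 1.0, n_temps + 1)
    z = rng.standard_normal(n_particles)  # tau = 0: draw from the prior
    for tau_prev, tau in zip(taus[:-1], taus[1:]):
        # Reweight by the incremental tempered likelihood, then self-normalize.
        logw = (tau - tau_prev) * log_lik(z, x)
        w = np.exp(logw - logw.max())
        w /= w.sum()
        # Multinomial resampling (performed at every step in this sketch).
        z = z[rng.choice(n_particles, size=n_particles, p=w)]
        # Move: random-walk Metropolis steps that leave the current target invariant.
        for _ in range(n_mcmc):
            prop = z + step * rng.standard_normal(n_particles)
            log_acc = (log_prior(prop) + tau * log_lik(prop, x)
                       - log_prior(z) - tau * log_lik(z, x))
            accept = np.log(rng.uniform(size=n_particles)) < log_acc
            z = np.where(accept, prop, z)
    return z  # equally weighted approximate posterior draws

def inclusive_kl_grad(z, mu, log_sigma):
    """Gradient of -mean(log q(z)) w.r.t. (mu, log_sigma), with q = N(mu, sigma^2)."""
    sigma2 = np.exp(2.0 * log_sigma)
    d_mu = (z - mu) / sigma2
    d_log_sigma = (z - mu) ** 2 / sigma2 - 1.0
    return -d_mu.mean(), -d_log_sigma.mean()

def fit(x, n_steps=150, lr=0.1, seed=0):
    """Stochastic gradient descent on the inclusive KL, with a fresh SMC run per step."""
    rng = np.random.default_rng(seed)
    mu, log_sigma = 0.0, 0.0
    for _ in range(n_steps):
        z = tempered_smc(x, rng)
        g_mu, g_ls = inclusive_kl_grad(z, mu, log_sigma)
        mu -= lr * g_mu
        log_sigma -= lr * g_ls
    return mu, np.exp(log_sigma)

if __name__ == "__main__":
    mu, sigma = fit(2.0)
    print(f"fitted q: mean={mu:.3f}, sd={sigma:.3f}")
    print(f"exact posterior: mean={1.0:.3f}, sd={0.5 ** 0.5:.3f}")
```

Because the particles here are initialized from the prior rather than from q, the gradient estimate does not depend on the current variational fit, which is one way to avoid the circular dependence the abstract attributes to RWS. Whether this matches the paper's exact construction is an assumption.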
