Sequential Monte Carlo for Inclusive KL Minimization in Amortized Variational Inference (2403.10610v1)
Abstract: For training an encoder network to perform amortized variational inference, the Kullback-Leibler (KL) divergence from the exact posterior to its approximation, known as the inclusive or forward KL, is an increasingly popular choice of variational objective due to the mass-covering property of its minimizer. However, minimizing this objective is challenging. A popular existing approach, Reweighted Wake-Sleep (RWS), suffers from heavily biased gradients and a circular pathology that results in highly concentrated variational distributions. As an alternative, we propose SMC-Wake, a procedure for fitting an amortized variational approximation that uses likelihood-tempered sequential Monte Carlo samplers to estimate the gradient of the inclusive KL divergence. We propose three gradient estimators, all of which are asymptotically unbiased in the number of iterations and two of which are strongly consistent. Our method interleaves stochastic gradient updates, SMC samplers, and iterative improvement to an estimate of the normalizing constant to reduce bias from self-normalization. In experiments with both simulated and real datasets, SMC-Wake fits variational distributions that approximate the posterior more accurately than existing methods.
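As a rough illustration of the ingredients named in the abstract, the sketch below runs a likelihood-tempered SMC sampler on a toy conjugate model and uses the resulting particles to estimate the inclusive KL gradient for a Gaussian approximation. It is not the SMC-Wake procedure itself: the model (prior z ~ N(0, 1), likelihood x | z ~ N(z, 1)), the function names, the linear temperature schedule, the random-walk Metropolis move, and the single fixed observation with no encoder network are all assumptions made for illustration.

```python
import numpy as np

def log_prior(z):
    # Unnormalized log density of the assumed prior N(0, 1).
    return -0.5 * z**2

def log_lik(z, x):
    # Unnormalized log likelihood of the assumed model x | z ~ N(z, 1).
    return -0.5 * (x - z) ** 2

def tempered_smc(x, n_particles=2000, n_temps=10, n_mh=5, step=0.5, seed=0):
    """Likelihood-tempered SMC targeting pi_b(z) ∝ p(z) p(x|z)^b, b from 0 to 1."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(0.0, 1.0, n_temps + 1)
    z = rng.standard_normal(n_particles)  # b = 0: exact draws from the prior
    log_Z = 0.0  # running estimate of log E_prior[exp(log_lik)]
    for b_prev, b in zip(betas[:-1], betas[1:]):
        # Reweight: incremental weight is the likelihood raised to (b - b_prev).
        logw = (b - b_prev) * log_lik(z, x)
        m = logw.max()
        w = np.exp(logw - m)
        log_Z += m + np.log(w.mean())
        w /= w.sum()
        # Resample (multinomial), which resets the weights to uniform.
        z = z[rng.choice(n_particles, size=n_particles, p=w)]
        # Move: random-walk Metropolis steps that leave pi_b invariant.
        for _ in range(n_mh):
            prop = z + step * rng.standard_normal(n_particles)
            log_acc = (log_prior(prop) + b * log_lik(prop, x)) - (
                log_prior(z) + b * log_lik(z, x)
            )
            accept = np.log(rng.uniform(size=n_particles)) < log_acc
            z = np.where(accept, prop, z)
    return z, log_Z

def inclusive_kl_grad(z, mu, log_s):
    """Estimate the gradient of KL(p || q) for q = N(mu, exp(log_s)^2).

    Uses grad KL(p || q) = -E_p[grad log q(z)], with the expectation
    replaced by an average over equally weighted particles z.
    """
    s2 = np.exp(2.0 * log_s)
    d_mu = (z - mu) / s2
    d_log_s = (z - mu) ** 2 / s2 - 1.0
    return -np.array([d_mu.mean(), d_log_s.mean()])

def fit_gaussian(z, steps=500, lr=0.1):
    # Plain gradient descent on the particle-based inclusive KL estimate.
    mu, log_s = 0.0, 0.0
    for _ in range(steps):
        g = inclusive_kl_grad(z, mu, log_s)
        mu, log_s = mu - lr * g[0], log_s - lr * g[1]
    return mu, log_s
```

For a Gaussian q, the minimizer of this objective matches the particle mean and standard deviation, so calling `fit_gaussian(tempered_smc(x=2.0)[0])` should recover values close to the exact posterior N(1, 0.5) of the toy model. The procedure described in the abstract differs in that it amortizes over observations and interleaves the gradient updates with fresh SMC runs rather than reusing one fixed particle set.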