Towards Model-Agnostic Posterior Approximation for Fast and Accurate Variational Autoencoders (2403.08941v2)

Published 13 Mar 2024 in stat.ML and cs.LG

Abstract: Inference for Variational Autoencoders (VAEs) consists of learning two models: (1) a generative model, which transforms a simple distribution over a latent space into the distribution over observed data, and (2) an inference model, which approximates the posterior of the latent codes given data. The two components are learned jointly via a lower bound to the generative model's log marginal likelihood. In early phases of joint training, the inference model poorly approximates the latent code posteriors. Recent work showed that this causes optimization to get stuck in local optima, negatively impacting the learned generative model. That work therefore suggests ensuring a high-quality inference model via iterative training: maximizing the objective with respect to the inference model before every update to the generative model. Unfortunately, iterative training is inefficient, requiring heuristic criteria for deciding when to revert from iterative to joint training for the sake of speed. Here, we suggest an inference method that trains the generative and inference models independently. It approximates the posterior of the true model a priori; fixing this posterior approximation, we then maximize the lower bound with respect to the generative model alone. By conventional wisdom, this approach should rely on the true prior and likelihood of the true model to approximate its posterior (which are unknown). However, we show that we can compute a deterministic, model-agnostic posterior approximation (MAPA) of the true model's posterior. We then use MAPA to develop a proof-of-concept inference method. We present preliminary results on low-dimensional synthetic data showing that (1) MAPA captures the trend of the true posterior, and (2) our MAPA-based inference achieves better density estimation with less computation than baselines. Lastly, we present a roadmap for scaling the MAPA-based inference method to high-dimensional data.
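
For reference, the lower bound the abstract refers to is the standard evidence lower bound (ELBO). The notation below is the textbook VAE formulation (generative model p_theta, inference model q_phi), not necessarily the paper's own:

\log p_\theta(x) \;\ge\; \mathcal{L}(\theta, \phi; x)
  \;=\; \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log p_\theta(x \mid z) \right]
  \;-\; \mathrm{KL}\!\left( q_\phi(z \mid x) \,\|\, p(z) \right)

Joint training ascends L in (theta, phi) simultaneously; iterative training fully optimizes phi before each theta update; the method proposed here instead fixes q to a precomputed approximation (MAPA) and maximizes L with respect to theta alone.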

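To make that decoupled training scheme concrete, here is a minimal, hypothetical PyTorch sketch. The fixed per-datapoint Gaussian q below is only a placeholder standing in for MAPA (the abstract does not specify the construction), and the toy data, decoder architecture, and hyperparameters are all assumptions; the point is simply that the posterior approximation is frozen while the ELBO is maximized with respect to the generative model alone.

import torch
import torch.nn as nn

torch.manual_seed(0)
N, x_dim, z_dim = 256, 2, 1
x = torch.randn(N, x_dim)                      # toy observed data (placeholder)

# Fixed posterior approximation: one Gaussian per datapoint, chosen a priori.
# In the paper this role is played by MAPA; here it is an arbitrary stand-in.
q_mu = torch.randn(N, z_dim)
q_logvar = torch.zeros(N, z_dim)

decoder = nn.Sequential(nn.Linear(z_dim, 32), nn.Tanh(), nn.Linear(32, x_dim))
opt = torch.optim.Adam(decoder.parameters(), lr=1e-2)

for step in range(200):
    eps = torch.randn_like(q_mu)
    z = q_mu + (0.5 * q_logvar).exp() * eps    # reparameterized sample from fixed q
    recon = decoder(z)
    # Gaussian likelihood with unit variance: log p(x|z) up to an additive constant.
    log_lik = -0.5 * ((x - recon) ** 2).sum(dim=1)
    # Closed-form KL(q || N(0, I)); constant w.r.t. the decoder's parameters,
    # but kept so the loss is the full negative ELBO.
    kl = 0.5 * (q_mu ** 2 + q_logvar.exp() - q_logvar - 1).sum(dim=1)
    loss = (kl - log_lik).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

Because q is never updated, no encoder gradients are computed and there is no lagging-inference-network pathology to manage; the quality of the learned generative model then rests entirely on how well the fixed approximation tracks the true posterior, which is exactly what MAPA is designed to provide.
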
Authors (3)
  1. Yaniv Yacoby (9 papers)
  2. Weiwei Pan (39 papers)
  3. Finale Doshi-Velez (134 papers)

