
$t^3$-Variational Autoencoder: Learning Heavy-tailed Data with Student's t and Power Divergence (2312.01133v2)

Published 2 Dec 2023 in stat.ML and cs.LG

Abstract: The variational autoencoder (VAE) typically employs a standard normal prior as a regularizer for the probabilistic latent encoder. However, the Gaussian tail often decays too quickly to effectively accommodate the encoded points, failing to preserve crucial structures hidden in the data. In this paper, we explore the use of heavy-tailed models to combat over-regularization. Drawing upon insights from information geometry, we propose $t^3$VAE, a modified VAE framework that incorporates Student's t-distributions for the prior, encoder, and decoder. This results in a joint model distribution of a power form which we argue can better fit real-world datasets. We derive a new objective by reformulating the evidence lower bound as joint optimization of the KL divergence between two statistical manifolds and replacing it with the $\gamma$-power divergence, a natural alternative for power families. $t^3$VAE demonstrates superior generation of low-density regions when trained on heavy-tailed synthetic data. Furthermore, we show that $t^3$VAE significantly outperforms other models on CelebA and imbalanced CIFAR-100 datasets.
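The reformulation mentioned in the abstract builds on the standard identity that, up to the constant data entropy, maximizing the ELBO is equivalent to minimizing the KL divergence between the joint distributions $q_\phi(z|x)\,p_{\mathrm{data}}(x)$ and $p_\theta(x|z)\,p(z)$; the paper then replaces that KL term with the $\gamma$-power divergence. As a rough illustration of the heavy-tailed encoder idea only (not the authors' implementation, and without the $\gamma$-power-divergence objective), the PyTorch sketch below parameterizes a factorized Student's t posterior with a fixed degrees-of-freedom hyperparameter; the value of `nu`, the layer sizes, and all names are assumptions made for illustration.

```python
# Illustrative sketch only: a VAE-style encoder with a Student's t posterior,
# in the spirit of the abstract. The degrees of freedom `nu`, layer sizes,
# and class/variable names are assumptions, not the paper's implementation.
import torch
import torch.nn as nn
from torch.distributions import StudentT, Independent

class HeavyTailedEncoder(nn.Module):
    def __init__(self, x_dim: int = 784, z_dim: int = 16, nu: float = 5.0):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(x_dim, 256), nn.ReLU())
        self.loc = nn.Linear(256, z_dim)        # location of q(z|x)
        self.log_scale = nn.Linear(256, z_dim)  # log-scale of q(z|x)
        self.nu = nu                            # fixed tail heaviness (assumed)

    def forward(self, x: torch.Tensor):
        h = self.backbone(x)
        loc, scale = self.loc(h), self.log_scale(h).exp()
        # Factorized Student's t posterior: heavier tails than a Gaussian,
        # so outlying encodings are regularized less aggressively.
        q_z = Independent(StudentT(df=self.nu, loc=loc, scale=scale), 1)
        z = q_z.rsample()  # reparameterized draw, so gradients flow through
        return z, q_z

# Example usage on random data: encode a batch and draw heavy-tailed latents.
encoder = HeavyTailedEncoder()
z, q_z = encoder(torch.randn(8, 784))
print(z.shape)  # torch.Size([8, 16])
```

In the full $t^3$VAE, the prior and decoder are also Student's t-distributions and the Gaussian-based KL regularizer is replaced by the $\gamma$-power divergence described in the abstract; the sketch above covers only the heavy-tailed encoding step.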

