Preventing Model Collapse in Gaussian Process Latent Variable Models (2404.01697v2)

Published 2 Apr 2024 in stat.ML and cs.LG

Abstract: Gaussian process latent variable models (GPLVMs) are a versatile family of unsupervised learning models commonly used for dimensionality reduction. However, common challenges in modeling data with GPLVMs include inadequate kernel flexibility and improper selection of the projection noise, which lead to a type of model collapse characterized by vague latent representations that do not reflect the underlying data structure. This paper addresses these issues in two steps. First, we theoretically examine the impact of the projection variance on model collapse through the lens of a linear GPLVM. Second, we tackle model collapse due to inadequate kernel flexibility by integrating the spectral mixture (SM) kernel with a differentiable random Fourier feature (RFF) kernel approximation, which ensures computational scalability and efficiency by letting off-the-shelf automatic differentiation tools learn the kernel hyperparameters, projection variance, and latent representations within the variational inference framework. The proposed GPLVM, named advisedRFLVM, is evaluated across diverse datasets and consistently outperforms strong competing models, including state-of-the-art variational autoencoders (VAEs) and other GPLVM variants, in terms of informative latent representations and missing-data imputation.
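The differentiable RFF approximation is what makes the SM kernel hyperparameters trainable by gradient descent: the random frequencies are drawn from the kernel's spectral density (a mixture of Gaussians) via the reparameterization trick, so gradients flow from the approximate Gram matrix back to the mixture means, scales, weights, and the latent inputs. The sketch below illustrates this idea in PyTorch; the function name, shapes, and the softmax weight normalization are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch of a differentiable RFF approximation of a spectral
# mixture (SM) kernel. Names and shapes are assumptions for illustration.
import torch

def sm_rff_features(X, mu, log_sigma, log_w, num_features=100):
    """Random Fourier features for an SM kernel.

    X:         (N, d) inputs (e.g. latent coordinates in a GPLVM).
    mu:        (Q, d) mixture means in the frequency domain.
    log_sigma: (Q, d) log std. devs of the spectral Gaussians.
    log_w:     (Q,)   log mixture weights.
    Returns a (N, 2 * Q * num_features) matrix Phi such that
    Phi @ Phi.T approximates the SM kernel Gram matrix.
    """
    Q, d = mu.shape
    # Reparameterized frequency samples: omega = mu + sigma * eps keeps the
    # sampling step differentiable w.r.t. the kernel hyperparameters.
    eps = torch.randn(Q, num_features, d)
    omega = mu.unsqueeze(1) + log_sigma.exp().unsqueeze(1) * eps  # (Q, S, d)

    proj = 2 * torch.pi * torch.einsum("nd,qsd->nqs", X, omega)   # (N, Q, S)
    w = torch.softmax(log_w, dim=0)           # normalized weights (assumption)
    scale = (w / num_features).sqrt().view(1, Q, 1)
    # cos/sin feature pairs give a Monte Carlo estimate of the stationary
    # kernel, since cos(a)cos(b) + sin(a)sin(b) = cos(a - b).
    Phi = torch.cat([scale * proj.cos(), scale * proj.sin()], dim=-1)
    return Phi.reshape(X.shape[0], -1)

# Usage: K ≈ Phi Phi^T is differentiable w.r.t. mu, log_sigma, log_w, and X,
# so all of them can be optimized jointly (e.g. with Adam) inside a
# variational objective.
X = torch.randn(8, 2, requires_grad=True)
mu = torch.randn(3, 2, requires_grad=True)
log_sigma = torch.zeros(3, 2, requires_grad=True)
log_w = torch.zeros(3, requires_grad=True)
Phi = sm_rff_features(X, mu, log_sigma, log_w)
K = Phi @ Phi.T
K.sum().backward()  # gradients reach kernel hyperparameters and latents
```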
