Bayesian Non-linear Latent Variable Modeling via Random Fourier Features (2306.08352v1)

Published 14 Jun 2023 in stat.ML, cs.AI, and cs.LG

Abstract: The Gaussian process latent variable model (GPLVM) is a popular probabilistic method used for nonlinear dimension reduction, matrix factorization, and state-space modeling. Inference for GPLVMs is computationally tractable only when the data likelihood is Gaussian. Moreover, inference for GPLVMs has typically been restricted to obtaining maximum a posteriori point estimates, which can lead to overfitting, or variational approximations, which mischaracterize the posterior uncertainty. Here, we present a method to perform Markov chain Monte Carlo (MCMC) inference for generalized Bayesian nonlinear latent variable modeling. The crucial insight necessary to generalize GPLVMs to arbitrary observation models is that we approximate the kernel function in the Gaussian process mappings with random Fourier features; this allows us to compute the gradient of the posterior in closed form with respect to the latent variables. We show that we can generalize GPLVMs to non-Gaussian observations, such as Poisson, negative binomial, and multinomial distributions, using our random feature latent variable model (RFLVM). Our generalized RFLVMs perform on par with state-of-the-art latent variable models on a wide range of applications, including motion-capture, image, and text data, both for estimating latent structure and for imputing missing values in these complex data sets.
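The mechanics behind the abstract's key step can be sketched briefly. By Bochner's theorem, a stationary kernel such as the RBF kernel equals an expectation over its spectral density, so it admits a finite approximation k(x, x') ≈ φ(x)ᵀφ(x') with φ built from randomly sampled frequencies; the GP mapping then reduces to a Bayesian linear model in φ(x), whose log-posterior gradient with respect to the latent variables is available in closed form for any differentiable likelihood. The Python sketch below shows the standard random-Fourier-feature construction for an RBF kernel; the function name and parameters are illustrative assumptions, not the paper's reference implementation.

```python
import numpy as np

def random_fourier_features(X, num_features, lengthscale, rng):
    """Map inputs X of shape (n, d) to features Phi of shape (n, num_features)
    so that Phi @ Phi.T approximates the RBF kernel Gram matrix
    exp(-||x - x'||^2 / (2 * lengthscale^2)).

    Illustrative helper (hypothetical name), not the paper's code.
    """
    d = X.shape[1]
    half = num_features // 2
    # Bochner's theorem: the RBF kernel's spectral density is Gaussian,
    # so sample frequencies W ~ N(0, lengthscale^{-2} I).
    W = rng.normal(scale=1.0 / lengthscale, size=(d, half))
    proj = X @ W
    # Paired cos/sin features give an unbiased, lower-variance estimate
    # than the random phase-shift variant.
    return np.hstack([np.cos(proj), np.sin(proj)]) / np.sqrt(half)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))            # stand-in latent variables
Phi = random_fourier_features(X, 500, 1.0, rng)
K_approx = Phi @ Phi.T                   # approximates the RBF Gram matrix of X
```

Because the kernel is replaced by an explicit finite feature map, swapping the Gaussian likelihood for a Poisson, negative binomial, or multinomial one leaves the gradient computation intact, which is what makes MCMC over the latent variables practical in this setting.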
