
A Bayesian Gaussian Process-Based Latent Discriminative Generative Decoder (LDGD) Model for High-Dimensional Data (2401.16497v3)

Published 29 Jan 2024 in cs.LG

Abstract: Extracting meaningful information from high-dimensional data poses a formidable modeling challenge, particularly when the data is obscured by noise or represented through different modalities. This research proposes a novel non-parametric modeling approach, leveraging the Gaussian process (GP), to characterize high-dimensional data by mapping it to a latent low-dimensional manifold. This model, named the latent discriminative generative decoder (LDGD), employs both the data and associated labels in the manifold discovery process. We derive a Bayesian solution to infer the latent variables, allowing LDGD to effectively capture the inherent stochasticity in the data. We demonstrate applications of LDGD on both synthetic and benchmark datasets. Not only does LDGD infer the manifold accurately, but its accuracy in predicting data points' labels also surpasses that of state-of-the-art approaches. In developing LDGD, we incorporated inducing points to reduce the computational complexity of Gaussian processes for large datasets, enabling batch training for improved efficiency and scalability. Additionally, we show that LDGD can robustly infer the manifold and precisely predict labels in scenarios where the data size is limited, demonstrating its capability to efficiently characterize high-dimensional data with few samples. These collective attributes highlight the importance of developing non-parametric modeling approaches for analyzing high-dimensional data.
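The abstract's inducing-point strategy follows the standard sparse GP recipe: approximate the full n x n kernel matrix through a small set of m << n inducing points, so training scales as O(nm^2) rather than O(n^3) and minibatch training becomes feasible. Below is a minimal NumPy sketch of that approximation (the Nystrom form Q_nn = K_nm K_mm^{-1} K_mn used in variational sparse GPs in the style of Titsias, 2009). The RBF kernel choice, dimensions, and variable names are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential (RBF) kernel between the rows of A and B."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return variance * np.exp(-0.5 * sq_dists / lengthscale**2)

rng = np.random.default_rng(0)
n, m, q = 500, 20, 2            # n data points, m inducing points, q latent dims
X = rng.normal(size=(n, q))     # latent inputs (LDGD infers these variationally)
Z = rng.normal(size=(m, q))     # inducing-point locations (learned in practice)

K_mm = rbf_kernel(Z, Z) + 1e-6 * np.eye(m)   # m x m, cheap to factor
K_nm = rbf_kernel(X, Z)                      # n x m cross-covariance

# Nystrom approximation Q_nn = K_nm K_mm^{-1} K_mn costs O(n m^2),
# replacing the O(n^3) factorization of the full n x n kernel matrix.
Q_nn = K_nm @ np.linalg.solve(K_mm, K_nm.T)
print(Q_nn.shape)   # (500, 500), but built from m-dimensional pieces
```

In a full model, Z, the kernel hyperparameters, and the variational distribution over the latent X would be optimized jointly by maximizing an evidence lower bound, with minibatches of data points enabling the batch training the abstract describes.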
