
Sketching the Heat Kernel: Using Gaussian Processes to Embed Data (2403.07929v1)

Published 1 Mar 2024 in cs.LG, cs.NA, math.NA, and stat.ML

Abstract: This paper introduces a novel, non-deterministic method for embedding data in low-dimensional Euclidean space based on computing realizations of a Gaussian process depending on the geometry of the data. This type of embedding first appeared in (Adler et al., 2018) as a theoretical model for a generic manifold in high dimensions. In particular, we take the covariance function of the Gaussian process to be the heat kernel, and computing the embedding amounts to sketching a matrix representing the heat kernel. The Karhunen-Loève expansion reveals that the straight-line distances in the embedding approximate the diffusion distance in a probabilistic sense, avoiding the need for sharp cutoffs and maintaining some of the smaller-scale structure. Our method demonstrates a further advantage in its robustness to outliers. We justify the approach with both theory and experiments.
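
The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a symmetrically normalized Gaussian affinity matrix `A` as a stand-in for a discretized heat kernel; the function names `heat_kernel_proxy` and `gp_sketch_embedding`, the bandwidth `eps`, and the embedding dimension `k` are all hypothetical choices made for illustration. Multiplying `A` by a random Gaussian matrix yields columns that are realizations of a Gaussian process with covariance proportional to `A @ A`, so the expected squared distance between two embedded points equals the squared distance between the corresponding rows of `A`, a diffusion-type distance.

```python
import numpy as np


def heat_kernel_proxy(X, eps):
    """Symmetric PSD stand-in for a heat kernel matrix (an assumption of this
    sketch, not necessarily the paper's construction)."""
    # Pairwise squared Euclidean distances between the rows of X.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq / eps)           # Gaussian affinities
    d = W.sum(axis=1)               # degrees
    # Symmetric normalization D^{-1/2} W D^{-1/2}; congruent to W, so still PSD.
    return W / np.sqrt(np.outer(d, d))


def gp_sketch_embedding(A, k, rng):
    """Embed n points in R^k by sketching A with a Gaussian test matrix."""
    n = A.shape[0]
    G = rng.standard_normal((n, k))  # i.i.d. N(0, 1) entries
    # Each column of A @ G is a Gaussian vector with covariance A @ A.
    return A @ G / np.sqrt(k)


rng = np.random.default_rng(0)
X = rng.standard_normal((30, 3))     # toy data: 30 points in R^3
A = heat_kernel_proxy(X, eps=2.0)
Y = gp_sketch_embedding(A, k=2000, rng=rng)

# Compare one embedded squared distance with the row distance of A.
embedded = np.sum((Y[0] - Y[1]) ** 2)
target = np.sum((A[0] - A[1]) ** 2)
print(embedded, target)
```

The embedding is random, so the two printed values agree only approximately; the relative fluctuation for a fixed pair shrinks like `sqrt(2 / k)`. A large `k` is used here only to make that agreement visible.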

References (21)
  1. Convergence of the reach for a sequence of Gaussian-embedded manifolds. Probab. Theory Related Fields, 171(3-4):1045–1091, 2018.
  2. Random fields and geometry, volume 80. Springer, 2007.
  3. Jonathan Bates. The embedding dimension of Laplacian eigenfunction maps. Appl. Comput. Harmon. Anal., 37(3):516–530, 2014.
  4. Laplacian eigenmaps for dimensionality reduction and data representation. Neural computation, 15(6):1373–1396, 2003.
  5. Embedding Riemannian manifolds by their heat kernel. Geom. Funct. Anal., 4(4):373–398, 1994.
  6. Diffusion maps for changing data. Appl. Comput. Harmon. Anal., 36(1):79–107, 2014.
  7. Diffusion maps. Appl. Comput. Harmon. Anal., 21(1):5–30, 2006.
  8. On the convergence rate of Sinkhorn's algorithm, 2022.
  9. Universal local parametrizations via heat kernels and eigenfunctions of the Laplacian. Ann. Acad. Sci. Fenn. Math., 35(1):131–174, 2010.
  10. Philip A. Knight. The Sinkhorn-Knopp algorithm: convergence and applications. SIAM J. Matrix Anal. Appl., 30(1):261–275, 2008.
  11. The intrinsic geometry of some random manifolds. Electron. Commun. Probab., 22:Paper No. 1, 12, 2017.
  12. Stephane S. Lafon. Diffusion maps and geometric harmonics. ProQuest LLC, Ann Arbor, MI, 2004. Thesis (Ph.D.)–Yale University.
  13. Doubly stochastic normalization of the Gaussian kernel is robust to heteroskedastic noise. SIAM J. Math. Data Sci., 3(1):388–413, 2021.
  14. Spectral methods for uncertainty quantification. Scientific Computation. Springer, New York, 2010. With applications to computational fluid dynamics.
  15. Probability in Banach spaces. Classics in Mathematics. Springer-Verlag, Berlin, 2011. Isoperimetry and processes, Reprint of the 1991 edition.
  16. John M. Lee. Introduction to smooth manifolds, volume 218 of Graduate Texts in Mathematics. Springer, New York, second edition, 2013.
  17. Manifold learning with bi-stochastic kernels. IMA J. Appl. Math., 84(3):455–482, 2019.
  18. Per-Gunnar Martinsson. Randomized methods for matrix computations, 2019.
  19. Stephen Semmes. On the nonexistence of bi-Lipschitz parameterizations and geometric problems about $A_{\infty}$-weights. Rev. Mat. Iberoamericana, 12(2):337–410, 1996.
  20. K. T. Sturm. Diffusion processes and heat kernels on metric spaces. Ann. Probab., 26(1):1–55, 1998.
  21. Roman Vershynin. High-dimensional probability, volume 47 of Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, 2018. An introduction with applications in data science, With a foreword by Sara van de Geer.
