Heat Kernel Embedding

Updated 16 April 2026

Heat kernel embedding is a data representation technique that maps points from geometric spaces to Euclidean spaces using heat diffusion, preserving intrinsic distances.
It leverages spectral decomposition and Laplacian eigenfunctions to construct nearly isometric embeddings that capture both local and global geometric features.
Applications include manifold learning, clustering, and topological data analysis, with extensions to graphs and hypergraphs for scalable and robust analysis.

A heat kernel embedding is a geometric data representation originating from spectral and probabilistic analysis on manifolds, graphs, and higher-order complexes, which maps the original space into a Hilbert or Euclidean space via transformations derived from the heat equation or its discrete analogs. The core principle is that the propagation of “heat” or diffusion encodes intrinsic geometric, topological, and often multi-scale information about the underlying structure, enabling robust, nearly isometric, and highly expressive data embeddings. Applications range from manifold learning and clustering to topological and machine learning tasks.

1. Mathematical Foundations and Definitions

Let $(M,g)$ be a compact $n$-dimensional Riemannian manifold, or its discrete analog (a graph, hypergraph, or combinatorial complex). The heat kernel $K_t(x,y)$ is the fundamental solution to the heat equation: $\frac{\partial}{\partial t}\,u(t,y)=\Delta_g u(t,y),\qquad u(0,y)=\delta_x(y)$, where $\Delta_g$ is the Laplace–Beltrami operator on $M$ (Wang et al., 2013, Huguet et al., 2023).

Spectral expansion: The heat kernel admits a spectral decomposition

$K_t(x,y) = \sum_{i=0}^\infty e^{-\lambda_i t}\, \phi_i(x)\phi_i(y)$

where $-\Delta_g \phi_i = \lambda_i \phi_i$ with $0 = \lambda_0 \le \lambda_1 \le \cdots$ (so that the factors $e^{-\lambda_i t}$ decay), and $\{\phi_i\}$ forms an $L^2$-orthonormal eigenbasis.

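Embedding map: The canonical (unnormalized) heat kernel embedding is

$\Phi_t : M \to \ell^2,\qquad \Phi_t(x) = \big(e^{-\lambda_i t/2}\,\phi_i(x)\big)_{i\ge 1}$

and a normalized version is $\Psi_t = \sqrt{2}\,(4\pi)^{n/4}\,t^{(n+2)/4}\,\Phi_t$, so that $\Psi_t^{*} g_{\mathrm{can}} = g + O(t)$ as $t \to 0^+$ (Wang et al., 2013, Zhu, 2013).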

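On graphs, the heat kernel is replaced by the matrix exponential of the (combinatorial or normalized) Laplacian: $H_t = e^{-tL}$ with feature embedding for each vertex $v$

$\Phi_t(v) = \big(e^{-\lambda_1 t/2}\,\phi_1(v),\ \dots,\ e^{-\lambda_k t/2}\,\phi_k(v)\big)$

where $L\phi_i = \lambda_i \phi_i$ and $\phi_i^\top \phi_j = \delta_{ij}$ (Saito, 2022).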
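As a minimal sketch of the graph case, assuming the combinatorial Laplacian $L = D - A$ and a dense eigendecomposition, the following NumPy code builds vertex features weighted by $e^{-\lambda_i t/2}$ and checks that, when all eigenpairs are kept, their Gram matrix reproduces $e^{-tL}$. The function name `graph_heat_kernel_embedding` and the toy path graph are illustrative choices, not taken from the cited work.

```python
import numpy as np

def graph_heat_kernel_embedding(A, t, k=None):
    """Return (embedding, H_t) for a symmetric adjacency matrix A."""
    A = np.asarray(A, dtype=float)
    L = np.diag(A.sum(axis=1)) - A           # combinatorial Laplacian L = D - A
    lam, phi = np.linalg.eigh(L)             # ascending eigenvalues, orthonormal columns
    if k is None:
        k = len(lam)                         # keep every eigenpair by default
    emb = phi[:, :k] * np.exp(-0.5 * t * lam[:k])   # row v holds the features of vertex v
    H_t = (phi * np.exp(-t * lam)) @ phi.T          # heat kernel e^{-tL}
    return emb, H_t

# Toy example: path graph on 4 vertices.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
emb, H_t = graph_heat_kernel_embedding(A, t=0.5)

# With all eigenpairs kept, inner products of features equal heat kernel entries.
gram_error = np.abs(emb @ emb.T - H_t).max()
print(gram_error)
```

Truncating to `k` smaller than the number of vertices turns the identity into an approximation whose error is governed by the discarded factors $e^{-\lambda_i t}$.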

2. Geodesic Structure, Isometry, and Small-Time Asymptotics

A key property is that heat kernel embeddings approximately preserve intrinsic (geodesic) distances. Varadhan's formula gives, as $t \to 0^+$,

$-4t \log K_t(x,y) \;\longrightarrow\; d_g(x,y)^2$

where $d_g(x,y)$ is the geodesic distance on $M$ (Huguet et al., 2023, Zhou et al., 2020). The embedding thus recovers or approximates manifold distances for small diffusion times.
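Varadhan's small-time limit can be illustrated in the one setting where the heat kernel has a simple closed form. The sketch below assumes flat Euclidean space $\mathbb{R}^n$, where $K_t(x,y) = (4\pi t)^{-n/2} e^{-|x-y|^2/(4t)}$, and evaluates $-4t \log K_t(x,y)$ for decreasing $t$; the helper name and the sample points are hypothetical.

```python
import numpy as np

def log_euclidean_heat_kernel(x, y, t):
    """Log of the heat kernel on flat R^n, evaluated in log space to avoid underflow."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n = x.size
    d2 = np.sum((x - y) ** 2)
    return -0.5 * n * np.log(4.0 * np.pi * t) - d2 / (4.0 * t)

x = np.array([0.0, 0.0])
y = np.array([0.6, 0.8])            # Euclidean distance 1, so d^2 = 1

times = [1e-1, 1e-2, 1e-3, 1e-4]
estimates = [-4.0 * t * log_euclidean_heat_kernel(x, y, t) for t in times]
errors = [abs(e - 1.0) for e in estimates]

for t, e in zip(times, estimates):
    print(f"t = {t:g}: -4t log K_t = {e:.5f}")
```

In this flat case the gap to $d^2$ is exactly $2nt\log(4\pi t)$, which vanishes as $t \to 0^+$; on a curved manifold the correction terms differ but the limit is the same.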

Wang and Zhu constructed an intrinsic perturbation of this embedding, showing that for sufficiently small $t$ and dimension $q(t) \gg t^{-n/2}$, the truncated heat kernel embedding can be made nearly isometric and, with an appropriate correction, exactly isometric: a map $I_t : M \to \mathbb{R}^{q(t)}$ satisfying $I_t^{*} g_{\mathrm{can}} = g$ (Wang et al., 2013). Quantitative bounds depend on Ricci curvature bounds, injectivity radius, and volume (Portegies, 2013).

3. Spectral, Graph, and Hypergraph Embeddings

In data analysis, one computes the discrete heat kernel on a finite dataset (point cloud, graph, or hypergraph):

Construct a weighted adjacency/kernel matrix $A$ (often Gaussian or polynomial).
Build the Laplacian $L = D - A$, where $D$ is the diagonal degree matrix, and compute $H_t = e^{-tL}$.
Perform eigendecomposition and build the embedding from leading eigenpairs (Saito, 2022, Huguet et al., 2023).
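The three steps above can be sketched for a point cloud as follows. This is a minimal illustration, assuming a Gaussian kernel with bandwidth `sigma`, the unnormalized Laplacian, and a dense eigensolver; the function name, the parameter values, and the circle dataset are hypothetical and chosen only to keep the example small.

```python
import numpy as np

def heat_kernel_embedding(X, t, sigma, k):
    """Embed the rows of X using the k leading nontrivial Laplacian eigenpairs."""
    X = np.asarray(X, dtype=float)
    # Step 1: Gaussian kernel matrix on pairwise squared distances.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    A = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)                 # no self-loops
    # Step 2: unnormalized graph Laplacian.
    L = np.diag(A.sum(axis=1)) - A
    # Step 3: eigendecomposition; skip the constant eigenvector (eigenvalue 0).
    lam, phi = np.linalg.eigh(L)
    return phi[:, 1:k + 1] * np.exp(-0.5 * t * lam[1:k + 1])

# 60 evenly spaced points on the unit circle.
theta = np.linspace(0.0, 2.0 * np.pi, 60, endpoint=False)
X = np.column_stack([np.cos(theta), np.sin(theta)])
Y = heat_kernel_embedding(X, t=1.0, sigma=0.3, k=2)

radii = np.linalg.norm(Y, axis=1)
print(Y.shape, radii.min(), radii.max())
```

For this symmetric dataset the two leading nontrivial eigenvectors span the first Fourier modes, so the embedded points again lie on a circle. For large datasets a sparse neighbor graph and an iterative eigensolver would replace the dense calls used here.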

Hypergraph settings use a contraction (such as the star reduction) to build a vertex-by-vertex matrix $A$ from the incidence matrix and hyperedge weights, then proceed as for graphs to compute the Laplacian and heat kernel. Embeddings preserve multi-way similarities, mapping frequent co-occurrences in hyperedges to proximity in the embedding (Saito, 2022).
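A small sketch of the contraction step is given below. It assumes one common choice, $A = H W_e D_e^{-1} H^\top$ with the diagonal removed, where $H$ is the vertex-by-hyperedge incidence matrix, $W_e$ the hyperedge weights, and $D_e$ the hyperedge sizes. This formula is an assumption made for illustration and is not necessarily the exact reduction used in the cited work; the function name and toy hypergraph are likewise hypothetical.

```python
import numpy as np

def contracted_adjacency(H, w):
    """Contract a hypergraph to a weighted graph: A = H W_e D_e^{-1} H^T, zero diagonal."""
    H = np.asarray(H, dtype=float)       # shape (num_vertices, num_hyperedges)
    w = np.asarray(w, dtype=float)       # one weight per hyperedge
    d_e = H.sum(axis=0)                  # hyperedge sizes
    A = (H * (w / d_e)) @ H.T            # each hyperedge spreads its weight over its members
    np.fill_diagonal(A, 0.0)
    return A

# 5 vertices, 2 hyperedges: {0, 1, 2} and {2, 3, 4}.
H = np.array([[1, 0],
              [1, 0],
              [1, 1],
              [0, 1],
              [0, 1]])
A = contracted_adjacency(H, w=[1.0, 1.0])

# Proceed as for graphs: Laplacian, then heat kernel at t = 0.5.
L = np.diag(A.sum(axis=1)) - A
lam, phi = np.linalg.eigh(L)
heat = (phi * np.exp(-0.5 * lam)) @ phi.T

print(heat[0, 1], heat[0, 3])
```

Vertices 0 and 1 share a hyperedge while 0 and 3 do not, and the heat kernel entry for the first pair is correspondingly larger.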

In topological complexes, the Laplacian is generalized via incidence matrices of higher rank, yielding multiscale heat kernels and node descriptors (the "Heat Kernel Signature", HKS) that are both informative and permutation-equivariant (Krahn et al., 16 Jul 2025).
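For the lowest rank, the construction can be sketched with the node-to-edge incidence matrix $B_1$, whose product $B_1 B_1^\top$ is the usual graph Laplacian. The code below assumes the standard Heat Kernel Signature $\mathrm{HKS}_t(v) = \sum_i e^{-\lambda_i t}\phi_i(v)^2$ evaluated at a few illustrative time scales, and checks permutation equivariance by relabeling the nodes. Function and variable names are hypothetical, and higher-rank cells are not covered.

```python
import numpy as np

def heat_kernel_signature(L, times):
    """HKS matrix: entry (v, s) is sum_i exp(-lam_i * t_s) * phi_i(v)^2."""
    lam, phi = np.linalg.eigh(L)
    return (phi ** 2) @ np.exp(-np.outer(lam, times))

# Triangle {0, 1, 2} with a pendant node 3 attached to node 2.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
B1 = np.zeros((4, len(edges)))
for e, (u, v) in enumerate(edges):
    B1[u, e], B1[v, e] = -1.0, 1.0       # oriented node-to-edge incidence
L0 = B1 @ B1.T                            # rank-0 Laplacian (equals D - A)

times = np.array([0.1, 1.0, 10.0])        # multiple diffusion scales
hks = heat_kernel_signature(L0, times)

# Relabel nodes and recompute: the descriptor rows are permuted the same way.
perm = [2, 0, 3, 1]
hks_perm = heat_kernel_signature(L0[np.ix_(perm, perm)], times)
print(np.allclose(hks_perm, hks[perm]))
```

Because the signature is the diagonal of $e^{-tL}$, it does not depend on the choice of eigenbasis inside repeated eigenvalues.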

4. Diffusion Distance, Random Sketching, and Robustness

The Euclidean distance in the embedded space approximates the diffusion distance

$D_t(x,y)^2 = \big\|K_t(x,\cdot) - K_t(y,\cdot)\big\|_{L^2}^2 = \sum_{i\ge 0} e^{-2\lambda_i t}\,\big(\phi_i(x)-\phi_i(y)\big)^2,$

which measures similarity of the diffusion profiles and is stable under perturbations (Gilbert et al., 2024).
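The relation between embedded and diffusion distances can be checked numerically on a small graph. In this sketch the diffusion distance at time $t$ is taken to be the $\ell^2$ distance between rows of $e^{-tL}$, which is one common normalization and an assumption here; with that choice, spectral coordinates weighted by $e^{-\lambda_i t}$ reproduce it exactly when no eigenpairs are dropped. The random weighted graph is purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.random((8, 8))
A = (W + W.T) / 2.0                      # symmetric random weights
np.fill_diagonal(A, 0.0)
L = np.diag(A.sum(axis=1)) - A
lam, phi = np.linalg.eigh(L)

t = 0.3
H_t = (phi * np.exp(-t * lam)) @ phi.T   # heat kernel e^{-tL}
emb = phi * np.exp(-t * lam)             # spectral coordinates with weights exp(-lam_i * t)

def diffusion_distance(u, v):
    """L2 distance between the diffusion profiles started at u and at v."""
    return np.linalg.norm(H_t[u] - H_t[v])

def embedded_distance(u, v):
    """Euclidean distance between the embedded points."""
    return np.linalg.norm(emb[u] - emb[v])

gap = abs(diffusion_distance(0, 5) - embedded_distance(0, 5))
print(gap)
```

Keeping only the leading columns of `emb` gives a lower bound on the diffusion distance, with the discarded terms damped by $e^{-2\lambda_i t}$.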

Heat kernel embeddings can be efficiently approximated by Gaussian process sketching, using the heat kernel as the covariance: $x \mapsto \frac{1}{\sqrt{k}}\big(f_1(x),\dots,f_k(x)\big)$ with $f_1,\dots,f_k \sim \mathcal{GP}(0, K_t)$ drawn independently, with nonasymptotic distortion bounds and high robustness to kernel perturbations and outliers. The embedding preserves pairwise diffusion distances in expectation, and random sketching enables scalable computation (Gilbert et al., 2024).
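A minimal sketch of the sketching idea on a finite graph follows. It assumes the embedding is formed from $k$ independent Gaussian vectors with covariance $H_t = e^{-tL}$, scaled by $1/\sqrt{k}$, so that the expected squared distance between two embedded points equals $H_t(u,u) + H_t(v,v) - 2H_t(u,v)$. The function name, the ring graph, and the choice of $k$ are illustrative and not taken from the cited paper.

```python
import numpy as np

def gp_sketch_embedding(H_t, k, rng):
    """Embed N points in R^k using k independent draws from N(0, H_t)."""
    lam, phi = np.linalg.eigh(H_t)
    root = phi * np.sqrt(np.clip(lam, 0.0, None))   # root @ root.T equals H_t up to rounding
    G = rng.standard_normal((H_t.shape[0], k))      # i.i.d. standard normal sketch matrix
    return root @ G / np.sqrt(k)

# Heat kernel of a ring graph on 10 vertices at t = 1.
N = 10
A = np.roll(np.eye(N), 1, axis=1) + np.roll(np.eye(N), -1, axis=1)
L = np.diag(A.sum(axis=1)) - A
lam, phi = np.linalg.eigh(L)
H_t = (phi * np.exp(-1.0 * lam)) @ phi.T

rng = np.random.default_rng(0)
Y = gp_sketch_embedding(H_t, k=20000, rng=rng)

u, v = 0, 3
sq_sketch = np.sum((Y[u] - Y[v]) ** 2)
sq_exact = H_t[u, u] + H_t[v, v] - 2.0 * H_t[u, v]
rel_error = abs(sq_sketch - sq_exact) / sq_exact
print(rel_error)
```

The large `k` is used only to make the agreement visible in a single run; in practice a much smaller sketch dimension trades accuracy for cost, and the fluctuation of each squared distance scales like $\sqrt{2/k}$.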

5. Algorithmic and Theoretical Guarantees

The heat-kernel-based embeddings above admit explicit algorithmic recipes, with approximations via eigenfunction or finite landmark truncations controlled by geometric parameters. On manifolds with Ricci curvature and injectivity radius bounds, one can select the diffusion time $t$ and the number of landmarks or eigenfunctions $N$ to guarantee near-isometric, injective, finite-dimensional embeddings (Portegies, 2013, Lin, 2021).

On discrete structures, Chebyshev or backward Euler methods for approximating $e^{-tL}$ provide practical scalability even for large datasets (Huguet et al., 2023).
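The backward Euler option can be sketched as repeated linear solves, $e^{-tL} \approx (I + \tfrac{t}{m}L)^{-m}$ for $m$ steps. The example below is a dense toy version on a small path graph, compared against the exact kernel from an eigendecomposition; the function name and step counts are hypothetical, and a scalable implementation would use sparse matrices and a sparse or iterative solver instead.

```python
import numpy as np

def heat_backward_euler(L, U0, t, steps):
    """Approximate e^{-tL} @ U0 with `steps` implicit (backward) Euler steps."""
    M = np.eye(L.shape[0]) + (t / steps) * L
    U = np.array(U0, dtype=float)
    for _ in range(steps):
        U = np.linalg.solve(M, U)        # one step: (I + dt * L) U_new = U_old
    return U

# Path graph on 6 vertices.
n = 6
A = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
L = np.diag(A.sum(axis=1)) - A

# Exact heat kernel for reference.
lam, phi = np.linalg.eigh(L)
t = 1.0
exact = (phi * np.exp(-t * lam)) @ phi.T

err_coarse = np.abs(heat_backward_euler(L, np.eye(n), t, 10) - exact).max()
err_fine = np.abs(heat_backward_euler(L, np.eye(n), t, 200) - exact).max()
print(err_coarse, err_fine)
```

The scheme is unconditionally stable and first-order accurate in the step size, so the error shrinks roughly in proportion to $1/m$.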

The heat kernel embedding is maximally expressive on combinatorial complexes: the Laplacian spectrum and hence the HKS descriptor uniquely characterize non-isomorphic structures. This permits universal discriminability for topological deep learning frameworks and provable separation of structures beyond the 1-WL graph isomorphism hierarchy (Krahn et al., 16 Jul 2025).

6. Applications in Machine Learning, Topological Data Analysis, and Geometry

Heat kernel embeddings have been incorporated into:

Dimensionality reduction and manifold learning (Diffusion Maps, PHATE, SNE/t-SNE analogs) (Huguet et al., 2023, Zhou et al., 2020).
Topological graph and complex classification, via HKS features in attention mechanisms and transformers for molecular property prediction and complex recognition, yielding both superior accuracy and orders-of-magnitude speedup over previous higher-order message passing schemes (Krahn et al., 16 Jul 2025).
Implicit manifold learning, with learned kernels used for unsupervised representation, generative modeling (MMD-GAN, SMMD-GAN), and Bayesian inference (SVGD), utilizing Wasserstein gradient flows to parameterize the heat kernel (Zhou et al., 2020).
Hypergraph clustering by spectral relaxations that reduce to heat kernel embeddings on the contracted Laplacian, enabling multi-way similarity preservation (Saito, 2022).

7. Extensions: Vector Heat Kernel and Connection Laplacian Embeddings

Embeddings via the connection Laplacian and its heat kernel generalize the scalar case, producing the Vector Diffusion Map and associated “vector diffusion distance”. These constructions map points into $\ell^2$ via inner products of eigenvector fields, capturing both geometry and tangent bundle structure (Wu, 2013, Lin, 2021). Under additional geometric regularity, these embeddings yield nearly isometric finite-dimensional Euclidean embeddings, with explicit quantitative dependence on Ricci curvature bounds, injectivity radius, and volume.

Table: Principal Heat Kernel Embedding Variants

Model/Class	Laplacian/Operator	Embedding Map
Riemannian manifold	Laplace–Beltrami	$x \mapsto \big(e^{-\lambda_i t/2}\phi_i(x)\big)_{i\ge 1}$
Graph	Combinatorial/normalized	$v \mapsto \big(e^{-\lambda_i t/2}\phi_i(v)\big)_{i=1}^{k}$
Hypergraph	Contracted H-Laplacian	$v \mapsto \big(e^{-\lambda_i t/2}\phi_i(v)\big)_{i=1}^{k}$ (eigenpairs of the contracted Laplacian)
Complex (TopoHKS)	Combinatorial Laplacian	$\mathrm{HKS}_t(v) = \sum_i e^{-\lambda_i t}\phi_i(v)^2$
Vector Diffusion Map	Connection Laplacian	$x \mapsto \big(e^{-(\lambda_i+\lambda_j)t/2}\langle X_i(x), X_j(x)\rangle\big)_{i,j\ge 1}$

All approaches leverage the spectral structure of the relevant Laplacian to encode geometric, multi-scale, or topological information. The embedding time parameter $t$ allows interpolation between local geometry (small $t$) and global diffusion behavior (large $t$).

References:

(Wang et al., 2013) Wang–Zhu, "Isometric embeddings via heat kernel"
(Wu, 2013) Wu, "Embedding Riemannian Manifolds by the Heat Kernel of the Connection Laplacian"
(Zhu, 2013) Zhu, "High-jet relations of the heat kernel embedding map and applications"
(Portegies, 2013) Portegies, "Embeddings of Riemannian manifolds with heat kernels and eigenfunctions"
(Zhou et al., 2020) Zhou et al., "Learning Manifold Implicitly via Explicit Heat-Kernel Learning"
(Saito, 2022) "Hypergraph Modeling via Spectral Embedding Connection: Hypergraph Cut, Weighted Kernel $k$-means, and Heat Kernel"
(Huguet et al., 2023) "A Heat Diffusion Perspective on Geodesic Preserving Dimensionality Reduction"
(Gilbert et al., 2024) "Sketching the Heat Kernel: Using Gaussian Processes to Embed Data"
(Krahn et al., 16 Jul 2025) "Heat Kernel Goes Topological"
(Lin, 2021) Lin, "Manifold embeddings by heat kernels of connection Laplacian"