Curvature-Based Graph Kernels

Updated 5 February 2026
  • Curvature-based graph kernels are techniques that compare graph structures using discrete Ollivier–Ricci curvature derived from optimal transport on local neighborhoods.
  • They summarize edge curvatures into histograms and employ Gaussian RBF kernels, enabling effective classification of unlabeled graphs.
  • Random sampling of edges and efficient optimal-transport solvers ensure scalability while achieving competitive accuracy on benchmarks such as MUTAG and PTC.

Curvature-based graph kernels provide a topology-driven measure for graph comparison and classification by leveraging distributions of discrete Ricci curvature over edges. This approach, as introduced by Liu et al. (2020), employs the Ollivier–Ricci curvature, computed via optimal transport on local neighborhoods, to generate global vector representations of graphs suitable for use with kernel methods such as support vector machines. The signature is purely topological, making it effective in settings without node or edge attributes, and is particularly competitive on benchmarks of unlabeled graphs (Liu et al., 2019).

1. Mathematical Foundations

Let G=(V,E) denote a graph, possibly weighted or unweighted. For each edge e=(u,v) ∈ E, the construction is predicated on comparing the neighborhoods of u and v using optimal transport.

1.1. Neighborhood Mass Distributions

For a node u, define its 1-hop neighborhood as N(u). Construct a probability distribution m_u over N(u), commonly uniform:

m_u(x) = \begin{cases} 1/\deg(u) & \text{if } x \in N(u) \\ 0 & \text{otherwise} \end{cases}

For weighted graphs, m_u(x) can be proportional to weight(u,x).
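The uniform and weighted neighborhood distributions described above can be sketched in a few lines of Python. The adjacency-dict representation and the function name here are illustrative choices, not anything prescribed by the paper:

```python
def neighborhood_distribution(adj, u, weights=None):
    """Probability distribution m_u over the 1-hop neighborhood N(u).

    adj: dict mapping each node to a set of its neighbors.
    weights: optional dict mapping (node, node) pairs to positive
             edge weights; if given, m_u(x) is proportional to
             weight(u, x) instead of being uniform.
    """
    nbrs = adj[u]
    if weights is None:
        # Unweighted case: uniform mass 1/deg(u) on each neighbor.
        return {x: 1.0 / len(nbrs) for x in nbrs}
    total = sum(weights[(u, x)] for x in nbrs)
    return {x: weights[(u, x)] / total for x in nbrs}

# Example on a triangle graph (hypothetical data):
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
m0 = neighborhood_distribution(adj, 0)
```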

1.2. Earth Mover's Distance (EMD)

The distance metric between m_u and m_v is the Earth Mover's Distance (Wasserstein-1):

W(u,v) = \min_{\xi \geq 0} \sum_{x \in N(u)} \sum_{y \in N(v)} \xi(x,y) \cdot d_G(x,y)

subject to

\sum_{y} \xi(x,y) = m_u(x) \quad \forall x \in N(u), \qquad \sum_{x} \xi(x,y) = m_v(y) \quad \forall y \in N(v)

where d_G(x,y) is the graph shortest-path metric.
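The transport problem above is a standard linear program, so one way to sketch it is with `scipy.optimize.linprog` (SciPy is an assumption of this sketch; the paper does not prescribe a solver). The variables ξ(x,y) are flattened row-major into a single vector:

```python
from itertools import product
from scipy.optimize import linprog

def emd(mu, mv, dist):
    """Wasserstein-1 distance between distributions mu and mv
    (dicts node -> mass), given ground distances dist[(x, y)]."""
    xs, ys = list(mu), list(mv)
    # Cost vector over the flattened transport plan xi(x, y).
    c = [dist[(x, y)] for x, y in product(xs, ys)]
    A_eq, b_eq = [], []
    # Row marginals: sum_y xi(x, y) = mu(x).
    for i, x in enumerate(xs):
        A_eq.append([1.0 if k // len(ys) == i else 0.0 for k in range(len(c))])
        b_eq.append(mu[x])
    # Column marginals: sum_x xi(x, y) = mv(y).
    for j, y in enumerate(ys):
        A_eq.append([1.0 if k % len(ys) == j else 0.0 for k in range(len(c))])
        b_eq.append(mv[y])
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    return res.fun

# Example: two half-masses one hop from a single target mass.
mu = {"a": 0.5, "b": 0.5}
mv = {"c": 1.0}
dist = {("a", "c"): 1.0, ("b", "c"): 1.0}
w = emd(mu, mv, dist)
```

For the small neighborhoods that arise per edge, the LP is tiny; dedicated EMD solvers are only needed at scale.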

1.3. Ollivier–Ricci Curvature

The edge curvature is defined as

\kappa(u,v) = 1 - \frac{W(u,v)}{d_G(u,v)}

For unweighted graphs, d_G(u,v) = 1, giving κ(u,v) = 1 − W(u,v). Intuitively:

  • κ(u,v) > 0 indicates well-connected local neighborhoods.
  • κ(u,v) < 0 reveals edges acting as bridges between poorly connected regions.
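The definitions above can be combined into a self-contained sketch for small unweighted graphs. Instead of an LP solver, this sketch splits each uniform measure into equal unit masses and brute-forces the optimal assignment, which is exact but only practical for small degrees; all names and the example graphs are illustrative:

```python
from collections import deque
from itertools import permutations
from math import gcd, inf

def bfs_dist(adj, src):
    """Shortest-path distances from src via BFS (unweighted graph)."""
    d = {src: 0}
    q = deque([src])
    while q:
        x = q.popleft()
        for y in adj[x]:
            if y not in d:
                d[y] = d[x] + 1
                q.append(y)
    return d

def ollivier_ricci(adj, u, v):
    """kappa(u, v) = 1 - W(m_u, m_v) for uniform 1-hop measures.

    Each measure is split into L = lcm(deg u, deg v) unit masses and
    W1 is minimized over all assignments (brute force)."""
    nu, nv = sorted(adj[u]), sorted(adj[v])
    L = len(nu) * len(nv) // gcd(len(nu), len(nv))
    src = [x for x in nu for _ in range(L // len(nu))]
    dst = [y for y in nv for _ in range(L // len(nv))]
    dists = {x: bfs_dist(adj, x) for x in set(src)}
    best = inf
    for perm in permutations(range(L)):
        best = min(best, sum(dists[src[i]][dst[j]] for i, j in enumerate(perm)))
    return 1.0 - best / L

# A triangle edge is positively curved; a bridge between two
# star hubs is negatively curved.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
stars = {"u": {"a1", "a2", "v"}, "v": {"u", "b1", "b2"},
         "a1": {"u"}, "a2": {"u"}, "b1": {"v"}, "b2": {"v"}}
k_tri = ollivier_ricci(triangle, 0, 1)
k_star = ollivier_ricci(stars, "u", "v")
```

Working the triangle case by hand: half of m_u's mass already sits on the shared neighbor (cost 0) and the other half moves one hop, so W = 1/2 and κ = 1/2, matching the "well-connected neighborhoods" intuition.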

2. Kernel Construction Methodology

2.1. Curvature Distribution Vector

The set {κ(e) : e ∈ E(G)} forms the multiset of edge curvatures for G. This is summarized as a histogram D(G) = (D_1, …, D_b) by partitioning a range [κ_min, κ_max] into b bins; on unweighted graphs every κ(e) lies in [−2, 1], though a narrower range is often sufficient in practice:

D_i(G) = \frac{|\{e : \kappa(e) \text{ lies in bin } i\}|}{|E(G)|}

Alternate representations include storing the sorted κ(e) values or using higher-dimensional histograms of edge pairs for improved expressiveness.

2.2. Gaussian RBF Kernel

Given two graphs G and H with curvature histograms D(G), D(H) ∈ ℝ^b, similarity is quantified by the Gaussian RBF kernel:

K(G,H) = \exp\Bigl( -\frac{1}{2\sigma^2} \sum_{i=1}^{b} \bigl(D_i(G) - D_i(H)\bigr)^2 \Bigr)

where σ > 0 is a bandwidth parameter. This kernel is positive definite and directly compatible with support vector machine frameworks.
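The kernel itself is a one-liner over the histogram vectors; the following sketch mirrors the formula above (function and parameter names are illustrative):

```python
from math import exp

def rbf_kernel(D_G, D_H, sigma=1.0):
    """Gaussian RBF kernel K(G, H) between two curvature histograms,
    both of length b, with bandwidth sigma > 0."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(D_G, D_H))
    return exp(-sq_dist / (2.0 * sigma ** 2))

# Identical histograms give similarity 1; disjoint ones decay with sigma.
k_same = rbf_kernel([1.0, 0.0], [1.0, 0.0])
k_diff = rbf_kernel([1.0, 0.0], [0.0, 1.0])
```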

3. Computational and Scaling Considerations

3.1. Exact Algorithmic Workflow

For every edge e = (u,v) in G:

  • Construct m_u and m_v over N(u) and N(v).
  • Solve the optimal transport linear program for W(u,v), with complexity O(n³) for n = |N(u)| + |N(v)|.
  • Compute κ(u,v) and incorporate it into D(G).

The full graph computation thus scales as O(|E(G)| · n³) in time and O(|E(G)|) in memory.

3.2. Random Sampling for Scalability

For large graphs, instead of exhaustive calculation, uniformly sample S = O((1/ε²) log(1/δ)) edges. The empirical histogram D̂(G) over sampled edges achieves

\|\widehat{D}(G) - D(G)\|_\infty \leq \epsilon

with probability at least 1 − δ. The necessary S is independent of |E(G)|, facilitating analysis of graphs with billions of edges in O(S · n³) time.
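A sketch of the sampling step, assuming the constant hidden in the O-notation is 1 (an assumption of this sketch, not a bound from the paper; real deployments would tune it):

```python
import random
from math import ceil, log

def sample_size(eps, delta):
    """S = ceil((1 / eps^2) * ln(1 / delta)); the constant factor
    hidden by the O-notation is taken to be 1 here (an assumption)."""
    return ceil(log(1.0 / delta) / eps ** 2)

def sampled_curvatures(edges, curvature_fn, eps=0.1, delta=0.01, seed=0):
    """Curvatures of a uniform random edge sample of size S,
    capped at |E(G)| for small graphs."""
    rng = random.Random(seed)
    s = min(sample_size(eps, delta), len(edges))
    return [curvature_fn(e) for e in rng.sample(edges, s)]

# Hypothetical edge list with a dummy curvature function.
edges = [(i, i + 1) for i in range(1000)]
vals = sampled_curvatures(edges, lambda e: 0.0)
```

Note that S depends only on ε and δ, which is exactly why the sample size stays fixed as |E(G)| grows.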

3.3. Kernel Matrix Computation

Once the histograms D(G_i) are available for a set of N graphs, the full kernel matrix [K(G_i, G_j)] is computed in O(N² · b) operations, efficient for N up to several thousand.
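Assembling the Gram matrix is then two nested loops over histograms; a minimal sketch (names illustrative), whose output could be passed to an SVM that accepts precomputed kernels:

```python
from math import exp

def gram_matrix(histograms, sigma=1.0):
    """N x N Gaussian RBF kernel matrix over curvature histograms,
    computed in O(N^2 * b) operations for histograms of length b."""
    def k(d1, d2):
        sq = sum((a - b) ** 2 for a, b in zip(d1, d2))
        return exp(-sq / (2.0 * sigma ** 2))
    return [[k(di, dj) for dj in histograms] for di in histograms]

# Three toy histograms; the result is symmetric with a unit diagonal.
hists = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
K = gram_matrix(hists)
```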

4. Empirical Evaluation

4.1. Datasets

Experiments utilize:

  • Synthetic graphs (Erdős–Rényi, Barabási–Albert generators)
  • Bioinformatics benchmarks (MUTAG, PTC, etc.)
  • Internet AS-level topologies

4.2. Baseline Methods

The curvature-based kernel is compared against common kernels based on random walks, shortest-path distances, subtree patterns, and simple cycles.

4.3. Performance Metrics

Empirical evaluation focuses on:

  • Classification accuracy (SVMs trained on kernel matrices)
  • Running time and ability to process large numbers of edges

4.4. Results and Observations

  • Comparable or superior classification accuracy is observed on unlabeled benchmarks.
  • The distribution of curvature values distinguishes graph structural features, specifically high prevalence of negative curvature in bridge-dominated structures versus positive curvature in densely interconnected regions.
  • Random sampling permits scaling to graphs with millions of edges in minutes, where traditional methods typically become computationally infeasible.
  • Enriching the feature set with 2D histograms of curvature on adjacent edge pairs further improves discriminative power on datasets such as MUTAG.

5. Conclusions and Limitations

Summary of Approach

  • Ollivier–Ricci curvature provides a compact, purely topological signature for graph structure.
  • The curvature kernel is positive definite by construction, compatible with standard kernel methods.
  • Sampling theory provides guarantees for approximating the true curvature distribution efficiently.

Identified Limitations

  • The method does not utilize node or edge labels; only topology is encoded.
  • Statistical stability of curvature distributions deteriorates for very small graphs.
  • The run time of the exact EMD calculation is prohibitive for high-degree neighborhoods; reliance on sampling mitigates this.
  • Kernel parameters (bin count b, range [κ_min, κ_max], and RBF bandwidth σ) require tuning.

A plausible implication is that this methodology is best suited for unlabeled, medium-to-large graphs where structural topology alone is discriminative (Liu et al., 2019).
