Curvature-Based Graph Kernels

Updated 5 February 2026
  • Curvature-based graph kernels are techniques that compare graph structures using discrete Ollivier–Ricci curvature derived from optimal transport on local neighborhoods.
  • They summarize edge curvatures into histograms and employ Gaussian RBF kernels, enabling effective classification of unlabeled graphs.
  • Random sampling of edges and efficient optimal-transport solvers ensure scalability while achieving competitive accuracy on benchmarks such as MUTAG and PTC.

Curvature-based graph kernels provide a topology-driven measure for graph comparison and classification by leveraging distributions of discrete Ricci curvature over edges. This approach, as introduced by Liu et al. (2020), employs the Ollivier–Ricci curvature, computed via optimal transport on local neighborhoods, to generate global vector representations of graphs suitable for use with kernel methods such as support vector machines. The signature is purely topological, making it effective in settings without node or edge attributes, and is particularly competitive on benchmarks of unlabeled graphs (Liu et al., 2019).

1. Mathematical Foundations

Let G=(V,E) denote a graph, possibly weighted or unweighted. For each edge e=(u,v) ∈ E, the construction is predicated on comparing the neighborhoods of u and v using optimal transport.

1.1. Neighborhood Mass Distributions

For a node u, define its 1-hop neighborhood as N(u). Construct a probability distribution m_u over N(u), commonly uniform:

m_u(x) = \begin{cases} 1/\deg(u) & \text{if } x \in N(u) \\ 0 & \text{otherwise} \end{cases}

For weighted graphs, m_u(x) can be proportional to weight(u,x).
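The uniform and weighted neighborhood distributions described above can be sketched in a few lines of Python. The adjacency-dict representation and the function name here are illustrative choices, not anything prescribed by the paper:

```python
def neighborhood_distribution(adj, u, weights=None):
    """Probability distribution m_u over the 1-hop neighborhood N(u).

    adj: dict mapping each node to a set of its neighbors.
    weights: optional dict mapping (node, node) pairs to positive
             edge weights; if given, m_u(x) is proportional to
             weight(u, x) instead of being uniform.
    """
    nbrs = adj[u]
    if weights is None:
        # Unweighted case: uniform mass 1/deg(u) on each neighbor.
        return {x: 1.0 / len(nbrs) for x in nbrs}
    total = sum(weights[(u, x)] for x in nbrs)
    return {x: weights[(u, x)] / total for x in nbrs}

# Example on a triangle graph (hypothetical data):
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
m0 = neighborhood_distribution(adj, 0)
```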

1.2. Earth Mover's Distance (EMD)

The distance metric between m_u and m_v is the Earth Mover's Distance (Wasserstein-1):

W(u,v) = \min_{\xi \geq 0} \sum_{x \in N(u)} \sum_{y \in N(v)} \xi(x,y) \cdot d_G(x,y)

subject to

\sum_{y} \xi(x,y) = m_u(x) \quad \forall x \in N(u), \qquad \sum_{x} \xi(x,y) = m_v(y) \quad \forall y \in N(v)

where d_G(x,y) is the graph shortest-path metric.
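The transport problem above is a standard linear program, so one way to sketch it is with `scipy.optimize.linprog` (SciPy is an assumption of this sketch; the paper does not prescribe a solver). The variables ξ(x,y) are flattened row-major into a single vector:

```python
from itertools import product
from scipy.optimize import linprog

def emd(mu, mv, dist):
    """Wasserstein-1 distance between distributions mu and mv
    (dicts node -> mass), given ground distances dist[(x, y)]."""
    xs, ys = list(mu), list(mv)
    # Cost vector over the flattened transport plan xi(x, y).
    c = [dist[(x, y)] for x, y in product(xs, ys)]
    A_eq, b_eq = [], []
    # Row marginals: sum_y xi(x, y) = mu(x).
    for i, x in enumerate(xs):
        A_eq.append([1.0 if k // len(ys) == i else 0.0 for k in range(len(c))])
        b_eq.append(mu[x])
    # Column marginals: sum_x xi(x, y) = mv(y).
    for j, y in enumerate(ys):
        A_eq.append([1.0 if k % len(ys) == j else 0.0 for k in range(len(c))])
        b_eq.append(mv[y])
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    return res.fun

# Example: two half-masses one hop from a single target mass.
mu = {"a": 0.5, "b": 0.5}
mv = {"c": 1.0}
dist = {("a", "c"): 1.0, ("b", "c"): 1.0}
w = emd(mu, mv, dist)
```

For the small neighborhoods that arise per edge, the LP is tiny; dedicated EMD solvers are only needed at scale.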

1.3. Ollivier–Ricci Curvature

The edge curvature is defined as

\kappa(u,v) = 1 - \frac{W(u,v)}{d_G(u,v)}

For unweighted graphs, d_G(u,v) = 1, giving κ(u,v) = 1 − W(u,v). Intuitively:

  • κ(u,v) > 0 indicates well-connected local neighborhoods.
  • κ(u,v) < 0 reveals edges acting as bridges between poorly connected regions.
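The definitions above can be combined into a self-contained sketch for small unweighted graphs. Instead of an LP solver, this sketch splits each uniform measure into equal unit masses and brute-forces the optimal assignment, which is exact but only practical for small degrees; all names and the example graphs are illustrative:

```python
from collections import deque
from itertools import permutations
from math import gcd, inf

def bfs_dist(adj, src):
    """Shortest-path distances from src via BFS (unweighted graph)."""
    d = {src: 0}
    q = deque([src])
    while q:
        x = q.popleft()
        for y in adj[x]:
            if y not in d:
                d[y] = d[x] + 1
                q.append(y)
    return d

def ollivier_ricci(adj, u, v):
    """kappa(u, v) = 1 - W(m_u, m_v) for uniform 1-hop measures.

    Each measure is split into L = lcm(deg u, deg v) unit masses and
    W1 is minimized over all assignments (brute force)."""
    nu, nv = sorted(adj[u]), sorted(adj[v])
    L = len(nu) * len(nv) // gcd(len(nu), len(nv))
    src = [x for x in nu for _ in range(L // len(nu))]
    dst = [y for y in nv for _ in range(L // len(nv))]
    dists = {x: bfs_dist(adj, x) for x in set(src)}
    best = inf
    for perm in permutations(range(L)):
        best = min(best, sum(dists[src[i]][dst[j]] for i, j in enumerate(perm)))
    return 1.0 - best / L

# A triangle edge is positively curved; a bridge between two
# star hubs is negatively curved.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
stars = {"u": {"a1", "a2", "v"}, "v": {"u", "b1", "b2"},
         "a1": {"u"}, "a2": {"u"}, "b1": {"v"}, "b2": {"v"}}
k_tri = ollivier_ricci(triangle, 0, 1)
k_star = ollivier_ricci(stars, "u", "v")
```

Working the triangle case by hand: half of m_u's mass already sits on the shared neighbor (cost 0) and the other half moves one hop, so W = 1/2 and κ = 1/2, matching the "well-connected neighborhoods" intuition.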

2. Kernel Construction Methodology

2.1. Curvature Distribution Vector

The set {κ(e) : e ∈ E(G)} forms the multiset of edge curvatures for G. This is summarized as a histogram D(G) = (D_1, …, D_b) by partitioning a range [κ_min, κ_max] into b bins; on unweighted graphs every κ(e) lies in [−2, 1], though a narrower range is often sufficient in practice:

D_i(G) = \frac{|\{e : \kappa(e) \text{ lies in bin } i\}|}{|E(G)|}

Alternate representations include storing the sorted κ(e) values or using higher-dimensional histograms of edge pairs for improved expressiveness.

2.2. Gaussian RBF Kernel

Given two graphs G and H with curvature histograms D(G), D(H) ∈ ℝ^b, similarity is quantified by the Gaussian RBF kernel:

K(G,H) = \exp\Bigl( -\frac{1}{2\sigma^2} \sum_{i=1}^{b} \bigl(D_i(G) - D_i(H)\bigr)^2 \Bigr)

where σ > 0 is a bandwidth parameter. This kernel is positive definite and directly compatible with support vector machine frameworks.
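The kernel itself is a one-liner over the histogram vectors; the following sketch mirrors the formula above (function and parameter names are illustrative):

```python
from math import exp

def rbf_kernel(D_G, D_H, sigma=1.0):
    """Gaussian RBF kernel K(G, H) between two curvature histograms,
    both of length b, with bandwidth sigma > 0."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(D_G, D_H))
    return exp(-sq_dist / (2.0 * sigma ** 2))

# Identical histograms give similarity 1; disjoint ones decay with sigma.
k_same = rbf_kernel([1.0, 0.0], [1.0, 0.0])
k_diff = rbf_kernel([1.0, 0.0], [0.0, 1.0])
```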

3. Computational and Scaling Considerations

3.1. Exact Algorithmic Workflow

For every edge e = (u,v) in G:

  • Construct m_u and m_v over N(u) and N(v).
  • Solve the optimal transport linear program for W(u,v), with complexity O(n³) for n = |N(u)| + |N(v)|.
  • Compute κ(u,v) and incorporate it into D(G).

The full graph computation thus scales as O(|E(G)| · n³) in time and O(|E(G)|) in memory.

3.2. Random Sampling for Scalability

For large graphs, instead of exhaustive calculation, uniformly sample S = O((1/ε²) log(1/δ)) edges. The empirical histogram D̂(G) over sampled edges achieves

\|\widehat{D}(G) - D(G)\|_\infty \leq \epsilon

with probability at least 1 − δ. The necessary S is independent of |E(G)|, facilitating analysis of graphs with billions of edges in O(S · n³) time.
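A sketch of the sampling step, assuming the constant hidden in the O-notation is 1 (an assumption of this sketch, not a bound from the paper; real deployments would tune it):

```python
import random
from math import ceil, log

def sample_size(eps, delta):
    """S = ceil((1 / eps^2) * ln(1 / delta)); the constant factor
    hidden by the O-notation is taken to be 1 here (an assumption)."""
    return ceil(log(1.0 / delta) / eps ** 2)

def sampled_curvatures(edges, curvature_fn, eps=0.1, delta=0.01, seed=0):
    """Curvatures of a uniform random edge sample of size S,
    capped at |E(G)| for small graphs."""
    rng = random.Random(seed)
    s = min(sample_size(eps, delta), len(edges))
    return [curvature_fn(e) for e in rng.sample(edges, s)]

# Hypothetical edge list with a dummy curvature function.
edges = [(i, i + 1) for i in range(1000)]
vals = sampled_curvatures(edges, lambda e: 0.0)
```

Note that S depends only on ε and δ, which is exactly why the sample size stays fixed as |E(G)| grows.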

3.3. Kernel Matrix Computation

Once the histograms D(G_i) are available for a set of N graphs, the full kernel matrix [K(G_i, G_j)] is computed in O(N² · b) operations, efficient for N up to several thousand.
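Assembling the Gram matrix is then two nested loops over histograms; a minimal sketch (names illustrative), whose output could be passed to an SVM that accepts precomputed kernels:

```python
from math import exp

def gram_matrix(histograms, sigma=1.0):
    """N x N Gaussian RBF kernel matrix over curvature histograms,
    computed in O(N^2 * b) operations for histograms of length b."""
    def k(d1, d2):
        sq = sum((a - b) ** 2 for a, b in zip(d1, d2))
        return exp(-sq / (2.0 * sigma ** 2))
    return [[k(di, dj) for dj in histograms] for di in histograms]

# Three toy histograms; the result is symmetric with a unit diagonal.
hists = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
K = gram_matrix(hists)
```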

4. Empirical Evaluation

4.1. Datasets

Experiments utilize:

  • Synthetic graphs (Erdős–Rényi, Barabási–Albert generators)
  • Bioinformatics benchmarks (MUTAG, PTC, etc.)
  • Internet AS-level topologies

4.2. Baseline Methods

The curvature-based kernel is compared against common kernels based on random walks, shortest-path distances, subtree patterns, and simple cycles.

4.3. Performance Metrics

Empirical evaluation focuses on:

  • Classification accuracy (SVMs trained on kernel matrices)
  • Running time and ability to process large numbers of edges

4.4. Results and Observations

  • Comparable or superior classification accuracy is observed on unlabeled benchmarks.
  • The distribution of curvature values distinguishes graph structural features, specifically high prevalence of negative curvature in bridge-dominated structures versus positive curvature in densely interconnected regions.
  • Random sampling permits scaling to graphs with millions of edges in minutes, where traditional methods typically become computationally infeasible.
  • Enriching the feature set with 2D histograms of curvature on adjacent edge pairs further improves discriminative power on datasets such as MUTAG.

5. Conclusions and Limitations

Summary of Approach

  • Ollivier–Ricci curvature provides a compact, purely topological signature for graph structure.
  • The curvature kernel is positive definite by construction, compatible with standard kernel methods.
  • Sampling theory provides guarantees for approximating the true curvature distribution efficiently.

Identified Limitations

  • The method does not utilize node or edge labels; only topology is encoded.
  • Statistical stability of curvature distributions deteriorates for very small graphs.
  • The run time of the exact EMD calculation is prohibitive for high-degree neighborhoods; reliance on sampling mitigates this.
  • Kernel parameters (bin count b, range [κ_min, κ_max], and RBF bandwidth σ) require tuning.

A plausible implication is that this methodology is best suited for unlabeled, medium-to-large graphs where structural topology alone is discriminative (Liu et al., 2019).
