Curvature-Based Graph Kernels
- Curvature-based graph kernels are techniques that compare graph structures using discrete Ollivier–Ricci curvature derived from optimal transport on local neighborhoods.
- They summarize edge curvatures into histograms and employ Gaussian RBF kernels, enabling effective classification of unlabeled graphs.
- Random sampling and optimized transport methods ensure scalability while achieving competitive accuracy on benchmarks like MUTAG and PTC.
Curvature-based graph kernels provide a topology-driven measure for graph comparison and classification by leveraging distributions of discrete Ricci curvature over edges. This approach, as introduced by Liu et al. (2020), employs the Ollivier–Ricci curvature, computed via optimal transport on local neighborhoods, to generate global vector representations of graphs suitable for use with kernel methods such as support vector machines (SVMs). The signature is purely topological, making it effective in settings without node or edge attributes, and it is particularly competitive on benchmarks of unlabeled graphs (Liu et al., 2019).
1. Mathematical Foundations
Let $G = (V, E)$ denote a graph, possibly weighted or unweighted. For each edge $(u, v) \in E$, the construction is predicated on comparing the neighborhoods of $u$ and $v$ using optimal transport.
1.1. Neighborhood Mass Distributions
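For a node $u$, define its 1-hop neighborhood as $N(u) = \{w \in V : (u, w) \in E\}$. Construct a probability distribution $m_u$ over $N(u)$, commonly uniform:
$$m_u(w) = \begin{cases} \dfrac{1}{|N(u)|} & \text{if } w \in N(u), \\ 0 & \text{otherwise.} \end{cases}$$
For weighted graphs, $m_u(w)$ can be proportional to the weight of the edge $(u, w)$.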
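The two choices of neighborhood distribution can be sketched in a few lines of Python. This is a minimal illustration, not code from the source: the adjacency-dictionary layout and the function name `neighborhood_measure` are assumptions made for the example.

```python
def neighborhood_measure(adj, u, weighted=False):
    """Return the distribution over the 1-hop neighborhood of u as {neighbor: mass}.

    adj is assumed to map each node to a dict {neighbor: edge_weight}.
    """
    nbrs = adj[u]
    if not nbrs:
        raise ValueError(f"node {u!r} has no neighbors")
    if weighted:
        # Mass proportional to the weight of the incident edge.
        total = sum(nbrs.values())
        return {w: wt / total for w, wt in nbrs.items()}
    # Uniform mass over the neighbors.
    return {w: 1.0 / len(nbrs) for w in nbrs}


# Hypothetical toy graph: a path a - u - v whose edge (u, v) is three times heavier.
adj = {
    "a": {"u": 1.0},
    "u": {"a": 1.0, "v": 3.0},
    "v": {"u": 3.0},
}
uniform = neighborhood_measure(adj, "u")
by_weight = neighborhood_measure(adj, "u", weighted=True)
```

In this toy graph the uniform choice splits the mass of `u` evenly between `a` and `v`, while the weighted choice moves three quarters of it onto `v`.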
1.2. Earth Mover's Distance (EMD)
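The distance between $m_u$ and $m_v$ is the Earth Mover's Distance (Wasserstein-1):
$$W_1(m_u, m_v) = \min_{\pi \ge 0} \sum_{x \in N(u)} \sum_{y \in N(v)} \pi(x, y)\, d(x, y)$$
subject to
$$\sum_{y \in N(v)} \pi(x, y) = m_u(x) \quad \text{for all } x \in N(u), \qquad \sum_{x \in N(u)} \pi(x, y) = m_v(y) \quad \text{for all } y \in N(v),$$
where $d(x, y)$ is the graph shortest-path metric.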
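The transport problem above is a small linear program, so it can be handed to a generic LP solver. The sketch below assumes NumPy and SciPy are available and uses `scipy.optimize.linprog` with the HiGHS backend; the function name `wasserstein1` and the flattened row-major layout of the transport plan are choices made for this example, not something the source prescribes.

```python
import numpy as np
from scipy.optimize import linprog


def wasserstein1(mu, nu, cost):
    """Minimize sum_xy pi[x, y] * cost[x, y] subject to the two marginal constraints."""
    mu = np.asarray(mu, dtype=float)
    nu = np.asarray(nu, dtype=float)
    cost = np.asarray(cost, dtype=float)
    n, m = cost.shape
    # The plan pi is flattened row-major, so pi[x, y] sits at index x * m + y.
    row_sums = np.kron(np.eye(n), np.ones((1, m)))  # sum over y of pi[x, y] = mu[x]
    col_sums = np.kron(np.ones((1, n)), np.eye(m))  # sum over x of pi[x, y] = nu[y]
    res = linprog(
        cost.ravel(),
        A_eq=np.vstack([row_sums, col_sums]),
        b_eq=np.concatenate([mu, nu]),
        bounds=(0, None),
        method="highs",
    )
    if not res.success:
        raise RuntimeError(res.message)
    return float(res.fun)


# Edge (u, v) of a triangle {u, v, w}: source support [v, w], target support [u, w].
cost = [[1.0, 1.0],   # d(v, u), d(v, w)
        [1.0, 0.0]]   # d(w, u), d(w, w)
w1 = wasserstein1([0.5, 0.5], [0.5, 0.5], cost)
```

Here the shared neighbor `w` keeps its mass in place at zero cost and the remaining half unit travels one hop, so the distance is one half.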
1.3. Ollivier–Ricci Curvature
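The edge curvature is defined as
$$\kappa(u, v) = 1 - \frac{W_1(m_u, m_v)}{d(u, v)}.$$
For unweighted graphs, $d(u, v) = 1$ for every edge, giving $\kappa(u, v) = 1 - W_1(m_u, m_v)$. Intuitively:
- $\kappa(u, v) > 0$ indicates well-connected local neighborhoods.
- $\kappa(u, v) < 0$ reveals edges acting as bridges between poorly connected regions.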
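As a worked illustration (constructed for this section, not taken from the source), consider two small unweighted graphs under the uniform neighborhood distributions of Section 1.1. In a triangle the endpoints of an edge share a neighbor, and the curvature is positive; in a "double star", where the edge joins two hubs whose other neighbors are leaves, the curvature is negative.

```latex
% Triangle on {u, v, w}: N(u) = {v, w}, N(v) = {u, w}.
% Optimal plan: w stays on w (cost 0), v moves to u (cost 1).
\begin{align*}
W_1(m_u, m_v) &= \tfrac12\, d(w, w) + \tfrac12\, d(v, u) = \tfrac12, \\
\kappa(u, v)  &= 1 - \tfrac12 = \tfrac12 > 0.
\end{align*}

% Double star: N(u) = {a_1, a_2, v}, N(v) = {b_1, b_2, u}, all a_i and b_j leaves.
% Optimal plan: a_1 -> u (cost 1), v -> b_1 (cost 1), a_2 -> b_2 (cost 3).
\begin{align*}
W_1(m_u, m_v) &= \tfrac13 \bigl( d(a_1, u) + d(v, b_1) + d(a_2, b_2) \bigr)
               = \tfrac13 (1 + 1 + 3) = \tfrac53, \\
\kappa(u, v)  &= 1 - \tfrac53 = -\tfrac23 < 0.
\end{align*}
```

The second value is optimal because the 1-Lipschitz potential taking the values 2, 1, 0, -1 on the leaves of $u$, on $u$, on $v$, and on the leaves of $v$ certifies the matching lower bound of $5/3$ through Kantorovich duality.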
2. Kernel Construction Methodology
2.1. Curvature Distribution Vector
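The set $\{\kappa(e) : e \in E\}$ forms the multiset of edge curvatures for $G$. This is summarized as a histogram $h_G$ by partitioning the range $[a, b]$ into $B$ bins $I_1, \dots, I_B$:
$$h_G(i) = \frac{1}{|E|}\, \bigl|\{e \in E : \kappa(e) \in I_i\}\bigr|, \qquad i = 1, \dots, B.$$
Alternative representations include storing the sorted curvature values $\kappa(e)$ or using higher-dimensional histograms of edge pairs for improved expressiveness.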
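The one-dimensional histogram can be sketched as follows. The function name, the equal-width binning, the clamping of out-of-range values, and the default range of -2 to 1 are assumptions for illustration; that range is the one attainable on unweighted graphs under the definition above, since two neighborhoods of adjacent nodes are never more than three hops apart.

```python
def curvature_histogram(kappas, bins=10, lo=-2.0, hi=1.0):
    """Normalized histogram of edge curvatures over [lo, hi] with equal-width bins."""
    if not kappas:
        raise ValueError("need at least one curvature value")
    counts = [0] * bins
    width = (hi - lo) / bins
    for k in kappas:
        idx = int((k - lo) / width)
        # Clamp so that k == hi (and any out-of-range value) lands in an end bin.
        idx = min(max(idx, 0), bins - 1)
        counts[idx] += 1
    # Dividing by the number of edges makes graphs of different sizes comparable.
    return [c / len(kappas) for c in counts]


# Hypothetical curvature values: three edges at 1/2 and one edge at -2/3.
hist = curvature_histogram([0.5, 0.5, 0.5, -2.0 / 3.0], bins=3, lo=-2.0, hi=1.0)
```

With three unit-width bins, the negative edge falls in the middle bin and the three positive edges in the last one.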
2.2. Gaussian RBF Kernel
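Given two graphs $G$ and $G'$ with curvature histograms $h_G$ and $h_{G'}$, similarity is quantified by the Gaussian RBF kernel:
$$k(G, G') = \exp\!\left( -\frac{\| h_G - h_{G'} \|_2^2}{2 \sigma^2} \right),$$
where $\sigma > 0$ is a bandwidth parameter. This kernel is positive definite and directly compatible with support vector machine frameworks.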
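A direct transcription of the kernel between two histograms is short. The sketch assumes the common convention of dividing the squared Euclidean distance by twice the squared bandwidth; other parametrizations differ only by a rescaling of the bandwidth. The function name and the example histograms are illustrative.

```python
import math


def rbf_kernel(h1, h2, sigma=1.0):
    """Gaussian RBF similarity between two histograms of equal length."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    sq_dist = sum((a - b) ** 2 for a, b in zip(h1, h2))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


# Two hypothetical 3-bin histograms that swap mass between the last two bins.
h_a = [0.0, 0.25, 0.75]
h_b = [0.0, 0.75, 0.25]
k_ab = rbf_kernel(h_a, h_b, sigma=0.5)   # squared distance 0.5, so exp(-1)
k_aa = rbf_kernel(h_a, h_a, sigma=0.5)   # identical histograms give 1.0
```

A smaller bandwidth makes the kernel more sensitive to small shifts of mass between bins, which is one reason the bandwidth is usually selected by cross-validation.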
3. Computational and Scaling Considerations
3.1. Exact Algorithmic Workflow
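For every edge $(u, v)$ in $E$:
- Construct $m_u$ and $m_v$ over $N(u)$, $N(v)$.
- Solve the optimal transport linear program for $W_1(m_u, m_v)$; its cost is polynomial in the neighborhood size $d = \max(|N(u)|, |N(v)|)$.
- Compute $\kappa(u, v)$ and incorporate it into $h_G$.
The full graph computation thus scales as $|E|$ times the per-edge transport cost in time, and requires a $d \times d$ cost matrix per edge in memory.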
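The per-edge loop can be sketched end to end for unweighted graphs with uniform neighborhood distributions. This is an illustrative sketch, not the authors' implementation: it assumes NumPy and SciPy, represents the graph as a dict of neighbor sets, and runs a plain breadth-first search from every neighbor, which is simple rather than fast. All function names are hypothetical.

```python
from collections import deque

import numpy as np
from scipy.optimize import linprog


def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def edge_curvature(adj, u, v):
    """Curvature of edge (u, v): one minus the transport cost between neighborhoods."""
    src, dst = list(adj[u]), list(adj[v])
    n, m = len(src), len(dst)
    # Step 1: uniform masses on the two neighborhoods.
    b_eq = np.concatenate([np.full(n, 1.0 / n), np.full(m, 1.0 / m)])
    # Cost matrix of shortest-path distances, one BFS per source neighbor.
    cost = np.empty((n, m))
    for i, x in enumerate(src):
        dist = bfs_distances(adj, x)
        cost[i] = [dist[y] for y in dst]
    # Step 2: transport LP over the flattened plan (row-major).
    a_eq = np.vstack([
        np.kron(np.eye(n), np.ones((1, m))),   # row marginals
        np.kron(np.ones((1, n)), np.eye(m)),   # column marginals
    ])
    res = linprog(cost.ravel(), A_eq=a_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    # Step 3: the edge has length 1 in an unweighted graph.
    return 1.0 - float(res.fun)


def all_edge_curvatures(adj):
    """Visit every undirected edge once and return {frozenset({u, v}): curvature}."""
    curv = {}
    for u in adj:
        for v in adj[u]:
            key = frozenset((u, v))
            if key not in curv:
                curv[key] = edge_curvature(adj, u, v)
    return curv


# Hypothetical double star: hubs 0 and 1 joined by an edge, each with two leaves.
adj = {0: {1, 2, 3}, 1: {0, 4, 5}, 2: {0}, 3: {0}, 4: {1}, 5: {1}}
curv = all_edge_curvatures(adj)
```

On this toy graph the hub-to-hub edge comes out negative while the leaf edges come out at zero, which matches the bridge intuition of Section 1.3. In practice the breadth-first search could be truncated at depth three, since no pair of neighbors of adjacent nodes is farther apart than that.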
3.2. Random Sampling for Scalability
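For large graphs, instead of exhaustive calculation, uniformly sample $s$ edges. The empirical histogram $\hat{h}_G$ over the sampled edges achieves
$$\| \hat{h}_G - h_G \|_\infty \le \varepsilon$$
with probability at least $1 - \delta$. The necessary $s$ depends on $\varepsilon$ and $\delta$ but is independent of $|E|$, facilitating analysis of graphs with billions of edges in time proportional to $s$.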
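One way to see why the sample size need not grow with the number of edges is a standard Hoeffding bound combined with a union bound over the bins. The sketch below uses that bound as an assumption; the source's own constant may differ. Edges are drawn uniformly with replacement, and `curvature_fn` is a hypothetical callback standing in for the per-edge transport computation.

```python
import math
import random


def sample_size(eps, delta, bins):
    """Samples so that every bin frequency is within eps with probability >= 1 - delta.

    Assumes i.i.d. uniform edge samples: Hoeffding's inequality per bin,
    then a union bound over all bins.
    """
    return math.ceil(math.log(2.0 * bins / delta) / (2.0 * eps ** 2))


def sampled_histogram(edges, curvature_fn, s, bins, lo, hi, seed=None):
    """Normalized curvature histogram estimated from s uniformly sampled edges."""
    rng = random.Random(seed)
    counts = [0] * bins
    width = (hi - lo) / bins
    for _ in range(s):
        u, v = rng.choice(edges)           # uniform, with replacement
        k = curvature_fn(u, v)             # only sampled edges are ever evaluated
        idx = min(max(int((k - lo) / width), 0), bins - 1)
        counts[idx] += 1
    return [c / s for c in counts]


# Hypothetical usage with a lookup table in place of a real curvature computation.
toy_curvature = {(0, 1): -2.0 / 3.0, (0, 2): 0.0, (0, 3): 0.0, (1, 4): 0.0, (1, 5): 0.0}
s = sample_size(eps=0.1, delta=0.05, bins=10)
hist = sampled_histogram(list(toy_curvature), lambda u, v: toy_curvature[(u, v)],
                         s=s, bins=3, lo=-2.0, hi=1.0, seed=0)
```

The edge count never enters `sample_size`; only the tolerance, the failure probability, and the number of bins do.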
3.3. Kernel Matrix Computation
Once the histograms are available for a set of $n$ graphs, the full $n \times n$ kernel matrix is computed in $O(n^2 B)$ operations, where $B$ is the number of histogram bins; this is efficient for $n$ up to several thousand.
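The full kernel matrix can be assembled in vectorized form from the stacked histograms. The sketch assumes NumPy and the convention of dividing the squared distance by twice the squared bandwidth; the function name and example data are illustrative.

```python
import numpy as np


def kernel_matrix(hists, sigma=1.0):
    """Gaussian RBF Gram matrix for n histograms stacked as an (n, bins) array."""
    h = np.asarray(hists, dtype=float)
    sq_norms = np.sum(h * h, axis=1)
    # Pairwise squared distances via ||a||^2 + ||b||^2 - 2 <a, b>.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (h @ h.T)
    # Round-off can leave tiny negative entries; clip them to zero.
    np.maximum(sq_dists, 0.0, out=sq_dists)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


# Three hypothetical 3-bin curvature histograms.
hists = [
    [0.0, 0.25, 0.75],
    [0.0, 0.75, 0.25],
    [0.5, 0.50, 0.00],
]
K = kernel_matrix(hists, sigma=0.5)
```

A matrix of this form can be passed to any SVM implementation that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`.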
4. Empirical Evaluation
4.1. Datasets
Experiments utilize:
- Synthetic graphs (Erdős–Rényi, Barabási–Albert generators)
- Bioinformatics benchmarks (MUTAG, PTC, etc.)
- Internet AS-level topologies
4.2. Baseline Methods
The curvature-based kernel is compared against common kernels based on random walks, shortest-path distances, subtree patterns, and simple cycles.
4.3. Performance Metrics
Empirical evaluation focuses on:
- Classification accuracy (SVMs trained on kernel matrices)
- Running time and ability to process large numbers of edges
4.4. Results and Observations
- Comparable or superior classification accuracy is observed on unlabeled benchmarks.
- The distribution of curvature values distinguishes graph structural features: negative curvature is prevalent in bridge-dominated structures, whereas positive curvature is prevalent in densely interconnected regions.
- Random sampling permits scaling to graphs with millions of edges in minutes, where traditional methods typically become computationally infeasible.
- Enriching the feature set with 2D histograms of curvature on adjacent edge pairs further improves discriminative power on datasets such as MUTAG.
5. Conclusions and Limitations
Summary of Approach
- Ollivier–Ricci curvature provides a compact, purely topological signature for graph structure.
- The curvature kernel is positive definite by construction, compatible with standard kernel methods.
- Sampling theory provides guarantees for approximating the true curvature distribution efficiently.
Identified Limitations
- The method does not utilize node or edge labels; only topology is encoded.
- Statistical stability of curvature distributions deteriorates for very small graphs.
- The run time of the exact EMD calculation is prohibitive for high-degree neighborhoods; reliance on sampling mitigates this.
- Kernel parameters (bin count $B$, range $[a, b]$, and RBF bandwidth $\sigma$) require tuning.
A plausible implication is that this methodology is best suited for unlabeled, medium-to-large graphs where structural topology alone is discriminative (Liu et al., 2019).