Self-supervised Riemannian GNNs
- SelfRGNN is a graph neural network that embeds nodes in Riemannian manifolds with dynamic, learnable curvature, effectively capturing diverse graph structures.
- It integrates temporal and structural encoding with modules for curvature estimation and Riemannian message passing to enhance geometric expressivity.
- Self-supervised objectives and contrastive mechanisms optimize node representations without external labels, improving performance on evolving or heterogeneous graphs.
A self-supervised Riemannian Graph Neural Network (SelfRGNN) is a class of GNN architectures designed to learn node representations on graphs by embedding them in Riemannian manifolds, particularly with learnable, possibly time-varying or heterogeneous curvature, and optimizing objectives without requiring external labels. These models generalize classical Euclidean GNNs and fixed-curvature hyperbolic/spherical GNNs, addressing limitations in geometric expressivity and data efficiency, especially for temporal or structurally diverse graphs. Several instantiations exist under this paradigm, including temporal curvature dynamics (Sun et al., 2022), mixed-curvature product spaces (Sun et al., 2021), motif-aware generative-contrastive approaches (Sun et al., 2024), and Ricci curvature–driven co-refinement techniques (Sun et al., 2024).
1. Riemannian Manifolds and Curvature in Graph Embedding
SelfRGNNs employ Riemannian manifolds as embedding spaces, parametrized by a curvature $\kappa$. Specific instantiations include constant-curvature spaces (hyperbolic: $\kappa < 0$, Euclidean: $\kappa = 0$, spherical: $\kappa > 0$) and, in some frameworks, product manifolds whose factors have different curvatures, used to model heterogeneous graph regions. Embedding nodes in these manifolds captures hierarchical, cyclical, or bottleneck patterns, which manifest as negative, positive, or mixed curvature at different scales or epochs.
Crucially, recent models allow the curvature to be learned and adapted per time step (temporal graphs) (Sun et al., 2022), per component (mixed-curvature) (Sun et al., 2021), per motif (Sun et al., 2024), or per edge/region (heterogeneous curvature) (Sun et al., 2024). This flexibility is essential for representing nonuniform or evolving graph geometries, in contrast to earlier models constrained to a single, static curvature $\kappa$.
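To make the role of the curvature sign concrete, here is a minimal sketch of origin-based exponential and logarithm maps that work for negative, zero, and positive curvature with one formula. It assumes the κ-stereographic model of constant-curvature spaces, which is one common choice and not necessarily the model used in the cited papers; the function names (`tan_k`, `artan_k`, `expmap0`, `logmap0`) are illustrative.

```python
import math

def tan_k(x, k):
    """Curvature-dependent tangent: tanh-based for k < 0, identity for k = 0, tan-based for k > 0."""
    if k < 0:
        s = math.sqrt(-k)
        return math.tanh(s * x) / s
    if k > 0:
        s = math.sqrt(k)
        return math.tan(s * x) / s  # requires s * x < pi / 2
    return x

def artan_k(x, k):
    """Inverse of tan_k on its domain."""
    if k < 0:
        s = math.sqrt(-k)
        return math.atanh(s * x) / s
    if k > 0:
        s = math.sqrt(k)
        return math.atan(s * x) / s
    return x

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def expmap0(v, k):
    """Map a tangent vector at the origin onto the manifold of curvature k."""
    n = norm(v)
    if n == 0.0:
        return list(v)
    scale = tan_k(n, k) / n
    return [scale * c for c in v]

def logmap0(y, k):
    """Map a manifold point back to the tangent space at the origin."""
    n = norm(y)
    if n == 0.0:
        return list(y)
    scale = artan_k(n, k) / n
    return [scale * c for c in y]

# The same tangent vector lands at a different point for each curvature sign.
v = [0.3, 0.4]
points = {k: expmap0(v, k) for k in (-1.0, 0.0, 1.0)}
```

Because the curvature enters only through `tan_k` and `artan_k`, a learnable or time-varying κ can be passed in as an ordinary argument, which is the property the adaptive-curvature models above rely on.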
2. Architectural Components of Self-supervised Riemannian GNNs
A prototypical SelfRGNN processes a graph as follows (Sun et al., 2022, Sun et al., 2021, Sun et al., 2024, Sun et al., 2024):
- Time or Structural Encoding: Time points or structural factors are encoded using translation-invariant schemes, often leveraging random Fourier features for temporal graphs (e.g., using sinusoidal bases) (Sun et al., 2022) or gyrovector Fourier features for product-manifold factors (Sun et al., 2024, Sun et al., 2024).
- Curvature Module: A neural network (e.g., "CurNN" or similar) maps encodings to functional curvature(s) (Sun et al., 2022), providing curvature values per time or graph region.
- Graph Convolution in Riemannian Spaces: Riemannian graph convolutions generalize Euclidean message passing to curved spaces using three sets of operations: exponential/logarithm maps for moving between the tangent space and the manifold, Riemannian attention for weighted aggregation, and gyrovector operations for addition and transformation (Sun et al., 2022, Sun et al., 2021).
- Numerical Stability: Certain variants replace unstable operations with gyrovector kernel mappings (e.g., using random Fourier–style features in each manifold factor, maintaining isometry invariance and preventing numerical blow-up for large curvature magnitude $|\kappa|$) (Sun et al., 2024, Sun et al., 2024).
- Feature and Structure Co-Refinement: Some architectures, such as DeepRicci, couple feature learning with structure learning via differentiable, Ricci curvature–aware updates and backward Ricci flow to alleviate over-squashing (Sun et al., 2024).
| Layer/Module | Core Mechanism | Curvature Adaptation |
|---|---|---|
| Time/Structural Encoder | Random Fourier, gyrovector, or Euclidean features | Per-time/factor |
| Curvature Module | MLP/Bilinear maps, GRUs | Time-varying or local |
| Riemannian GNN Layer | Manifold-aware conv/attention | Contextual per layer/factor |
| Kernel Mapping (optional) | Fourier/gyrovector kernels | Numerically stable |
| Ricci/Curvature Co-Refinement | Differentiable Ricci estimation + Ricci flow | Heterogeneous (per edge) |
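The first two stages of this pipeline can be sketched as follows: a random Fourier time encoding whose inner product depends only on the time difference (translation invariance), followed by a toy curvature module. This is an illustration under stated assumptions, not the architecture of any cited paper: the Gaussian frequency distribution, the single linear map standing in for the curvature network, and all names (`make_frequencies`, `time_encoding`, `curvature_from_time`) are hypothetical.

```python
import math
import random

def make_frequencies(d, seed=0):
    """Sample d random frequencies (assumed Gaussian here) for the Fourier basis."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(d)]

def time_encoding(t, freqs):
    """Encode a timestamp as scaled cos/sin features of dimension 2 * len(freqs)."""
    scale = math.sqrt(1.0 / len(freqs))
    return ([scale * math.cos(w * t) for w in freqs]
            + [scale * math.sin(w * t) for w in freqs])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def curvature_from_time(t, freqs, weights, bias=0.0):
    """Toy curvature module: a single linear map from the encoding to a scalar.
    A real model would use a trained network; the output sign is unconstrained,
    so the curvature may be negative, zero, or positive."""
    return dot(weights, time_encoding(t, freqs)) + bias

freqs = make_frequencies(8)
weights = [0.1] * 16  # placeholder values, not trained parameters

# Translation invariance: both pairs of timestamps are 2.5 apart,
# so their encodings have the same inner product.
k_near = dot(time_encoding(1.0, freqs), time_encoding(3.5, freqs))
k_far = dot(time_encoding(11.0, freqs), time_encoding(13.5, freqs))

kappa_t = curvature_from_time(2.0, freqs, weights)
```

The invariance follows from cos(a)cos(b) + sin(a)sin(b) = cos(a - b): each frequency contributes a term that depends only on the gap between the two timestamps.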
3. Self-supervised Objectives and Contrastive Mechanisms
SelfRGNN frameworks eschew label supervision in favor of contrastive and generative objectives tailored to the Riemannian context:
- Self-Contrastive Learning: Embeddings from different time points or manifold curvatures are projected (via exp/log maps or Lorentz projections) into shared spaces, treating alternate curvatures or timestamps as positive pairs and reweighting negatives based on geometric similarity (Sun et al., 2022, Sun et al., 2022). Adversarial motif-aware contrastive losses regularize learning to emphasize hard positives/negatives (Sun et al., 2024).
- Dual/Hierarchical Contrast: Mixed-curvature models contrast representations across the canonical hyperbolic, spherical, and Euclidean "views," utilizing Riemannian discriminators and projectors (Sun et al., 2021).
- Ricci-based Curvature Regularization: Edge-based Ricci curvature (often Ollivier–Ricci) is computed in the embedding geometry, then empirically regularized to agree with the functional/global curvature(s) learned by the model—via sequence modeling (e.g., GRUs) and explicit curvature losses (Sun et al., 2022, Sun et al., 2024).
- Motif-aware Generative Games: MotifRGC introduces a GAN-like generator-discriminator min-max game in Riemannian product space, generating fake motifs and regularizing both the manifold and node embeddings to respect observed motif structure (Sun et al., 2024).
In all these cases, the fundamental aim is to structure the embedding space (and the learned curvature) so as to maximize mutual information, structural motif consistency, or agreement with Ricci curvature—all without access to labels.
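As a rough illustration of the contrastive idea, the sketch below computes an InfoNCE-style loss for one anchor, one positive, and a set of negatives. It assumes that all embeddings have already been projected into a shared tangent space and that similarity is the negative squared Euclidean distance there; both choices, the temperature value, and the function names are assumptions for illustration, and the cited models use their own Riemannian similarity functions and negative reweighting.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors in the shared space."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def info_nce(anchor, positive, negatives, temperature=0.5):
    """InfoNCE-style loss: -log( exp(s_pos) / sum_j exp(s_j) ),
    with s = -squared distance / temperature (assumed similarity)."""
    pos = -sq_dist(anchor, positive) / temperature
    logits = [pos] + [-sq_dist(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denom - pos

anchor = [0.0, 0.0]
# A positive close to the anchor with distant negatives gives a small loss.
loss_good = info_nce(anchor, [0.1, 0.0], [[2.0, 0.0], [0.0, 2.0]])
# A distant positive with a nearby negative gives a large loss.
loss_bad = info_nce(anchor, [2.0, 0.0], [[0.1, 0.0], [0.0, 2.0]])
```

In the setting described above, the anchor and positive would be the same node embedded under two curvatures or two timestamps, and the negatives would be other nodes.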
4. Riemannian Tools: Manifold Operations and Product Manifolds
SelfRGNNs extend classical GNN operations to Riemannian settings through a unified notation for:
- Exponential and Logarithm Maps: These move points between the manifold and its tangent spaces, at the origin or at arbitrary points, allowing consistent transformations when mixing representations of different curvatures or time steps (Sun et al., 2022, Sun et al., 2022).
- Distance and Attention Calculations: Riemannian/geodesic distances characterize similarity, attention, and negative sampling, with manifold-aware kernels ensuring invariance (Sun et al., 2022, Sun et al., 2024).
- Product Manifolds: Multi-factor mixed-curvature product manifolds are constructed by Cartesian products of constant-curvature spaces; the squared product metric is the sum of per-factor squared distances (Sun et al., 2021, Sun et al., 2024, Sun et al., 2024).
- Numerically Stable Kernelization: Random gyrovector Fourier features are used to build isometry-invariant kernel representations of manifold points, eliminating the need for unstable exponential/logarithm-map operations and facilitating stable message passing (Sun et al., 2024, Sun et al., 2024).
- Backward Ricci Flow: By iteratively modifying the edge structure based on learned Ricci curvatures, bottlenecks are widened, mitigating over-squashing (Sun et al., 2024).
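The product-manifold distance described above can be sketched in a few lines. The example assumes each factor is a κ-stereographic space with Möbius (gyrovector) addition, which is one standard way to treat hyperbolic, Euclidean, and spherical factors uniformly; the cited papers may use a different model, and the function names are illustrative.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def artan_k(x, k):
    """Curvature-dependent inverse tangent (atanh for k < 0, atan for k > 0)."""
    if k < 0:
        s = math.sqrt(-k)
        return math.atanh(s * x) / s
    if k > 0:
        s = math.sqrt(k)
        return math.atan(s * x) / s
    return x

def mobius_add(x, y, k):
    """Moebius addition in the k-stereographic model; reduces to x + y when k = 0."""
    xy, x2, y2 = dot(x, y), dot(x, x), dot(y, y)
    coef_x = 1 - 2 * k * xy - k * y2
    coef_y = 1 + k * x2
    denom = 1 - 2 * k * xy + k * k * x2 * y2
    return [(coef_x * a + coef_y * b) / denom for a, b in zip(x, y)]

def dist_k(x, y, k):
    """Geodesic distance in one factor. With this model's conformal factor,
    the k = 0 case equals twice the ordinary Euclidean distance."""
    diff = mobius_add([-a for a in x], y, k)
    return 2 * artan_k(math.sqrt(dot(diff, diff)), k)

def product_dist(xs, ys, ks):
    """Product metric: the squared distance is the sum of per-factor squared distances."""
    return math.sqrt(sum(dist_k(x, y, k) ** 2 for x, y, k in zip(xs, ys, ks)))

# One hyperbolic, one Euclidean, and one spherical factor, each two-dimensional.
ks = [-1.0, 0.0, 1.0]
xs = [[0.1, 0.2], [0.3, -0.1], [0.5, 0.5]]
ys = [[-0.2, 0.1], [0.0, 0.4], [0.1, -0.3]]
d_xy = product_dist(xs, ys, ks)
```

Since each factor contributes independently, per-factor curvatures can be learned separately, and a factor whose distance term stays near zero has little influence on the overall geometry.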
5. Empirical Validation and Theoretical Results
Empirical studies across multiple benchmarks (e.g., Wikipedia, MOOC, Cora, Citeseer, Chameleon, Squirrel, Airport) demonstrate that SelfRGNNs outperform both Euclidean and fixed-curvature Riemannian GNNs—frequently by 1–8 percentage points in classification or link prediction AUC (Sun et al., 2022, Sun et al., 2021, Sun et al., 2024, Sun et al., 2024). Key findings include:
- Time-varying curvature or mixed-curvature product spaces consistently outperform single constant-curvature models (Sun et al., 2022, Sun et al., 2021).
- Numerically stable kernel-based architectures prevent the NaN/overflow errors observed in high-curvature approaches based on exponential/logarithm maps (Sun et al., 2024, Sun et al., 2024).
- Ricci-based backward flow demonstrably alleviates over-squashing, with spectral and Cheeger constant analyses confirming increased graph conductance (Sun et al., 2024).
- Theoretical propositions establish closed-form exp/log maps under unified notation, translation invariance of time-encoding kernels, and equivalence of contrastive objectives to InfoNCE in suitable limits (Sun et al., 2022).
- Case studies (e.g., Physics citation network, ogbn-arXiv) reveal that learned curvatures track known temporal or task-driven geometric transitions (e.g., from spherical to hyperbolic) (Sun et al., 2022, Sun et al., 2022).
- Ablations indicate that both geometry-aware curvature adaptation and curvature-driven contrastive mechanisms are necessary for top performance; fixed-curvature or Euclidean variants are consistently outperformed (Sun et al., 2022, Sun et al., 2022, Sun et al., 2024).
6. Relationship to Related Self-supervised and Riemannian GNN Approaches
The SelfRGNN paradigm encompasses, generalizes, or complements several lines of research:
- SelfMGNN (Sun et al., 2021): Introduces mixed-curvature product manifolds and dual-view self-supervised learning, focusing on static graphs but handling diverse geometric regions via hierarchical attention and Riemannian projectors.
- RieGrace (Sun et al., 2022): Targets continual graph learning with adaptive curvature, neural curvature adapters (CurvNet), and label-free Lorentzian distillation in Riemannian space; supports task-sequential graphs.
- MotifRGC (Sun et al., 2024): Combines motif-level generative adversarial learning with motif-aware contrastive objectives in a diverse-curvature product manifold, employing numerically stable gyrovector kernels.
- DeepRicci (Sun et al., 2024): Applies product manifolds and gyrovector mapping to self-supervised graph structure learning, operationalizing differentiable Ricci curvature and backward Ricci flow for structure-feature co-refinement and over-squashing mitigation.
These frameworks share the core principles of (i) learning or adapting curvature in the embedding space; (ii) using Riemannian geometry to better represent graph structure; and (iii) employing self-supervised, typically contrastive, objectives—often enhanced by curvature- or motif-aware regularization. They collectively mark a shift from static, Euclidean or uniformly curved models to a more expressive, geometry-adaptive, and label-efficient regime for graph neural networks.