
Double-Radius Node Labeling (DRNL)

Updated 12 January 2026
  • DRNL is a distance-based, permutation-equivariant node labeling scheme that assigns integer labels based on dual shortest-path distances in an h-hop subgraph.
  • It allows GNNs to capture joint structural representations for link prediction, overcoming the limitations of single-node aggregation with a closed-form labeling function.
  • Empirical evaluations show that DRNL-based models outperform graph auto-encoder baselines, often by 10–15 AUC percentage points, across benchmark datasets.

Double-Radius Node Labeling (DRNL) is a permutation-equivariant, distance-based node labeling scheme originally designed to enhance the expressiveness of graph neural networks (GNNs) for multi-node representation learning, specifically in link prediction tasks. DRNL assigns integer labels to each node in a subgraph centered on a target node pair, encoding each node's relative position with respect to both endpoints. This approach enables a GNN to learn joint structural representations of node sets, bypassing the limitations of single-node aggregation schemes.

1. Labeling Trick Framework and DRNL's Role

The “labeling trick” formalism provides a permutation-equivariant mechanism for distinguishing target nodes from others in multi-node tasks. For any node set $S \subseteq V$, a labeling function $\ell$ must satisfy:

  • Target-nodes-distinguishing: Any permutation that preserves the label assignments must map the target set $S'$ onto $S$.
  • Permutation equivariance: Under re-indexing (i.e., node permutation), the label assignments reorder accordingly.

DRNL is a concrete instantiation for two-node sets (i.e., link prediction scenarios) (Zhang et al., 2020, Wang et al., 2023). It labels nodes in the $h$-hop enclosing subgraph around each target link so that any automorphism preserving the link endpoints preserves the label structure. This labeling is instrumental in allowing a GNN to recover a most-expressive (structural) representation for links, as formalized in expressiveness theorems.

2. DRNL Label Assignment: Formal Definition

DRNL first extracts the $h$-hop enclosing subgraph $G_{(u,v,h)}$, induced on the nodes within shortest-path distance $h$ of either endpoint of the candidate link $(u,v)$. For each node $x$ in this subgraph, DRNL assigns an integer label computed from shortest-path distances:

  • $d_u(x) = \mathrm{dist}(x, u)$, computed in the graph with $v$ removed.
  • $d_v(x) = \mathrm{dist}(x, v)$, computed in the graph with $u$ removed.
  • $d(x) = d_u(x) + d_v(x)$; $m(x) = \min(d_u(x), d_v(x))$.

The closed-form labeling function is:

$y(x) = 1 + m(x) + \dfrac{d(x)\,(d(x)+1)}{2}$
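
For example, a node $x$ with $(d_u(x), d_v(x)) = (1, 2)$ has $d(x) = 3$ and $m(x) = 1$, so $y(x) = 1 + 1 + \frac{3 \cdot 4}{2} = 8$; the mirrored pair $(2, 1)$ receives the same label.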

Endpoints $u$ and $v$ receive label $1$. The function injectively enumerates the $(d_u, d_v)$ tuples, ordered first by increasing $d$ and then by the lesser coordinate, ensuring that topologically equivalent positions yield the same label. An alternative closed form used in SEAL (Wang et al., 2023) is:

$\ell_{\mathrm{DRNL}}(v) = 1 + \min(d_x, d_y) + \left\lfloor \frac{d}{2} \right\rfloor \left( \left\lfloor \frac{d}{2} \right\rfloor + (d \bmod 2) - 1 \right)$

where $d_x$ and $d_y$ are the masked distances to the two endpoints ($d_u$ and $d_v$ above) and $d = d_x + d_y$.
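
Both closed forms can be implemented directly from a $(d_u, d_v)$ pair. The following minimal Python sketch is an illustration only; the function names are ours, not from the cited papers:

```python
def drnl_label(d_u, d_v):
    """Closed-form DRNL label from the two masked shortest-path distances.

    Implements y(x) = 1 + m(x) + d(x)(d(x)+1)/2 with d = d_u + d_v and
    m = min(d_u, d_v). Unreachable nodes (infinite distance) get label 0.
    """
    if d_u == float("inf") or d_v == float("inf"):
        return 0
    d, m = d_u + d_v, min(d_u, d_v)
    return 1 + m + d * (d + 1) // 2


def drnl_label_seal(d_u, d_v):
    """SEAL-style variant: 1 + min(d_u, d_v) + (d//2) * (d//2 + d%2 - 1)."""
    if d_u == float("inf") or d_v == float("inf"):
        return 0
    d = d_u + d_v
    return 1 + min(d_u, d_v) + (d // 2) * (d // 2 + d % 2 - 1)


# Both variants assign mirrored distance pairs the same label:
assert drnl_label(1, 2) == drnl_label(2, 1) == 8
assert drnl_label_seal(1, 2) == drnl_label_seal(2, 1) == 3
```

The two variants enumerate the unordered $(d_u, d_v)$ pairs in different orders, so the resulting integers differ (the second packs labels densely), but both are injective on unordered distance pairs.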

3. Subgraph Extraction and Distance Computation

For each link, the $h$-hop subgraph is induced. Distance computations use the “masking trick”: to obtain $d_u(x)$, $v$ (and its incident edges) is temporarily deleted; for $d_v(x)$, $u$ is removed. A breadth-first search (BFS) from each endpoint then yields the required distances for every node in the subgraph. This masking prevents shortcut paths through the other endpoint, ensuring a precise topological encoding.

If either $d_u(x)$ or $d_v(x)$ is infinite (i.e., the node is unreachable from one endpoint once the other is masked out), $y(x)$ is set to $0$. No tie-breaking is required, owing to the injectivity of the labeling function. DRNL requires two BFS traversals per link subgraph, giving time complexity $O(|V_H| + |E_H|)$ per candidate link, where $H$ denotes the enclosing subgraph.
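
A compact end-to-end sketch of this extraction-and-labeling pipeline, assuming the `drnl_label` helper above and using `networkx` (a library choice of ours; SEAL's reference implementation differs):

```python
import networkx as nx

def drnl_node_labels(G, u, v, h=2):
    """Extract the h-hop enclosing subgraph of link (u, v) and DRNL-label it.

    Returns (H, labels), where labels maps each node of the enclosing
    subgraph H to its integer DRNL label.
    """
    # Union of the h-hop neighborhoods of both endpoints.
    nodes = set(nx.single_source_shortest_path_length(G, u, cutoff=h))
    nodes |= set(nx.single_source_shortest_path_length(G, v, cutoff=h))
    H = G.subgraph(nodes).copy()

    # Masking trick: BFS from u with v deleted, and from v with u deleted,
    # so distances cannot take a shortcut through the other endpoint.
    dist_u = nx.single_source_shortest_path_length(H.subgraph(H.nodes - {v}), u)
    dist_v = nx.single_source_shortest_path_length(H.subgraph(H.nodes - {u}), v)

    labels = {}
    for x in H.nodes:
        if x in (u, v):
            labels[x] = 1  # endpoints receive label 1 by convention
        else:
            d_u = dist_u.get(x, float("inf"))  # inf if unreachable after masking
            d_v = dist_v.get(x, float("inf"))
            labels[x] = drnl_label(d_u, d_v)
    return H, labels


# Example usage on a small built-in graph:
G = nx.karate_club_graph()
H, labels = drnl_node_labels(G, 0, 33, h=1)
assert labels[0] == labels[33] == 1
```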

4. Integration with GNN Workflows

After DRNL labeling, each node’s feature vector is its original attribute vector concatenated with a learnable embedding (or one-hot encoding) of the integer label $y(x)$. GNN message passing then proceeds on the labeled subgraph, typically for $l > h$ layers, to fully assimilate the enclosing structure. Downstream, the node representations of endpoints $u$ and $v$ are aggregated (e.g., by Hadamard product, or concatenation followed by an MLP) to yield the link representation. Subgraph-level readouts (such as SortPooling) can further supply neighborhood context.
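
A minimal sketch of this integration in PyTorch with PyTorch Geometric's `GCNConv`; the class name, dimensions, and defaults are illustrative assumptions, not prescriptions from the papers:

```python
import torch
import torch.nn as nn
from torch_geometric.nn import GCNConv

class SEALLinkPredictor(nn.Module):
    """GNN over a DRNL-labeled enclosing subgraph, scoring one candidate link."""

    def __init__(self, in_dim, hidden_dim=32, max_label=200, num_layers=3):
        super().__init__()
        self.max_label = max_label
        # Learnable embedding of the integer DRNL label y(x).
        self.label_emb = nn.Embedding(max_label + 1, hidden_dim)
        dims = [in_dim + hidden_dim] + [hidden_dim] * num_layers
        self.convs = nn.ModuleList(
            GCNConv(d_in, d_out) for d_in, d_out in zip(dims[:-1], dims[1:])
        )
        self.mlp = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, 1)
        )

    def forward(self, x, edge_index, drnl, uv):
        # Node features: original attributes concatenated with label embedding.
        h = torch.cat([x, self.label_emb(drnl.clamp_max(self.max_label))], dim=-1)
        for conv in self.convs:  # num_layers chosen so that l > h, per the text
            h = conv(h, edge_index).relu()
        # Aggregate the two endpoint representations (Hadamard product -> MLP).
        return self.mlp(h[uv[0]] * h[uv[1]]).squeeze(-1)
```

Here `drnl` is the per-node integer label vector and `uv` holds the subgraph indices of the two endpoints; with $h = 2$, the default of three layers satisfies the $l > h$ guideline above.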

5. Expressiveness and Theoretical Guarantees

DRNL satisfies the formal requirements of the labeling trick (target-nodes-distinguishing and permutation equivariance) for two-node sets. Theorem 3.2 of Zhang et al. (2020) and Theorems 4.2 and 5.1 of Wang et al. (2023) rigorously establish that, given a node-most-expressive GNN and an injective aggregator, any link labeled via DRNL can be mapped to its most-expressive structural representation. This ensures that non-isomorphic links are reliably distinguished and that isomorphic links yield identical representations.

Further, even a labeling trick as simple as zero-one labeling empowers 1-WL GNNs to learn topological heuristics (e.g., common neighbors, Adamic-Adar, resource allocation) that are unattainable with vanilla 1-WL GNN architectures, and DRNL refines this scheme with full dual-distance information (Wang et al., 2023).

6. Comparative Performance and Empirical Evaluation

Empirical studies on small and large benchmarks demonstrate robust performance improvements for link prediction:

| Dataset   | GAE (AUC) | SEAL with DRNL (AUC) |
|-----------|-----------|----------------------|
| USAir     | 89.04%    | 97.09%               |
| NS        | 74.10%    | 97.71%               |
| PB        | 90.87%    | 95.01%               |
| Yeast     | 83.04%    | 97.20%               |
| C.elegans | 73.25%    | 86.54%               |

On large-scale OGB datasets:

| Dataset        | Metric   | SEAL with DRNL |
|----------------|----------|----------------|
| ogbl-collab    | Hits@50  | 54.7%          |
| ogbl-citation2 | MRR      | 87.7%          |
| ogbl-ppa       | Hits@100 | 48.8%          |

DRNL-based models outperform graph auto-encoder baselines by significant margins, often 10–15 AUC percentage points, underscoring the impact of node-labeling tricks in expressive link prediction workflows.

7. Advantages, Limitations, and Practical Considerations

Advantages

  • Expressiveness: Theoretical guarantees ensure maximal representational power for link prediction, with DRNL plus a sufficiently powerful GNN discriminating all non-isomorphic links (Zhang et al., 2020).
  • Empirical Success: Consistent outperformance over baseline methods on both small and large benchmarks, matching or surpassing alternatives such as Distance Encoding (Wang et al., 2023).
  • Simplicity: The labeling function is closed-form, requiring only two BFS traversals per enclosing subgraph.

Limitations

  • Computational Overhead: Subgraph extraction and dual BFS per link increase cost on very large or dense graphs.
  • Hyperparameter Sensitivity: The optimal choice of $h$ and $l$ is dataset-dependent, with $h$ often set to $1$ or $2$ and $l$ exceeding $h$ for best results.
  • Graph Density Sensitivity: DRNL may underperform free-node-embedding methods on dense graphs (e.g., ogbl-ddi), possibly because dense neighborhood patterns are difficult to capture inductively.

A plausible implication is that while DRNL offers state-of-the-art performance for structurally heterogeneous graphs, practitioners may need to evaluate trade-offs on extremely large or dense graph instances.
