Double-Radius Node Labeling (DRNL)
- DRNL is a distance-based, permutation-equivariant node labeling scheme that assigns integer labels based on dual shortest-path distances in an h-hop subgraph.
- Using a closed-form labeling function, it allows GNNs to capture joint structural representations for link prediction, overcoming the limitations of single-node aggregation.
- Empirical evaluations show that DRNL-based models outperform baseline methods by substantial margins, often exceeding 10 percentage points of AUC, on various benchmark datasets.
Double-Radius Node Labeling (DRNL) is a permutation-equivariant, distance-based node labeling scheme originally designed to enhance the expressiveness of graph neural networks (GNNs) for multi-node representation learning, specifically in link prediction tasks. DRNL assigns integer labels to each node in a subgraph centered on a target node pair, encoding each node's relative position with respect to both endpoints. This approach enables a GNN to learn joint structural representations of node sets, bypassing the limitations of single-node aggregation schemes.
1. Labeling Trick Framework and DRNL's Role
The “labeling trick” formalism provides a permutation-equivariant mechanism for distinguishing target nodes from all others in multi-node tasks. For any target node set $S$, a valid labeling function must satisfy:
- Target-nodes-distinguishing: Any permutation of the nodes that preserves the label assignments must map the target set $S$ onto itself.
- Permutation equivariance: Under re-indexing (i.e., node permutation), the label assignments reorder accordingly.
DRNL is a concrete instantiation for two-node sets (i.e., link prediction scenarios) (Zhang et al., 2020, Wang et al., 2023). It labels nodes in the $h$-hop enclosing subgraph around each target link so that any automorphism preserving the link endpoints preserves the label structure. This labeling is instrumental in allowing a GNN to recover a most-expressive (structural) representation for links, as formalized in the expressiveness theorems below.
2. DRNL Label Assignment: Formal Definition
DRNL first extracts the $h$-hop enclosing subgraph formed by nodes within shortest-path distance $h$ of either endpoint of the candidate link $(x, y)$. For each node $i$ in this subgraph, DRNL assigns an integer label $f(i)$ computed from shortest-path distances:
- $d_x = d(i, x)$, computed in the graph with $y$ removed.
- $d_y = d(i, y)$, computed in the graph with $x$ removed.
- $d = d_x + d_y$; $\lfloor d/2 \rfloor$ and $(d \bmod 2)$ denote the integer quotient and remainder of $d$ divided by $2$.
The closed-form labeling function is:

$$f(i) = 1 + \min(d_x, d_y) + \lfloor d/2 \rfloor \left[ \lfloor d/2 \rfloor + (d \bmod 2) - 1 \right]$$
Endpoints $x$ and $y$ receive label $1$. The function injectively enumerates distance pairs $(d_x, d_y)$, ordering first by the sum $d$ and then by the lesser coordinate $\min(d_x, d_y)$, so that topologically equivalent positions receive the same label. SEAL implementations (Wang et al., 2023) also support alternative labelings such as Distance Encoding, which keeps the (truncated) distance pair $(d_x, d_y)$ directly rather than collapsing it into a single integer, where $d_x$ and $d_y$ are as above.
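The closed-form label can be computed independently for every node once the two masked distances are known. Below is a minimal sketch of this computation in Python; the function name `drnl_label` and the convention of passing `float('inf')` for unreachable nodes are illustrative choices, not part of any reference implementation.

```python
import math

def drnl_label(dx: float, dy: float) -> int:
    """DRNL label of node i from its masked shortest-path distances to endpoints x and y."""
    if dx == 0 or dy == 0:                 # node i is one of the endpoints x, y
        return 1
    if math.isinf(dx) or math.isinf(dy):   # unreachable after masking -> label 0
        return 0
    d = int(dx) + int(dy)
    return 1 + min(int(dx), int(dy)) + (d // 2) * (d // 2 + d % 2 - 1)
```

For example, `drnl_label(1, 1)` returns $2$ (a common neighbor of $x$ and $y$) and `drnl_label(1, 2)` returns $3$, matching the injective enumeration described above.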
3. Subgraph Extraction and Distance Computation
For each link $(x, y)$, the $h$-hop enclosing subgraph is induced. Distance computations utilize the “masking trick”: to obtain $d_x = d(i, x)$, node $y$ (and its incident edges) is temporarily deleted; for $d_y = d(i, y)$, node $x$ is removed. Breadth-first search (BFS) runs from each endpoint yield the required distances for every node in the subgraph. This masking prevents shortcut paths through the other endpoint, so each label reflects a node's position relative to each endpoint independently.
If either $d_x$ or $d_y$ is infinite (i.e., the node cannot reach that endpoint once the other is masked), the label $f(i)$ is set to $0$. No tie-breaking is required because the labeling function is injective on finite distance pairs. DRNL requires two BFS traversals per link subgraph, giving $O(|V_{\mathrm{sub}}| + |E_{\mathrm{sub}}|)$ time per candidate link, where $V_{\mathrm{sub}}$ and $E_{\mathrm{sub}}$ denote the node and edge sets of the enclosing subgraph.
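A sketch of the extraction and masked-BFS step is given below, using networkx for illustration; the helper name `drnl_labels_for_link` is hypothetical, and the function assumes the `drnl_label` sketch above.

```python
import networkx as nx

def drnl_labels_for_link(G: nx.Graph, x, y, h: int = 2) -> dict:
    """Extract the h-hop enclosing subgraph of candidate link (x, y) and label its nodes."""
    # Union of the h-hop neighborhoods of both endpoints.
    nodes = set(nx.single_source_shortest_path_length(G, x, cutoff=h)) | \
            set(nx.single_source_shortest_path_length(G, y, cutoff=h))
    sub = G.subgraph(nodes)

    # Masking trick: distances to x are computed with y deleted, and vice versa,
    # so that no shortest path can shortcut through the other endpoint.
    dist_to_x = nx.single_source_shortest_path_length(sub.subgraph(nodes - {y}), x)
    dist_to_y = nx.single_source_shortest_path_length(sub.subgraph(nodes - {x}), y)

    inf = float("inf")
    return {i: drnl_label(dist_to_x.get(i, inf), dist_to_y.get(i, inf)) for i in nodes}
```

Each call performs two BFS traversals over the enclosing subgraph, matching the per-link complexity stated above.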
4. Integration with GNN Workflows
After DRNL labeling, each node's feature consists of its original attribute vector concatenated with a learnable embedding (or one-hot encoding) of the integer label $f(i)$. GNN message passing proceeds on the labeled subgraph, typically for several layers so that the enclosing structure is fully assimilated. Downstream, the node representations of endpoints $x$ and $y$ are aggregated (e.g., by Hadamard product, or concatenation followed by an MLP) to yield the link representation. Subgraph-level readouts (such as SortPooling) may further incorporate neighborhood context.
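A hedged sketch of this feature-construction and endpoint-aggregation step in PyTorch follows; the embedding width, the label cap, and the class name `DRNLFeatureBuilder` are illustrative choices rather than fixed parts of SEAL.

```python
import torch
import torch.nn as nn

class DRNLFeatureBuilder(nn.Module):
    """Concatenate raw node attributes with a learnable embedding of the DRNL label."""

    def __init__(self, label_emb_dim: int = 32, max_label: int = 100):
        super().__init__()
        self.max_label = max_label
        self.label_emb = nn.Embedding(max_label + 1, label_emb_dim)

    def forward(self, node_attrs: torch.Tensor, drnl_labels: torch.Tensor) -> torch.Tensor:
        # Rare, very large labels are clamped into one bucket to bound the embedding table.
        z = drnl_labels.clamp(max=self.max_label)
        return torch.cat([node_attrs, self.label_emb(z)], dim=-1)

def link_representation(h_x: torch.Tensor, h_y: torch.Tensor) -> torch.Tensor:
    """Aggregate the two endpoint representations, here by Hadamard product."""
    return h_x * h_y
```

The concatenated features are fed to the GNN's message-passing layers, and `link_representation` (or an equivalent concatenation-plus-MLP readout) produces the input to the final link score.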
5. Expressiveness and Theoretical Guarantees
DRNL satisfies the formal requirements of the labeling trick—target-nodes-distinguishing and permutation equivariance—for two-node sets. Theorem 3.2 from (Zhang et al., 2020) and Theorem 4.2/5.1 from (Wang et al., 2023) rigorously establish that, given a node-most-expressive GNN and injective aggregator, any link labeled via DRNL can be mapped to its most-expressive structural representation. This ensures that non-isomorphic links are reliably distinguished and that isomorphic links yield identical representations.
Further, such labeling tricks (DRNL, and even the minimal zero-one labeling) enable simple 1-WL GNNs to learn topological heuristics (e.g., common neighbors, Adamic-Adar, resource allocation) that are unattainable with vanilla 1-WL GNN architectures (Wang et al., 2023).
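As a concrete illustration in the DRNL case, a common neighbor $i$ of $x$ and $y$ has $d_x = d_y = 1$ under the masking trick, so

$$f(i) = 1 + \min(1, 1) + \lfloor 2/2 \rfloor \left[ \lfloor 2/2 \rfloor + (2 \bmod 2) - 1 \right] = 1 + 1 + 1 \cdot (1 + 0 - 1) = 2,$$

and label $2$ is assigned to exactly the common neighbors; counting label-$2$ nodes in the enclosing subgraph therefore recovers the common-neighbors heuristic.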
6. Comparative Performance and Empirical Evaluation
Empirical studies on small and large benchmarks demonstrate robust performance improvements for link prediction:
| Dataset | GAE (AUC) | SEAL with DRNL (AUC) |
|---|---|---|
| USAir | 89.04% | 97.09% |
| NS | 74.10% | 97.71% |
| PB | 90.87% | 95.01% |
| Yeast | 83.04% | 97.20% |
| C.elegans | 73.25% | 86.54% |
On large-scale OGB datasets:
| Dataset | Metric | SEAL (DRNL) |
|---|---|---|
| ogbl-collab | Hits@50 | 54.7% |
| ogbl-citation2 | MRR | 87.7% |
| ogbl-ppa | Hits@100 | 48.8% |
DRNL-based models outperform graph auto-encoder baselines by substantial margins, frequently exceeding 10 percentage points of AUC, underscoring the necessity and impact of node-labeling tricks in expressive link prediction workflows.
7. Advantages, Limitations, and Practical Considerations
Advantages
- Expressiveness: Theoretical guarantees ensure maximal representational power for link prediction, with DRNL plus a sufficiently powerful GNN discriminating all non-isomorphic links (Zhang et al., 2020).
- Empirical Success: Consistent outperformance over baseline methods on both small and large benchmarks, matching or surpassing alternatives such as Distance Encoding (Wang et al., 2023).
- Simplicity: The labeling function is closed-form, requiring only two BFS traversals (one from each endpoint) per enclosing subgraph.
Limitations
- Computational Overhead: Subgraph extraction and dual BFS per link increase cost on very large or dense graphs.
- Hyperparameter Sensitivity: The optimal hop number $h$ and number of GNN layers are dataset-dependent; $h$ is often set to $1$ or $2$, with the number of message-passing layers typically exceeding $h$ for best results.
- Graph Density Sensitivity: DRNL may underperform methods based on free (learnable) node embeddings on dense graphs (e.g., ogbl-ddi), possibly because dense neighborhood patterns are difficult to model inductively.
A plausible implication is that while DRNL offers state-of-the-art performance for structurally heterogeneous graphs, practitioners may need to evaluate trade-offs on extremely large or dense graph instances.