NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification (2306.08385v1)

Published 14 Jun 2023 in cs.LG and cs.AI

Abstract: Graph neural networks have been extensively studied for learning with inter-connected data. Despite this, recent evidence has revealed GNNs' deficiencies related to over-squashing, heterophily, handling long-range dependencies, edge incompleteness and particularly, the absence of graphs altogether. While a plausible solution is to learn new adaptive topology for message passing, issues concerning quadratic complexity hinder simultaneous guarantees for scalability and precision in large networks. In this paper, we introduce a novel all-pair message passing scheme for efficiently propagating node signals between arbitrary nodes, as an important building block for a pioneering Transformer-style network for node classification on large graphs, dubbed NodeFormer. Specifically, the efficient computation is enabled by a kernelized Gumbel-Softmax operator that reduces the algorithmic complexity to linearity w.r.t. node numbers for learning latent graph structures from large, potentially fully-connected graphs in a differentiable manner. We also provide accompanying theory as justification for our design. Extensive experiments demonstrate the promising efficacy of the method in various tasks including node classification on graphs (with up to 2M nodes) and graph-enhanced applications (e.g., image classification) where input graphs are missing.

Authors (5)
  1. Qitian Wu (29 papers)
  2. Wentao Zhao (20 papers)
  3. Zenan Li (22 papers)
  4. David Wipf (59 papers)
  5. Junchi Yan (241 papers)
Citations (169)

Summary

NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification

The paper "NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification" introduces an innovative architecture designed to address fundamental challenges in Graph Neural Networks (GNNs), particularly concerning scalability and efficiency in handling large graphs. The document provides a detailed exposition of NodeFormer, a framework that enhances node classification by learning latent graph structures beyond explicit input topology.

Key Contributions

NodeFormer is built on the observation that traditional GNNs face several limitations, including over-squashing, difficulties with heterophily, long-range dependencies, and edge incompleteness. These issues are exacerbated on large graphs, where the network structure may be incomplete or missing altogether. To mitigate them, the paper proposes a new all-pair message passing paradigm that incorporates a kernelized Gumbel-Softmax operator. This operator reduces the complexity of learning graph structures from quadratic to linear in the number of nodes, which allows NodeFormer to scale efficiently to datasets with millions of nodes.
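
To make the complexity claim concrete, the sketch below shows the random-feature factorization (in the spirit of Performer-style kernelized attention) that this kind of reduction relies on: approximating the softmax kernel with positive random features lets all-pair aggregation be computed as two small matrix products, never materializing the N x N attention matrix. The feature map and function names are illustrative assumptions, not the authors' released implementation.

```python
import torch

def positive_random_features(x, projection, eps=1e-6):
    # Positive random features approximating the exp/softmax kernel
    # (Performer-style); `projection` is a fixed random Gaussian matrix [d, m].
    x_proj = x @ projection                            # [N, m]
    sq_norm = (x ** 2).sum(dim=-1, keepdim=True) / 2   # [N, 1]
    return torch.exp(x_proj - sq_norm) + eps           # [N, m], strictly positive

def linear_attention(q, k, v, projection):
    # Approximate softmax attention without the N x N similarity matrix:
    # out ~= phi(Q) (phi(K)^T V) / (phi(Q) phi(K)^T 1), costing O(N * m * d).
    q_prime = positive_random_features(q, projection)  # [N, m]
    k_prime = positive_random_features(k, projection)  # [N, m]
    kv = k_prime.t() @ v                               # [m, d]
    numerator = q_prime @ kv                           # [N, d]
    denominator = q_prime @ k_prime.sum(dim=0, keepdim=True).t()  # [N, 1]
    return numerator / denominator
```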

Technical Insights

NodeFormer uses an efficient mechanism for latent structure learning which involves:

  • Kernelized Gumbel-Softmax Operator: This operator combines positive random features with an approximate sampling strategy, enabling differentiable optimization and efficient message passing by avoiding explicit computation of the all-pair similarity matrix (a simplified layer sketch follows this list).
  • Layer-wise Message Passing: Instead of relying on a fixed graph structure across all layers, NodeFormer learns latent graphs independently for each layer.
  • Relational Bias and Edge-Level Regularization: When an input graph is available, NodeFormer incorporates a relational bias that reinforces weights on observed edges and applies edge-level regularization to remain robust to input graph incompleteness.
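
To illustrate how these pieces fit together, the sketch below combines them in one simplified layer: kernelized all-pair attention (reusing the `linear_attention` helper from the earlier sketch), additive Gumbel noise as a crude stand-in for the paper's kernelized Gumbel-Softmax sampling, and a learnable relational bias over observed edges. All class, method, and parameter names are hypothetical, and the edge-level regularization term is omitted for brevity.

```python
import torch
import torch.nn as nn

class NodeFormerLayerSketch(nn.Module):
    """Simplified sketch of one NodeFormer-style layer (not the released code)."""

    def __init__(self, dim, num_features=64, tau=0.25):
        super().__init__()
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)
        # Fixed random projection used by the positive random features.
        self.register_buffer("rf_proj", torch.randn(dim, num_features) / dim ** 0.5)
        # Scalar relational bias reinforcing messages along observed edges.
        self.rel_bias = nn.Parameter(torch.tensor(1.0))
        self.tau = tau  # temperature of the relaxed structure sampling

    def forward(self, x, edge_index=None):
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        if self.training:
            # Gumbel noise on the keys: a crude stand-in for the paper's
            # kernelized Gumbel-Softmax, keeping structure sampling differentiable.
            u = torch.rand_like(k).clamp_min(1e-9)
            k = k - self.tau * torch.log(-torch.log(u))
        # Linear-complexity all-pair aggregation (helper defined in the earlier sketch).
        out = linear_attention(q / self.tau, k / self.tau, v, self.rf_proj)
        if edge_index is not None:
            # Relational bias: add an extra message along each observed edge.
            src, dst = edge_index
            agg = torch.zeros_like(out).index_add_(0, dst, v[src])
            out = out + torch.sigmoid(self.rel_bias) * agg
        return out
```

In the full model, one such latent structure would be learned independently at every layer, matching the layer-wise message passing described above, and an edge-level regularization loss would be added when an input graph is available.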

Empirical Results

The paper reports extensive evaluations across various datasets, including node classification benchmarks and image/text classification tasks. NodeFormer performs strongly under both homophily and heterophily, outperforming strong GNN baselines and state-of-the-art structure learning approaches. Notably, NodeFormer scales to graphs with up to 2 million nodes, with reported reductions of up to 93.1% in running time and 80.6% in memory consumption compared to previous structure learning methods.

Implications and Future Directions

The introduction of NodeFormer implies substantial improvements for practical AI systems that operate on graph-structured data. It can be particularly beneficial in applications requiring robust node representations over large-scale networks, such as social and biological domains. The paper also opens avenues for applying the NodeFormer architecture to other graph-related tasks such as link prediction and graph regression. The model's scalability and efficiency make it a promising candidate for integration into diverse scientific and industrial applications.

This paper pushes the boundaries of GNN scalability, presenting a model that could redefine approaches to handling vast, intricately connected data systems.