GRPE: Relative Positional Encoding for Graph Transformer
The paper introduces a novel approach to positional encoding for graph representation learning within the Transformer framework, named Graph Relative Positional Encoding (GRPE). The problem it addresses is how to encode graph structure so that relative positional information is preserved precisely while interactions between nodes and edges, and between nodes and graph topology, are still captured.
Background and Previous Approaches
Traditional Transformers require explicit positional encoding because self-attention is permutation equivariant. In natural language processing and computer vision, absolute positional encoding is straightforward, since each input has a distinct position: a word's index in a sentence or a pixel's coordinates in an image. Graphs pose a greater challenge because nodes have no inherent order or position. Previous methods either linearized the graph to assign absolute node positions or added bias terms to the attention scores to encode relative positions between node pairs. Both have limitations: linearization loses positional precision, and bias-term encodings fail to capture the interactions between node features and topology or edges.
Proposal of GRPE
GRPE circumvents these limitations by directly encoding graph structures without linearization and by incorporating both node-topology and node-edge interactions into the Transformer model. This approach introduces two sets of learnable positional encoding vectors:
- Topology Encoding: This encodes the topological relationships, such as shortest path distances between nodes, into the query, key, and value representations within the Transformer architecture.
- Edge Encoding: This encodes the edges connecting node pairs, so that diverse edge types are reflected in the query, key, and value representations.
By integrating these encodings into both the attention map and values, the model effectively learns complex node relationships and interactions present in graph structures.
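Concretely, the description above suggests an attention computation of roughly the following form. This is a sketch in our own notation, with $\psi(i,j)$ the shortest-path distance and $e(i,j)$ the edge type between nodes $i$ and $j$; the paper's exact formulation may differ in details such as scaling or sharing across heads:

$$a_{ij} = \frac{q_i^\top k_j + q_i^\top p^{Q}_{\psi(i,j)} + k_j^\top p^{K}_{\psi(i,j)} + q_i^\top r^{Q}_{e(i,j)} + k_j^\top r^{K}_{e(i,j)}}{\sqrt{d}}, \qquad z_i = \sum_j \operatorname{softmax}_j(a_{ij})\,\bigl(v_j + p^{V}_{\psi(i,j)} + r^{V}_{e(i,j)}\bigr),$$

where $q_i, k_j, v_j$ are the usual query, key, and value projections of node features, $p^{Q}, p^{K}, p^{V}$ are the learnable topology encodings, and $r^{Q}, r^{K}, r^{V}$ are the learnable edge encodings.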
Methodology
GRPE employs node-aware attention: attention scores are computed from node features together with the topology and edge encodings, so the interaction between a node and its relative position to every other node is modeled explicitly rather than through a feature-independent bias. The same encodings are also added to the value vectors, so graph structure shapes not only how strongly nodes attend to one another but also what information is aggregated. In this way the Transformer treats graphs as graphs rather than as flattened 1D sequences; a minimal implementation sketch follows.
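Below is a minimal single-head PyTorch sketch of how such node-aware attention could be implemented, assuming shortest-path distances and edge-type ids are precomputed. The module and argument names (GraphAttentionSketch, max_dist, num_edge_types) are illustrative assumptions, not the authors' code.

```python
# Sketch of node-aware attention with learnable topology (shortest-path)
# and edge-type encodings injected into both the attention map and the values.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttentionSketch(nn.Module):
    def __init__(self, dim, max_dist=8, num_edge_types=4):
        super().__init__()
        self.dim = dim
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        # One learnable vector per shortest-path-distance bucket and per edge
        # type, separately for the query, key, and value sides.
        self.topo_q = nn.Embedding(max_dist + 1, dim)
        self.topo_k = nn.Embedding(max_dist + 1, dim)
        self.topo_v = nn.Embedding(max_dist + 1, dim)
        self.edge_q = nn.Embedding(num_edge_types, dim)
        self.edge_k = nn.Embedding(num_edge_types, dim)
        self.edge_v = nn.Embedding(num_edge_types, dim)

    def forward(self, x, spd, edge_type):
        # x: (n, dim) node features
        # spd: (n, n) integer shortest-path distances, clipped to max_dist
        # edge_type: (n, n) integer edge-type ids (0 used here for "no edge")
        q, k, v = self.q(x), self.k(x), self.v(x)            # (n, dim) each

        # Node-node term plus node-topology and node-edge interaction terms.
        scores = q @ k.t()                                   # (n, n)
        scores = scores + torch.einsum('id,ijd->ij', q, self.topo_q(spd))
        scores = scores + torch.einsum('jd,ijd->ij', k, self.topo_k(spd))
        scores = scores + torch.einsum('id,ijd->ij', q, self.edge_q(edge_type))
        scores = scores + torch.einsum('jd,ijd->ij', k, self.edge_k(edge_type))
        attn = F.softmax(scores / self.dim ** 0.5, dim=-1)   # (n, n)

        # Inject the same relative information into the aggregated values.
        out = attn @ v
        out = out + torch.einsum('ij,ijd->id', attn, self.topo_v(spd))
        out = out + torch.einsum('ij,ijd->id', attn, self.edge_v(edge_type))
        return out

# Toy usage on a 4-node path graph (hypothetical values).
x = torch.randn(4, 16)
spd = torch.tensor([[0, 1, 2, 3], [1, 0, 1, 2], [2, 1, 0, 1], [3, 2, 1, 0]])
edge_type = (spd == 1).long()          # 1 where an edge exists, 0 otherwise
layer = GraphAttentionSketch(dim=16)
out = layer(x, spd, edge_type)         # (4, 16)
```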
Experimental Results
The authors validate GRPE across multiple graph learning tasks, including graph classification, graph regression, and node classification. GRPE outperforms prior methods, achieving state-of-the-art results on benchmarks such as ZINC, MolHIV, MolPCBA, PATTERN, and CLUSTER, as well as on the large-scale PCQM4Mv2 dataset. These gains support the value of encoding graph structure directly into the Transformer without sacrificing relative positional information.
Implications and Future Directions
GRPE holds significant potential for graph-based learning applications that depend on precise node and edge interactions, such as bioinformatics and social network analysis. It also provides a framework for further work on the scalability and applicability of Transformers to diverse graph problems. Future directions include integrating GRPE into multi-modal models and extending it to more complex graph types, broadening the reach of Transformers in graph representation learning.