Invariant Graph Transformer (2312.07859v2)

Published 13 Dec 2023 in cs.LG and cs.SI

Abstract: Rationale discovery is defined as finding a subset of the input data that maximally supports the prediction of downstream tasks. In the graph machine learning context, the graph rationale is defined as the critical subgraph of the given graph topology that fundamentally determines the prediction result. In contrast to the rationale subgraph, the remaining subgraph is called the environment subgraph. Graph rationalization can enhance model performance because the mapping between the graph rationale and the prediction label is, by assumption, invariant. To ensure the discriminative power of the extracted rationale subgraphs, a key technique named "intervention" is applied. The core idea of intervention is that, given any changing environment subgraph, the semantics of the rationale subgraph remain invariant, which guarantees the correct prediction result. However, most, if not all, existing rationalization works on graph data develop their intervention strategies at the graph level, which is coarse-grained. In this paper, we propose well-tailored intervention strategies for graph data. Our idea is driven by the development of Transformer models, whose self-attention module provides rich interactions between input nodes. Based on the self-attention module, our proposed Invariant Graph Transformer (IGT) achieves fine-grained intervention, specifically at the node level and virtual node level. Our comprehensive experiments involve 7 real-world datasets, and the proposed IGT shows significant performance advantages over 13 baseline methods.
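
To make the intervention idea concrete, one common way to formalize the invariance assumption (a generic formulation, not necessarily the exact objective used in the paper) is a task loss plus a penalty on how much the prediction drifts when the environment subgraph is swapped:

$$
\mathcal{L} \;=\; \mathcal{L}_{\mathrm{task}}\big(f(R \cup E),\, y\big) \;+\; \lambda\, \mathbb{E}_{E' \sim \mathcal{E}}\Big[\, d\big(f(R \cup E'),\, f(R \cup E)\big) \Big]
$$

where $R$ is the rationale subgraph, $E$ its original environment subgraph, $E'$ an environment borrowed from another graph, $f$ the predictor, $d$ a divergence between predictions, and $\lambda$ a trade-off weight. Driving the second term toward zero is what "invariance under intervention" means operationally.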

Summary

  • The paper introduces IGT, a novel approach for fine-grained graph rationalization using node- and virtual node-level interventions through Transformer self-attention.
  • It employs an encoder, augmenter, intervener, and predictor in a synergistic architecture to identify informative subgraphs while ensuring robustness across varying environments.
  • Experiments on 7 real-world datasets show that IGT consistently outperforms or matches 13 baseline methods, demonstrating enhanced predictive accuracy and interpretability.

Introduction to Invariant Graph Transformer

Graphs are an immensely useful data structure, widely used to model relationships and interactions in fields such as chemistry, social networks, and biology. A critical task in graph machine learning is to identify the substructures within a graph, termed "graph rationales", that are most informative for a given prediction task. Graph rationales can enhance model performance and improve explainability by capturing the most relevant features within a complex network.

Methodology

The paper introduces the Invariant Graph Transformer (IGT), a novel architecture aimed at fine-grained graph rationalization. Unlike existing methods that intervene at the graph level, IGT operates at the more precise node level or virtual node level, leveraging the self-attention mechanism of Transformer models. IGT consists of four modules, an encoder, an augmenter, an intervener, and a predictor, that work together to discover and exploit the pivotal rationale subgraph while keeping its predictions robust under varying environment subgraphs; a minimal sketch of how such a pipeline could be wired together follows.
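
The sketch below shows one way the four modules could fit together in PyTorch. The class and variable names, the soft gating used by the augmenter, the mean readout, and all tensor shapes are illustrative assumptions made for this example; it is not the authors' implementation.

```python
# Minimal PyTorch sketch of an encoder/augmenter/intervener/predictor pipeline
# in the spirit of IGT. All design details here are illustrative assumptions.
import torch
import torch.nn as nn

class IGTSketch(nn.Module):
    def __init__(self, in_dim, hid_dim, num_classes, num_heads=4):
        super().__init__()
        # Encoder: a linear layer stands in for the GNN/Transformer node encoder.
        self.encoder = nn.Linear(in_dim, hid_dim)
        # Augmenter: scores each node, softly splitting the graph into
        # rationale and environment parts.
        self.augmenter = nn.Sequential(nn.Linear(hid_dim, 1), nn.Sigmoid())
        # Intervener: self-attention over rationale plus (possibly swapped-in)
        # environment nodes, i.e. fine-grained, node-level intervention.
        self.intervener = nn.MultiheadAttention(hid_dim, num_heads, batch_first=True)
        # Predictor: graph-level readout followed by a classifier.
        self.predictor = nn.Linear(hid_dim, num_classes)

    def forward(self, x, env_x=None):
        # x: (batch, num_nodes, in_dim) node features of the input graphs
        h = self.encoder(x)                       # (B, N, H)
        score = self.augmenter(h)                 # (B, N, 1) rationale scores
        rationale = score * h                     # soft rationale embeddings
        environment = (1.0 - score) * h           # soft environment embeddings
        if env_x is not None:
            # Intervention: borrow the environment from another graph.
            env_h = self.encoder(env_x)
            environment = (1.0 - self.augmenter(env_h)) * env_h
        tokens = torch.cat([rationale, environment], dim=1)   # (B, 2N, H)
        mixed, _ = self.intervener(tokens, tokens, tokens)    # node-level interaction
        graph_repr = mixed.mean(dim=1)                        # simple mean readout
        return self.predictor(graph_repr)

# Usage: predictions with the original environment vs. a swapped environment
# should stay close if the rationale truly carries the label information.
model = IGTSketch(in_dim=16, hid_dim=32, num_classes=2)
g = torch.randn(8, 20, 16)                 # a batch of 8 graphs, 20 nodes each
g_other = g[torch.randperm(8)]             # environments borrowed from other graphs
y_orig, y_intervened = model(g), model(g, env_x=g_other)
```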

Experimental Results

IGT was rigorously evaluated on 7 real-world datasets against 13 baseline methods. The experiments demonstrate that both the node-level (IGT-N) and virtual node-level (IGT-VN) variants consistently outperform or match the competing methods, indicating that fine-grained intervention, combined with invariant learning, is highly effective for graph rationalization; the sketch below illustrates what distinguishes the two variants.
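
As a rough illustration of the distinction between the two variants, the snippet below contrasts node-level intervention tokens with an environment summarized into a handful of virtual nodes. The soft-assignment pooling and all shapes are hypothetical choices for this example and may differ from the paper's exact construction.

```python
# IGT-N style: every environment node participates in the intervention.
# IGT-VN style: the environment is first compressed into a few virtual nodes.
import torch
import torch.nn as nn

def node_level_tokens(rationale, environment):
    # Node-level variant: concatenate all rationale and environment nodes.
    return torch.cat([rationale, environment], dim=1)           # (B, N_r + N_e, H)

class VirtualNodePool(nn.Module):
    def __init__(self, hid_dim, num_virtual=4):
        super().__init__()
        self.assign = nn.Linear(hid_dim, num_virtual)            # soft assignment

    def forward(self, environment):
        # Summarize environment nodes into num_virtual virtual nodes.
        weights = self.assign(environment).softmax(dim=1)        # (B, N_e, V)
        return weights.transpose(1, 2) @ environment             # (B, V, H)

rationale = torch.randn(8, 12, 32)
environment = torch.randn(8, 20, 32)
tokens_n = node_level_tokens(rationale, environment)                          # (8, 32, 32)
tokens_vn = torch.cat([rationale, VirtualNodePool(32)(environment)], dim=1)   # (8, 16, 32)
```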

Conclusion

The research offers a new perspective on the graph rationale discovery problem, proposing a Transformer-inspired model that intervenes at a granular level. IGT not only identifies crucial subgraphs more effectively than coarse-grained approaches but also preserves their utility under varying conditions, resulting in strong performance. The paper's findings lay the groundwork for future research on optimizing graph learning models for both predictive accuracy and interpretability.
