On Explainability of Graph Neural Networks via Subgraph Explorations (2102.05152v2)

Published 9 Feb 2021 in cs.LG and cs.AI

Abstract: We consider the problem of explaining the predictions of graph neural networks (GNNs), which otherwise are considered as black boxes. Existing methods invariably focus on explaining the importance of graph nodes or edges but ignore the substructures of graphs, which are more intuitive and human-intelligible. In this work, we propose a novel method, known as SubgraphX, to explain GNNs by identifying important subgraphs. Given a trained GNN model and an input graph, our SubgraphX explains its predictions by efficiently exploring different subgraphs with Monte Carlo tree search. To make the tree search more effective, we propose to use Shapley values as a measure of subgraph importance, which can also capture the interactions among different subgraphs. To expedite computations, we propose efficient approximation schemes to compute Shapley values for graph data. Our work represents the first attempt to explain GNNs via identifying subgraphs explicitly and directly. Experimental results show that our SubgraphX achieves significantly improved explanations, while keeping computations at a reasonable level.

Citations (340)

Summary

  • The paper introduces SubgraphX, which uses Monte Carlo Tree Search to explore subgraph spaces and reveal hidden decision patterns in GNNs.
  • It employs Shapley values for fair importance scoring, ensuring that subgraph contributions accurately reflect their influence on predictions.
  • Experiments show that SubgraphX delivers higher fidelity and interpretability compared to traditional node-edge explanation methods.

Explainability of Graph Neural Networks via SubgraphX: An Analytical Study

The paper "On Explainability of Graph Neural Networks via Subgraph Explorations" by Hao Yuan et al. addresses a significant challenge in the field of Graph Neural Networks (GNNs): their inherent lack of interpretability. As GNNs are increasingly applied across graph-related tasks such as graph classification and node prediction, understanding the logic behind their decisions becomes imperative, especially in domains requiring transparency and trustworthiness.

Summary of the Proposed Methodology

The authors introduce SubgraphX, a novel method that uncovers the decision-making process of GNNs by focusing on graph substructures rather than the traditional emphasis on individual nodes or edges. This distinction is pivotal: subgraphs are argued to be more intuitive and to align more closely with how humans reason about complex networks.

Subgraph Exploration with Monte Carlo Tree Search (MCTS): The framework employs MCTS to systematically explore subgraphs within a target graph. The root of the search tree is the input graph, each tree node represents a candidate subgraph, and edges correspond to node-pruning actions. This structured search lets SubgraphX navigate the combinatorially large space of possible subgraphs, which would otherwise be prohibitive to enumerate.
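A minimal sketch of this search-tree structure in Python may help make the idea concrete. The `MCTSNode` class, its field names, and the plain UCT selection rule below are illustrative assumptions rather than the paper's exact implementation (the paper's selection criterion also incorporates the Shapley-based reward described next):

```python
import math

class MCTSNode:
    """A search-tree node holding one candidate subgraph (a set of graph nodes)."""

    def __init__(self, subgraph, parent=None):
        self.subgraph = frozenset(subgraph)  # the candidate subgraph at this node
        self.parent = parent
        self.children = []        # subgraphs reachable by one pruning action
        self.visits = 0           # visit count N(v)
        self.total_reward = 0.0   # accumulated importance scores W(v)

    def expand(self):
        """Create one child per pruning action: each child drops a single node."""
        for v in self.subgraph:
            smaller = self.subgraph - {v}
            if smaller:
                self.children.append(MCTSNode(smaller, parent=self))

    def select(self, c=1.0):
        """Pick the child maximizing a UCT-style upper confidence bound."""
        return max(
            self.children,
            key=lambda ch: ch.total_reward / (ch.visits + 1e-8)
            + c * math.sqrt(math.log(self.visits + 1) / (ch.visits + 1e-8)),
        )
```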

Shapley Values for Importance Scoring: Importantly, SubgraphX utilizes Shapley values to quantify the importance of these subgraphs. Shapley values consider the contribution of each graph component to the overall prediction, factoring in interactions among different subgraph components. This cooperative game-theoretic approach offers a balanced and fair measure, capturing not just isolated influence but the synergistic effects of components in aggregate.
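A hedged sketch of such an estimate follows, assuming a graph-level classifier with signature `model(x, edge_index)` that returns logits of shape `[1, num_classes]`. The `mask_predict` helper, the zero-masking strategy, and the simple random-coalition sampling are illustrative choices, not the paper's exact sampling scheme:

```python
import random
import torch

def mask_predict(model, x, edge_index, keep, target_class):
    """Zero out the features of all nodes outside `keep` and return the
    model's predicted probability for `target_class` (hypothetical helper)."""
    x_masked = torch.zeros_like(x)
    keep = list(keep)
    if keep:
        x_masked[keep] = x[keep]
    with torch.no_grad():
        logits = model(x_masked, edge_index)
    return torch.softmax(logits, dim=-1)[0, target_class].item()

def shapley_score(model, x, edge_index, subgraph, candidates,
                  target_class, num_samples=100):
    """Monte Carlo estimate of the Shapley value of `subgraph`, treated as a
    single player whose potential coalition partners are `candidates`."""
    others = [v for v in candidates if v not in set(subgraph)]
    total = 0.0
    for _ in range(num_samples):
        coalition = {v for v in others if random.random() < 0.5}  # random coalition
        with_sub = mask_predict(model, x, edge_index,
                                coalition | set(subgraph), target_class)
        without_sub = mask_predict(model, x, edge_index, coalition, target_class)
        total += with_sub - without_sub  # marginal contribution of the subgraph
    return total / num_samples
```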

Computational Efficiency: Recognizing the computational demands of calculating exact Shapley values, the authors propose approximation schemes that limit calculations to a node's local neighborhood, determined by the receptive field of the GNN layers. This refined scope significantly improves efficiency, rendering the method practical for real-world applications.
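As a sketch of this restriction: with a k-layer message-passing GNN, only nodes within k hops of the candidate subgraph can affect its prediction, so coalition sampling can be limited to that neighborhood. The adjacency-list input here is an illustrative assumption:

```python
from collections import deque

def k_hop_neighbors(adj, seeds, k):
    """Return all nodes within k hops of `seeds`, via breadth-first search
    over an adjacency list mapping each node to its neighbors."""
    frontier = deque((s, 0) for s in seeds)
    seen = set(seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue
        for nbr in adj[node]:
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, depth + 1))
    return seen

# Restrict the Shapley estimate above to the GNN's receptive field:
# candidates = k_hop_neighbors(adj, subgraph, num_gnn_layers)
```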

Experimental Validation and Insights

Across a suite of both synthetic and real-world datasets (e.g., MUTAG, BBBP, Graph-SST2), the SubgraphX framework demonstrates superior fidelity and human intelligibility in GNN explanations compared to existing node- and edge-focused methods like GNNExplainer and PGExplainer. These experimental findings underscore SubgraphX's efficacy in providing more concise and meaningful subgraph-level insights.

The results are further quantified with fidelity (the drop in the model's predicted probability when the identified subgraph is occluded) and sparsity (the fraction of the graph excluded from the explanation, so higher sparsity means a more concise explanation). SubgraphX consistently outperforms competing methods, producing explanations that align with known structural motifs and whose occlusion yields significant drops in the model's predicted probability.
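The two metrics are simple to state in code. The sketch below reuses the hypothetical `mask_predict` helper from the Shapley example and again assumes graph-level classification:

```python
def fidelity(model, x, edge_index, important_nodes, target_class):
    """Drop in the predicted probability of `target_class` when the
    explanation's nodes are occluded; a larger drop means a more
    faithful explanation."""
    all_nodes = range(x.size(0))
    important = set(important_nodes)
    full = mask_predict(model, x, edge_index, all_nodes, target_class)
    rest = [v for v in all_nodes if v not in important]
    occluded = mask_predict(model, x, edge_index, rest, target_class)
    return full - occluded

def sparsity(important_nodes, num_nodes):
    """Fraction of the graph excluded from the explanation; higher values
    indicate a more concise explanation."""
    return 1.0 - len(set(important_nodes)) / num_nodes
```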

Implications and Future Directions

The introduction of subgraph-centered explanations marks a significant theoretical advancement, offering a fresh perspective on GNN interpretability. By incorporating subgraph-level insights, practitioners have a more robust tool for analyzing model predictions, particularly in fields such as cheminformatics and social network analysis, where the functional significance of specific graph motifs is well-understood.

Future work may explore several avenues: refining the computational strategies to improve scaling further, integrating domain-specific heuristics to tailor subgraph discovery, and handling dynamic graphs whose structures evolve over time. Additionally, the robustness of SubgraphX's explanations in adversarial settings, where altered graph structures might intentionally mislead model predictions, remains an intriguing area for exploration.

In conclusion, SubgraphX provides a foundational step towards deciphering the black-box nature of GNNs through the lens of graph substructures, aiming for explanations that are not only quantitatively satisfactory but also align with qualitative domain insights.