- The paper introduces a novel explanation method that shifts focus from nodes and edges to message flows, offering a natural lens for understanding GNN operations.
- It employs Shapley value approximations and a flow sampling algorithm to efficiently quantify the importance of message flows while managing computational complexity.
- Experimental results show that FlowX outperforms conventional methods on synthetic and real-world datasets, significantly improving the interpretability of GNNs.
FlowX: Advancements in Explainable Graph Neural Networks
This paper presents FlowX, a novel approach to explaining Graph Neural Networks (GNNs) that concentrates on message flows rather than traditional node-, edge-, or subgraph-level explanations. A message flow is the path a message takes as it propagates across the network's layers, and the authors argue that these flows, as the basic units of GNN computation, serve as a more natural medium for understanding GNN behavior. To quantify the importance of individual flows, FlowX employs Shapley values from cooperative game theory, introducing a flow sampling method to keep the computation tractable.
Key Contributions
- Message Flow-Based Explanation: FlowX frames GNN predictions as functions of the message flows traversing the network's layers. By scoring these flows directly, FlowX distinguishes itself from existing methods that operate on coarser components such as nodes, edges, or subgraphs (a minimal sketch of the flow view follows this list).
- Shapley Value Approximations: To assess the contribution of each flow, FlowX adapts Shapley values, which allocate importance fairly across cooperating players. Because computing marginal contributions over all possible coalitions is intractable, the authors propose a flow sampling algorithm that randomly permutes message carriers and accumulates marginal contributions, approximating the fair allocation without exhaustive enumeration (see the permutation sketch after this list).
- Learning-Based Refinement: FlowX additionally introduces a learning algorithm that uses the initial Shapley approximations to initialize and then train flow scores. The training objective can target either necessary or sufficient explanations, allowing the interpretability criterion to be tailored to specific academic or practical needs.
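To make the flow view concrete, here is a minimal sketch, assuming a message flow in a T-layer GNN is a length-T walk through the graph ending at the node being explained; the function name and toy graph are illustrative, not the paper's code.

```python
def enumerate_flows(edges, num_layers, target):
    """Enumerate all message flows (length-`num_layers` walks) ending at `target`."""
    incoming, nodes = {}, set()
    for u, v in edges:
        nodes.update((u, v))
        incoming.setdefault(v, set()).add(u)
    for n in nodes:
        # Self-loops: most message-passing schemes also let a node's own
        # representation flow through each layer.
        incoming.setdefault(n, set()).add(n)
    # Grow walks backwards from the target, one GNN layer at a time.
    flows = [(target,)]
    for _ in range(num_layers):
        flows = [(u,) + walk for walk in flows for u in incoming[walk[0]]]
    return flows

# Toy directed graph 0 -> 1 -> 2 -> 0, explaining node 2 of a 2-layer GNN.
for flow in enumerate_flows([(0, 1), (1, 2), (2, 0)], num_layers=2, target=2):
    print(" -> ".join(map(str, flow)))  # e.g. 0 -> 1 -> 2
```

The number of flows grows quickly with graph density and depth, which is exactly why FlowX resorts to sampling rather than exhaustive scoring.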
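And a minimal sketch of permutation-based Monte Carlo Shapley estimation over flows; `value_fn` is a hypothetical stand-in for evaluating the GNN with only a given coalition of flows active, and the paper's actual flow sampling is more structured than this generic loop.

```python
import random

def shapley_by_permutation(flows, value_fn, num_samples=100, seed=0):
    """Estimate each flow's Shapley value by averaging its marginal
    contribution over random orderings of the flows."""
    rng = random.Random(seed)
    scores = {f: 0.0 for f in flows}
    for _ in range(num_samples):
        order = list(flows)
        rng.shuffle(order)                     # one random coalition ordering
        active, prev = set(), value_fn(frozenset())
        for f in order:
            active.add(f)
            cur = value_fn(frozenset(active))  # output with this coalition active
            scores[f] += cur - prev            # marginal contribution of flow f
            prev = cur
    return {f: s / num_samples for f, s in scores.items()}
```

Averaging marginal contributions over random permutations converges to the exact Shapley value as the number of samples grows, which is what makes sampling a practical substitute for the exponential sum over coalitions.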
Experimental Evaluation
FlowX's empirical evaluation demonstrates consistent gains in explanation quality across synthetic and real-world datasets. The metrics Fidelity+ and Fidelity- assess, respectively, how necessary an explanation is (the prediction should degrade when it is removed) and how sufficient it is (the prediction should survive when only it is kept); a sketch of both metrics follows the dataset notes below. FlowX consistently outperforms existing methods such as GNNExplainer and PGExplainer, showing an improved ability to identify critical message flows, especially in scenarios requiring multi-hop reasoning.
- Synthetic Data: On datasets such as BA-Infection and BA-Traffic, FlowX's flow-based view captures the multi-step dependencies these benchmarks are built around, where traditional edge-based methods falter.
- Real-World Applications: On chemical and text datasets, FlowX achieves superior interpretability without sacrificing efficiency, producing more human-intelligible explanations even at high Sparsity levels.
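As referenced above, a minimal sketch of the two fidelity metrics, assuming `predict(graph)` returns the model's probability for the originally predicted class and that `remove` / `keep_only` are hypothetical helpers masking the explanation out of or into the graph:

```python
def fidelity_plus(predict, remove, graph, explanation):
    # Necessity: the prediction should drop once the explanation is removed,
    # so higher Fidelity+ is better.
    return predict(graph) - predict(remove(graph, explanation))

def fidelity_minus(predict, keep_only, graph, explanation):
    # Sufficiency: the prediction should survive when only the explanation
    # is kept, so lower Fidelity- is better.
    return predict(graph) - predict(keep_only(graph, explanation))
```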
Implications and Future Directions
FlowX represents a meaningful step forward in elucidating GNN mechanisms. Its methodology removes some of the opacity of deep graph models, aligning with the growing demand for AI systems that are both effective and transparent. By focusing on high-level message flows, FlowX not only deepens current understanding but also points to several future directions:
- Broader Causal Frameworks: The integration of causal models within GNNs could further enrich FlowX’s approach, accommodating more complex dependencies and external variables.
- Application Expansion: Extending FlowX’s framework to other neural architectures, such as those used in molecular dynamics or social network analysis, could yield insights across diverse domains.
- Real-Time Interpretability: Development of real-time or near-real-time flow-based explanation systems for dynamic graph data, offering broader usability in fast-paced environments.
FlowX's flow-centric perspective and strong empirical results position it as a valuable tool in the ongoing pursuit of transparent, interpretable AI. The paper highlights the potential of message flows for decoding the inner workings of GNNs and opens avenues for further research and application in explainable AI.