Overview of SPAGAN: Shortest Path Graph Attention Network
The paper presents a novel approach to Graph Convolutional Networks (GCNs) designed to improve how information is represented and aggregated over graph structures. The proposed method, called Shortest Path Graph Attention Network (SPAGAN), departs from the node-based attention used in existing GCN models: it introduces path-based attention, using the shortest paths between nodes to aggregate information and explore graph topology more effectively.
Motivation and Methodology
Graph Convolutional Networks have shown promise on non-grid data, enabling effective feature learning from graph topology. Earlier approaches, such as the GCN of Kipf and Welling and the Graph Attention Network (GAT) of Velickovic et al., aggregate information from first-order neighbors within each layer, with GAT additionally learning node-based attention over those neighbors. Although stacking multiple layers in principle reaches higher-order neighbors, doing so often leads to convergence difficulties and performance degradation.
To address these limitations, SPAGAN uses shortest-path-based attention, which extracts information from higher-order neighbors without requiring many stacked layers. By treating each shortest path as a sequence of nodes, SPAGAN explores the graph topology more thoroughly and produces more informative feature embeddings.
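To make the idea of shortest-path neighborhoods concrete, the following minimal sketch groups the shortest paths from a center node by their hop count. It assumes an unweighted NetworkX graph, so the BFS-based single_source_shortest_path routine coincides with the Dijkstra computation described in the paper; the function name and structure are illustrative, not taken from the SPAGAN reference implementation.

```python
# Minimal sketch: collect shortest paths from a center node, grouped by length.
# Assumes an unweighted NetworkX graph (BFS == Dijkstra with unit edge costs).
import networkx as nx

def shortest_path_neighborhood(graph, center, max_depth=3):
    """Group shortest paths starting at `center` by their length in hops."""
    # single_source_shortest_path returns {target: [center, ..., target]}
    paths = nx.single_source_shortest_path(graph, center, cutoff=max_depth)
    paths_by_length = {}
    for target, path in paths.items():
        depth = len(path) - 1  # number of hops on the path
        if depth == 0:
            continue  # skip the trivial zero-length path to the center itself
        paths_by_length.setdefault(depth, []).append(path)
    return paths_by_length

G = nx.karate_club_graph()
for depth, paths in sorted(shortest_path_neighborhood(G, 0, max_depth=2).items()):
    print(f"depth {depth}: {len(paths)} shortest paths")
```

Grouping the paths by length mirrors how SPAGAN later attends over paths of the same length before combining across lengths.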
Key Components
- Shortest Path Generation: The approach uses Dijkstra's algorithm to compute shortest paths between nodes. Unlike node-based attention, which considers only immediate neighbors, SPAGAN uses these paths to incorporate information from more distant nodes.
- Path Sampling: To reduce computation and focus on the most relevant paths, SPAGAN keeps only the lowest-cost paths. The number of sampled paths is proportional to the node's degree, keeping the representation efficient across nodes with different connectivity.
- Hierarchical Path Aggregation: SPAGAN aggregates in two steps. First, it aggregates features within paths of the same length, using attention coefficients computed between each path and the center node. Second, it combines the resulting embeddings across different path lengths to produce the node's updated feature. This hierarchy captures multi-length path information within a single layer; a simplified numerical sketch follows this list.
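The sketch below illustrates the two-step aggregation on pre-computed path features (e.g., obtained by averaging the features of the nodes on each path). It is a rough approximation: the dot-product scoring stands in for the paper's learned attention parameters, and all names are illustrative.

```python
# Rough sketch of hierarchical path aggregation over pre-computed path features.
# Dot-product scores replace the learned attention of the paper; illustrative only.
import numpy as np

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def aggregate_paths(center_feat, paths_by_length):
    """center_feat: (d,) feature of the center node.
    paths_by_length: {length: array of shape (num_paths, d)} path features."""
    per_length_embeddings = []
    for length in sorted(paths_by_length):
        path_feats = paths_by_length[length]               # (p, d)
        # Step 1: attention over paths of the same length.
        alpha = softmax(path_feats @ center_feat)           # (p,)
        per_length_embeddings.append(alpha @ path_feats)    # (d,)
    per_length = np.stack(per_length_embeddings)            # (c, d)
    # Step 2: attention over the different path lengths.
    beta = softmax(per_length @ center_feat)                # (c,)
    return beta @ per_length                                 # (d,) updated embedding

rng = np.random.default_rng(0)
d = 8
center = rng.standard_normal(d)
paths = {1: rng.standard_normal((4, d)), 2: rng.standard_normal((6, d))}
print(aggregate_paths(center, paths).shape)  # (8,)
```

The two softmax steps correspond to the path-level and length-level attentions described above; in the actual model both are learned jointly with the node features.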
Experimental Results and Implications
SPAGAN was evaluated on the standard semi-supervised node-classification benchmarks Cora, Citeseer, and Pubmed, where it achieved higher classification accuracy than baselines and state-of-the-art methods including MLP, DeepWalk, GAT, and GeniePath. The consistent gains across all three datasets indicate that encoding shortest-path information captures graph topology more effectively than first-order aggregation alone.
The introduction of SPAGAN has implications for both theoretical and practical work on graph learning, particularly in applications that reason over graph-structured data, such as social network analysis and chemical informatics. Future work could examine the path sampling ratio, the maximum path length, and the iterative training procedure to further improve SPAGAN's performance and scalability on larger graphs.
Conclusion
SPAGAN represents a solid advance in graph attention mechanisms, using shortest-path-based attention to capture high-order neighbor information within a single layer. The method holds promise for improving graph-based learning models and could serve as a foundation for future research on graph convolutional networks.