Learning Convolutional Neural Networks for Graphs
This paper introduces a compelling framework that extends Convolutional Neural Networks (CNNs) to arbitrary graph structures. Traditional CNNs excel at grid-like data, such as images, by leveraging local spatial coherence and shared weights. However, many real-world problems involve non-Euclidean data represented as graphs whose nodes and edges carry both discrete and continuous attributes.
The paper addresses two core problems:
- Learning functions for classification and regression on unseen graphs within a given graph collection.
- Learning representations of a single large graph in order to infer properties such as node types or missing edges.
The proposed solution is a systematic approach combining node sequence selection, neighborhood assembly, and graph normalization within a CNN architecture. The framework transforms arbitrary graphs into structured sequences of locally connected regions, analogous to image patches in CNNs, enabling effective feature learning and subsequent use in tasks such as classification and regression.
Methodological Contributions
- Node Sequence Selection: The paper uses graph labeling procedures to impose an order on the nodes of a graph, identifying a sequence of nodes around which receptive fields are constructed. Breadth-first search is then employed to assemble a neighborhood around each selected node, ensuring the inclusion of relevant nodes up to a fixed size (a code sketch of this pipeline follows the list below).
- Graph Normalization: To address the lack of inherent spatial order in graphs, the approach leverages graph labeling techniques (e.g., the Weisfeiler-Lehman algorithm) to normalize neighborhoods, yielding a consistent vector representation for structurally similar nodes across different graphs. The normalization problem is formalized and shown to be NP-hard, and practical heuristics for computing it are discussed.
- Convolutional Architecture: The resulting framework integrates with existing CNN infrastructure and can process both node and edge attributes. Convolution layers are tailored to the fixed-size receptive fields, keeping the approach efficient and applicable to large graphs, and thereby extending CNN capabilities to non-Euclidean domains.
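The following is a minimal sketch of the receptive-field pipeline under simplifying assumptions: the graph is given as an adjacency dict, a few rounds of Weisfeiler-Lehman-style color refinement stand in for the paper's labeling procedure, and normalization is reduced to sorting by label (the paper additionally ranks nodes by distance to the root and breaks ties with a canonicalization tool such as NAUTY). All function names here are illustrative, not the authors' reference implementation.

```python
# Sketch of PATCHY-SAN-style receptive-field construction.
# Assumes a graph as an adjacency dict {node: set(neighbors)}.
from collections import deque

def wl_labels(adj, rounds=2):
    """Weisfeiler-Lehman-style refinement: repeatedly combine each node's
    label with the sorted multiset of its neighbors' labels."""
    labels = {v: 0 for v in adj}  # start from a uniform coloring
    for _ in range(rounds):
        signatures = {
            v: (labels[v], tuple(sorted(labels[u] for u in adj[v])))
            for v in adj
        }
        # Compress distinct signatures to small integer labels.
        palette = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
        labels = {v: palette[signatures[v]] for v in adj}
    return labels

def bfs_neighborhood(adj, root, size):
    """Assemble at least `size` nodes around `root` via breadth-first search."""
    seen, order, queue = {root}, [root], deque([root])
    while queue and len(order) < size:
        for u in adj[queue.popleft()]:
            if u not in seen:
                seen.add(u)
                order.append(u)
                queue.append(u)
    return order

def receptive_fields(adj, width, k):
    """Select `width` nodes by label rank, then build a normalized,
    fixed-size (k) neighborhood for each; short neighborhoods are
    padded with a dummy node (None), as in the paper."""
    labels = wl_labels(adj)
    sequence = sorted(adj, key=lambda v: labels[v])[:width]
    fields = []
    for v in sequence:
        hood = bfs_neighborhood(adj, v, k)
        hood.sort(key=lambda u: labels[u])  # normalization: impose label order
        hood = (hood + [None] * k)[:k]      # truncate or pad to exactly k
        fields.append(hood)
    return fields
```

Calling receptive_fields(adj, width=18, k=10) on a small molecular graph would yield 18 normalized fields of 10 nodes each; mapping each node to its attribute vector (and None to zeros) produces the tensor consumed by the CNN sketched in the results section below.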
Experimental Results
The empirical evaluation demonstrates the framework's competitive performance against state-of-the-art graph kernels across multiple benchmark datasets, including chemical compounds and protein structures. Notably, the results highlight:
- Improved classification accuracy on several datasets, in some cases by a clear margin over existing graph kernels.
- Practical efficiency: receptive fields are generated quickly enough that the downstream CNN, rather than the preprocessing, remains the bottleneck.
For example, on the standard MUTAG benchmark, PATCHY-SAN (the proposed framework) achieved an accuracy of approximately 93% with a receptive field size of 10, outperforming existing methods such as the Weisfeiler-Lehman kernel.
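As a concrete illustration of how such receptive fields feed a standard CNN, here is a minimal PyTorch sketch (not from the paper; the layer widths and attribute dimension are illustrative assumptions). Flattening w receptive fields of k nodes each into a sequence of length w*k lets a Conv1d with kernel_size=k and stride=k apply one shared filter bank per field, mirroring the first convolutional layer the paper describes.

```python
# Minimal sketch: receptive fields flattened to (batch, attr_dim, w * k),
# i.e., w fields of k nodes each, attr_dim attributes per node.
import torch
import torch.nn as nn

w, k, attr_dim, n_classes = 18, 10, 7, 2  # illustrative sizes (k=10 as on MUTAG)

model = nn.Sequential(
    nn.Conv1d(attr_dim, 16, kernel_size=k, stride=k),  # one output position per field
    nn.ReLU(),
    nn.Conv1d(16, 8, kernel_size=5, stride=1),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(8 * (w - 5 + 1), n_classes),
)

x = torch.randn(32, attr_dim, w * k)  # a batch of 32 graphs
logits = model(x)                     # -> (32, n_classes)
```

The stride-k convolution is what makes the normalization step matter: each filter sees the k nodes of one field in a consistent, label-determined order, just as an image filter sees pixels in a fixed spatial order.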
Implications and Future Directions
The theoretical and empirical achievements of this framework open several avenues for further exploration and applications:
- Scalability: The framework's ability to handle large graph datasets efficiently positions it for extended use in domains like social networks and biological studies.
- Network Variations: Future work could explore alternative neural architectures, like Recurrent Neural Networks (RNNs) or Graph Neural Networks (GNNs), and different convolution strategies, potentially enhancing performance further.
- Feature Learning: Combining the framework with unsupervised feature learning methods (e.g., restricted Boltzmann machines) and pretraining schemes could enrich the learned representations and improve their generalizability.
Overall, this research establishes a robust foundation for applying deep learning techniques to graph-structured data, promising significant advancements in various scientific and practical domains relying on complex relational data.