Graph Neural Tangent Kernel: Fusing Graph Neural Networks with Graph Kernels (1905.13192v2)

Published 30 May 2019 in cs.LG, cs.AI, cs.CV, and stat.ML

Abstract: While graph kernels (GKs) are easy to train and enjoy provable theoretical guarantees, their practical performances are limited by their expressive power, as the kernel function often depends on hand-crafted combinatorial features of graphs. Compared to graph kernels, graph neural networks (GNNs) usually achieve better practical performance, as GNNs use multi-layer architectures and non-linear activation functions to extract high-order information of graphs as features. However, due to the large number of hyper-parameters and the non-convex nature of the training procedure, GNNs are harder to train. Theoretical guarantees of GNNs are also not well-understood. Furthermore, the expressive power of GNNs scales with the number of parameters, and thus it is hard to exploit the full power of GNNs when computing resources are limited. The current paper presents a new class of graph kernels, Graph Neural Tangent Kernels (GNTKs), which correspond to infinitely wide multi-layer GNNs trained by gradient descent. GNTKs enjoy the full expressive power of GNNs and inherit advantages of GKs. Theoretically, we show GNTKs provably learn a class of smooth functions on graphs. Empirically, we test GNTKs on graph classification datasets and show they achieve strong performance.

Citations (251)

Summary

  • The paper presents GNTKs by merging the expressive power of GNNs with the tractable analysis of graph kernels.
  • It shows that for infinitely wide GNNs, gradient-descent training dynamics coincide with those of a deterministic kernel method, enabling efficient learning.
  • Empirical results demonstrate state-of-the-art performance on benchmarks, highlighting a scalable alternative for graph-based tasks.

A Detailed Analysis of Graph Neural Tangent Kernels

The paper "Graph Neural Tangent Kernel: Fusing Graph Neural Networks with Graph Kernels" introduces a novel class of graph kernels named Graph Neural Tangent Kernels (GNTKs). The paper is authored by researchers from a collaboration of prestigious institutions including Carnegie Mellon University, Massachusetts Institute of Technology, and the Institute for Advanced Study. The work proposes an overview of Graph Neural Networks (GNNs) and traditional Graph Kernels (GKs), addressing limitations and strengths present in both methods.

Overview

Graph-structured data, prevalent in domains such as social networks and biological networks, requires advanced methods for effective learning. GKs, which are based on combinatorial features, provide theoretical guarantees and are easy to train because fitting them reduces to convex optimization. However, their ability to capture rich graph features is limited. Conversely, GNNs excel at capturing high-order graph features through their multi-layer architectures and non-linear transformations, offering practical performance advantages. Yet GNNs suffer from complicated training dynamics due to non-convexity and demand substantial computational resources, which undermines stability and interpretability.

GNTKs are introduced as a hybrid approach that overcomes these issues by combining the strengths of GNNs and GKs. The approach builds on the framework of neural tangent kernels (NTKs), which characterize the behavior of infinitely wide neural networks trained by gradient descent. In essence, a GNTK captures the learning dynamics of the corresponding GNN while maintaining tractable training and theoretical benefits analogous to GKs.
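
Concretely, the tangent kernel associated with a network is the expected inner product of its parameter gradients at initialization; in the infinite-width limit this kernel stays fixed throughout training, so learning reduces to kernel regression. The display below is a standard-notation sketch of this construction (the symbols $f$, $\theta$, $\mathbf{G}$, and $\mathbf{y}$ are illustrative choices, not taken verbatim from the paper):

$$
\Theta(G, G') = \mathbb{E}_{\theta}\!\left[\left\langle \frac{\partial f(\theta, G)}{\partial \theta},\; \frac{\partial f(\theta, G')}{\partial \theta} \right\rangle\right],
\qquad
\hat{f}(G_\ast) = \Theta(G_\ast, \mathbf{G})\,\Theta(\mathbf{G}, \mathbf{G})^{-1}\,\mathbf{y},
$$

where $f(\theta, G)$ is the scalar GNN output on graph $G$, the expectation is over random initialization, $\mathbf{G}$ collects the training graphs, and $\mathbf{y}$ their labels.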

Key Contributions

  1. Integration of GNNs and GKs: The research integrates GNN architectures with GK methodologies. It leverages the expressive power of neural networks, which scales with the number of parameters, while retaining the benefits of GKs, such as ease of analysis and provable learning guarantees.
  2. Kernel Equivalence of Infinitely Wide Networks: GNTKs are derived from the realization that for sufficiently wide and over-parameterized neural networks, training dynamics can align with those of kernel methods, where the kernel is deterministically defined by network architecture and data.
  3. Architecture-Independent Recipe: The work provides a systematic translation mechanism from a given GNN architecture to its corresponding GNTK, ensuring broad applicability across popular GNN variants such as the Graph Isomorphism Network (GIN) and Graph Convolutional Networks (GCNs); a simplified sketch of this recipe follows the list below.
  4. Theoretical Analysis and Empirical Evaluation: The paper conducts a theoretical analysis demonstrating that GNTKs can learn a wide class of smooth functions on graphs with a polynomial number of samples, an advancement in understanding sample complexity in the context of graph-based learning. Empirically, GNTKs show competitive performance on several benchmark datasets, achieving state-of-the-art results on select tasks while being computationally efficient relative to traditional GNNs.
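
To make the translation recipe concrete, the following is a minimal sketch of one GNTK block in the spirit of the paper's recursion: a sum aggregation over neighborhoods (with self-loops) followed by a ReLU fully-connected update computed with the closed-form arc-cosine expressions. All function and variable names are illustrative assumptions, and the paper's normalization factors and final readout are omitted.

```python
import numpy as np

def relu_expectations(sig, diag1, diag2):
    """Closed-form E[relu(a)relu(b)] and E[relu'(a)relu'(b)] for (a, b)
    jointly Gaussian with cross-covariance sig and variances diag1, diag2
    (the standard arc-cosine formulas for ReLU)."""
    norm = np.sqrt(np.outer(diag1, diag2))
    lam = np.clip(sig / np.maximum(norm, 1e-12), -1.0, 1.0)
    angle = np.arccos(lam)
    new_sig = norm * (lam * (np.pi - angle) + np.sqrt(1.0 - lam ** 2)) / (2 * np.pi)
    new_dot = (np.pi - angle) / (2 * np.pi)
    return new_sig, new_dot

def gntk_block(adj1, adj2, sig12, sig11, sig22, theta12):
    """One simplified GNTK block: sum aggregation over neighborhoods
    (with self-loops) followed by a single ReLU fully-connected update.
    adj1, adj2 : 0/1 adjacency matrices of the two graphs (no self-loops)
    sig12      : node-pair covariance between graph 1 and graph 2
    sig11/22   : within-graph node covariances (needed for normalization)
    theta12    : running NTK values for the node pairs"""
    a1 = adj1 + np.eye(adj1.shape[0])   # include the node itself
    a2 = adj2 + np.eye(adj2.shape[0])

    # Aggregation step: sum covariances over both neighborhoods.
    sig12 = a1 @ sig12 @ a2.T
    sig11 = a1 @ sig11 @ a1.T
    sig22 = a2 @ sig22 @ a2.T
    theta12 = a1 @ theta12 @ a2.T

    # ReLU dense-layer update via the arc-cosine expressions.
    d1, d2 = np.diag(sig11), np.diag(sig22)
    new_sig12, dot12 = relu_expectations(sig12, d1, d2)
    new_sig11, _ = relu_expectations(sig11, d1, d1)
    new_sig22, _ = relu_expectations(sig22, d2, d2)

    # NTK recursion: Theta <- Theta * Sigma_dot + Sigma.
    theta12 = theta12 * dot12 + new_sig12
    return new_sig12, new_sig11, new_sig22, theta12
```

At the input layer one would initialize sig12 from node features (e.g. X1 @ X2.T) and set theta12 = sig12; stacking several such blocks and summing theta12 over all node pairs gives a graph-level kernel value, and the resulting Gram matrix can be plugged into any kernel method, such as a kernel SVM or kernel regression.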

Practical Implications and Future Directions

The introduction of GNTKs has significant implications for both theory and practice in graph-based machine learning. Theoretically, it advances understanding regarding the feasibility of learning specific function classes with graph data. Practically, it promises improved performance in tasks where graph structures are paramount by providing a scalable and interpretable alternative to conventional GNNs. Future research could explore extending the framework to dynamic or streaming graph data and optimizing computational efficiency further, potentially allowing deployment in resource-constrained environments.

Conclusion

GNTKs represent a pivotal advancement, fusing the robustness and tractability of GKs with the expressive power of GNNs. As the landscape of graph-based models continues to evolve, the methods proposed could pave the way for more effective, scalable, and theoretically grounded approaches to graph learning, offering a valuable toolset for researchers and practitioners exploring complex graph-centric domains.