GraKeL: A Graph Kernel Library in Python (1806.02193v2)

Published 6 Jun 2018 in stat.ML and cs.LG

Abstract: The problem of accurately measuring the similarity between graphs is at the core of many applications in a variety of disciplines. Graph kernels have recently emerged as a promising approach to this problem. There are now many kernels, each focusing on different structural aspects of graphs. Here, we present GraKeL, a library that unifies several graph kernels into a common framework. The library is written in Python and adheres to the scikit-learn interface. It is simple to use and can be naturally combined with scikit-learn's modules to build a complete machine learning pipeline for tasks such as graph classification and clustering. The code is BSD licensed and is available at: https://github.com/ysig/GraKeL .

Summary

The paper introduces GraKeL as a unified Python library that consolidates 15 graph kernels within a standard scikit-learn interface.
It employs optimized Python tools, including NumPy, SciPy, and Cython, to enhance the efficiency of graph similarity computations.
Running-time benchmarks on datasets such as ENZYMES, together with a MUTAG classification example, illustrate GraKeL's efficiency and its integration into machine learning workflows.

GraKeL: A Comprehensive Graph Kernel Library in Python

The paper introduces GraKeL, a Python library designed to simplify the implementation and use of graph kernels. As graph-structured data becomes increasingly prevalent in fields such as social network analysis and bioinformatics, measuring the similarity between graphs remains a central problem. GraKeL addresses it by providing a single platform for many graph kernels.

Background and Objective

Graph kernels make it possible to apply kernel methods directly to graphs. A graph kernel is a symmetric, positive semidefinite function on pairs of graphs, so it corresponds to an inner product between graph representations in a Hilbert space and can serve as a similarity measure. The diversity of graph kernels stems from their focus on different structural aspects of graphs. GraKeL offers a cohesive framework that integrates multiple state-of-the-art graph kernels and, through compatibility with the scikit-learn interface, enables their use within machine learning pipelines.
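To make the inner-product view concrete, here is a minimal sketch of one of the simplest graph kernels, a vertex-label histogram kernel, written in plain Python. It is an illustration only and is not GraKeL's implementation; the function names and the toy label lists are assumptions made for this example.

```python
from collections import Counter


def vertex_histogram(labels):
    """Explicit feature map: count how often each vertex label occurs."""
    return Counter(labels)


def histogram_kernel(labels_a, labels_b):
    """Kernel value = inner product of the two label-count vectors."""
    phi_a = vertex_histogram(labels_a)
    phi_b = vertex_histogram(labels_b)
    # A Counter returns 0 for labels it has not seen, so absent labels contribute nothing.
    return sum(count * phi_b[label] for label, count in phi_a.items())


# Three toy graphs, each reduced to its list of vertex labels (hypothetical data).
graphs = [["C", "C", "O"], ["C", "O", "O", "N"], ["N", "N"]]

# Gram (kernel) matrix: entry [i][j] is the similarity of graph i and graph j.
gram = [[histogram_kernel(a, b) for b in graphs] for a in graphs]
```

Because every entry is an inner product of explicit feature vectors, the resulting matrix is symmetric and positive semidefinite, which is the property kernel methods such as SVMs rely on.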

Technical Contribution

GraKeL leverages several packages within the Python ecosystem to provide efficient computational capabilities:

NumPy and SciPy for fundamental data structures and operations.
Cython to optimize performance by incorporating C code within Python.
BLISS for graph isomorphism checks.
CVXOPT (optional) for handling specific optimization tasks.

The library strictly adheres to an object-oriented design, with graph kernels inheriting from a base Kernel class. Methods like fit, fit_transform, transform, and diagonal unify the interface for kernel operations, streamlining integration within machine learning workflows.

Comparison with Existing Solutions

Unlike previous disjointed implementations, GraKeL organizes kernels within a common, user-friendly framework, distinguishing itself through:

Breadth: Implementation of 15 kernels and 2 frameworks, significantly more than competitors like the graphkernels library.
Integration: Seamless compatibility with scikit-learn enhances accessibility for machine learning tasks.
Efficiency: Although written in Python, GraKeL outperforms competing implementations in several cases, as illustrated by benchmarking results on datasets such as ENZYMES.

Practical Implications and Future Directions

GraKeL's design supports graph mining and classification tasks, as illustrated by the paper's usage example, in which an SVM classifier trained on kernel matrices computed for the MUTAG dataset achieves notable accuracy. The library's common structure also makes it easier to develop and evaluate new kernels, which makes it a useful resource for researchers.
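The classification step consumes nothing but kernel matrices: a train-by-train matrix for fitting and a test-by-train matrix for prediction. The sketch below shows that data flow with a kernel perceptron in plain Python. This is a deliberately simpler stand-in for the SVM used in the paper's example, not the paper's code, and the small kernel matrices and labels are made-up values. With scikit-learn, the same matrices would typically be passed to `SVC(kernel="precomputed")`.

```python
def train_kernel_perceptron(K_train, y_train, epochs=10):
    """Learn dual coefficients from a precomputed train-by-train kernel matrix."""
    n = len(y_train)
    alpha = [0] * n  # alpha[i] counts how often training graph i was misclassified
    for _ in range(epochs):
        mistakes = 0
        for i in range(n):
            score = sum(alpha[j] * y_train[j] * K_train[i][j] for j in range(n))
            if y_train[i] * score <= 0:
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:  # every training graph is classified correctly
            break
    return alpha


def predict(K_test, alpha, y_train):
    """Classify new graphs from their test-by-train kernel rows."""
    predictions = []
    for row in K_test:
        score = sum(a * y * k for a, y, k in zip(alpha, y_train, row))
        predictions.append(1 if score > 0 else -1)
    return predictions


# Hypothetical kernel values for four training graphs and two test graphs.
K_train = [[4, 6, 0, 2],
           [6, 10, 2, 6],
           [0, 2, 4, 6],
           [2, 6, 6, 10]]
y_train = [1, 1, -1, -1]
K_test = [[6, 9, 0, 3],
          [0, 3, 6, 9]]

alpha = train_kernel_perceptron(K_train, y_train)
predictions = predict(K_test, alpha, y_train)
```

Note that the classifier never sees the graphs themselves, which is why any kernel exposing fit_transform and transform can be swapped in without changing the downstream code.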

The continuous growth of graph-structured data underlines the importance of efficient similarity measures. Libraries like GraKeL are likely to play an important role here by making such measures easier to apply, compare, and extend in graph-based learning.

In summary, GraKeL stands as a significant contribution to the domain of graph kernel methods, offering a comprehensive, efficient, and user-friendly solution that aligns with modern machine learning practices.

GitHub

GitHub - ysig/GraKeL: A scikit-learn compatible library for graph kernels (603 stars)