
Propagation Kernels (1410.3314v1)

Published 13 Oct 2014 in stat.ML and cs.LG

Abstract: We introduce propagation kernels, a general graph-kernel framework for efficiently measuring the similarity of structured data. Propagation kernels are based on monitoring how information spreads through a set of given graphs. They leverage early-stage distributions from propagation schemes such as random walks to capture structural information encoded in node labels, attributes, and edge information. This has two benefits. First, off-the-shelf propagation schemes can be used to naturally construct kernels for many graph types, including labeled, partially labeled, unlabeled, directed, and attributed graphs. Second, by leveraging existing efficient and informative propagation schemes, propagation kernels can be considerably faster than state-of-the-art approaches without sacrificing predictive performance. We will also show that if the graphs at hand have a regular structure, for instance when modeling image or video data, one can exploit this regularity to scale the kernel computation to large databases of graphs with thousands of nodes. We support our contributions by exhaustive experiments on a number of real-world graphs from a variety of application domains.

Citations (239)

Summary

  • The paper introduces propagation kernels that integrate propagation schemes like random walks to capture detailed graph structure using node labels and attributes.
  • It employs locality-sensitive hashing to compress node distributions, reducing computational complexity for diverse graph types including partially labeled graphs.
  • Empirical results demonstrate that propagation kernels achieve competitive accuracy and faster processing across various applications from bioinformatics to computer vision.

An Overview of Propagation Kernels

The paper "Propagation Kernels," authored by Marion Neumann, Roman Garnett, Christian Bauckhage, and Kristian Kersting, introduces an advanced approach for measuring graph similarities within the domain of graph-kernel frameworks. The primary contribution of this research is the development and introduction of propagation kernels, which build on sophisticated propagation schemes such as random walks to capture and utilize structural information encoded in node labels and attributes, as well as edge information across varied graph types.

Propagation kernels compare graphs through the early-stage distributions produced by such propagation schemes. This offers two significant advantages: they accommodate a wide range of graph types, including labeled, partially labeled, unlabeled, directed, and attributed graphs; and they can be computed considerably faster than existing state-of-the-art graph-kernel approaches without compromising predictive performance.
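The basic propagation step that the kernels monitor can be sketched in a few lines. The toy graph, labels, and single diffusion step below are illustrative assumptions, not the paper's experimental setup:

```python
import numpy as np

def propagation_step(A, P):
    """One diffusion step: each node's label distribution becomes the
    average of its neighbors' distributions (row-normalized adjacency)."""
    T = A / A.sum(axis=1, keepdims=True)  # transition matrix of a random walk
    return T @ P

# Toy path graph 0 - 1 - 2 with one-hot label distributions per node.
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
P0 = np.array([[1., 0.],   # node 0: label 0
               [0., 1.],   # node 1: label 1
               [1., 0.]])  # node 2: label 0
P1 = propagation_step(A, P0)  # early-stage distributions after one step
```

After one step, each node's row reflects its neighborhood rather than only its own label; it is these intermediate distributions, not the converged ones, that the kernels compare.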

Key Contributions

The proposed method extends traditional graph-kernel methods by constructing kernels from off-the-shelf propagation schemes, such as diffusion or label propagation, yielding kernels that adapt to many graph types and remain computationally efficient on large graph databases. Because these schemes operate on label and attribute distributions rather than on hard labels, propagation kernels integrate uncertain or partial information, such as partially labeled graphs, more naturally than classical graph kernels that demand complete label or attribute information.
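For partially labeled graphs, this amounts to seeding unobserved nodes with uninformative distributions and letting propagation refine them. A minimal sketch, where the `-1` convention for a missing label is my own choice:

```python
import numpy as np

def init_distributions(labels, num_classes):
    """Observed labels become one-hot rows; unobserved nodes (marked -1
    here) start uniform and are refined by subsequent propagation."""
    P = np.full((len(labels), num_classes), 1.0 / num_classes)
    for i, y in enumerate(labels):
        if y >= 0:
            P[i] = np.eye(num_classes)[y]
    return P

# Node 1 is unlabeled: it contributes an uninformative uniform row.
P = init_distributions([0, -1, 1], num_classes=2)
```

A classical kernel that hashes discrete labels has no sensible value to assign node 1, whereas a distribution-based scheme simply propagates its neighbors' information into the uniform row.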

This research demonstrates that tracking nodes' label and/or attribute distributions as propagation proceeds yields an efficient framework. Locality-sensitive hashing (LSH) compresses these continuous distributions into discrete bins, so each graph is summarized by an easily computed count vector over bins. This enables rapid computation of graph features and scalability to extensive graph datasets. Crucially, similar node distributions hash to the same bin with high probability, allowing fast comparisons across potentially vast numbers of node distributions. This is a notable advantage over conventional methods like the Weisfeiler-Lehman (WL) kernel, especially in scenarios involving partially labeled data.
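The binning idea can be illustrated with a quantized random projection, a standard LSH family for vector data. The hash parameters and toy bin assignments below are illustrative assumptions; the paper designs its hash family around distances between probability distributions, for which this plain projection is only a generic stand-in:

```python
import numpy as np

def hash_distributions(P, v, b, w):
    """Quantized random projection: distributions whose projections onto v
    fall into the same width-w interval receive the same discrete bin."""
    return np.floor((P @ v + b) / w).astype(int)

def kernel_contribution(bins_g1, bins_g2):
    """Compare two graphs by the dot product of their per-bin node counts."""
    shared = np.union1d(bins_g1, bins_g2)
    c1 = np.array([(bins_g1 == t).sum() for t in shared])
    c2 = np.array([(bins_g2 == t).sum() for t in shared])
    return int(c1 @ c2)

# Two toy graphs whose node distributions hashed to these bins:
k = kernel_contribution(np.array([0, 0, 1]), np.array([0, 1, 1]))  # -> 4
```

The full kernel accumulates such count-vector dot products over successive propagation iterations, so each iteration contributes a comparison of the graphs at a different propagation depth.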

Numerical and Empirical Evidence

The authors present exhaustive experiments across a variety of applications, such as protein structure classification in bioinformatics, semantic image analysis, image-based plant disease classification, and 3D object prediction, underscoring the practicality of propagation kernels on datasets of differing complexity. Propagation kernels match or exceed the accuracy of existing graph-kernel approaches while substantially reducing processing time and computational resources, a crucial requirement for datasets with millions of nodes, such as textures or dense 3D point clouds used in robotic applications.

These empirical results are substantiated by experiments on commonly used bioinformatics datasets where the propagation kernels frequently achieve comparable, if not superior, classification accuracy compared to established methods. Moreover, propagation kernels adapt to novel application domains such as semantic image classification using grid graphs, directly extending graph-based machine learning applications into realms traditionally untapped by graph kernels.

Theoretical and Practical Implications

Theoretically, this work shifts the reliance from exact graph node labeling toward a richly defined propagation scheme, paving the way for flexible, adaptable, and extendable graph kernels that cater to diverse graph types and datasets, scaling efficiently from molecular biology to computer vision and robotics. Practically, the framework handles settings where other methods struggle, such as large-scale, partially labeled graphs and image data requiring pixel-level analysis.

Future Directions

Future developments could extend propagation kernels to additional machine learning tasks such as clustering or regression, potentially offering unified solutions across other complex graph-based problems. Furthermore, leveraging different propagation schemes or integrating unsupervised learning methods may enhance the adaptability and robustness of propagation kernels across a wider range of applications, providing further momentum to research in machine learning on graphs.

In conclusion, the paper lays a strong foundation for the advancement of graph kernels, suggesting a broad and versatile framework that significantly extends the horizons for graph similarity measures and sets a solid premise for future exploration in graph-structured data analysis.