Scalable Gromov-Wasserstein Learning for Graph Partitioning and Matching (1905.07645v5)

Published 18 May 2019 in cs.LG, cs.SI, and stat.ML

Abstract: We propose a scalable Gromov-Wasserstein learning (S-GWL) method and establish a novel and theoretically-supported paradigm for large-scale graph analysis. The proposed method is based on the fact that Gromov-Wasserstein discrepancy is a pseudometric on graphs. Given two graphs, the optimal transport associated with their Gromov-Wasserstein discrepancy provides the correspondence between their nodes and achieves graph matching. When one of the graphs has isolated but self-connected nodes (i.e., a disconnected graph), the optimal transport indicates the clustering structure of the other graph and achieves graph partitioning. Using this concept, we extend our method to multi-graph partitioning and matching by learning a Gromov-Wasserstein barycenter graph for multiple observed graphs; the barycenter graph plays the role of the disconnected graph, and since it is learned, so is the clustering. Our method combines a recursive $K$-partition mechanism with a regularized proximal gradient algorithm, whose time complexity is $\mathcal{O}(K(E+V)\log_K V)$ for graphs with $V$ nodes and $E$ edges. To our knowledge, our method is the first attempt to make Gromov-Wasserstein discrepancy applicable to large-scale graph analysis and unify graph partitioning and matching into the same framework. It outperforms state-of-the-art graph partitioning and matching methods, achieving a trade-off between accuracy and efficiency.



Summary

The paper introduces the S-GWL framework that extends Gromov-Wasserstein discrepancy to unify graph partitioning and matching.
It employs a recursive $K$-partition strategy with a regularized proximal gradient algorithm to achieve scalability with $\mathcal{O}(K(E+V)\log_K V)$ time complexity.
Experimental results demonstrate improved node matching accuracy and noise resilience, outperforming existing state-of-the-art graph analysis methods.

Scalable Gromov-Wasserstein Learning for Graph Partitioning and Matching: An Expert Overview

The paper introduces Scalable Gromov-Wasserstein Learning (S-GWL), a framework that extends Gromov-Wasserstein discrepancy to large-scale graph analysis, with a focus on graph partitioning and matching. The authors describe it as, to their knowledge, the first attempt to make Gromov-Wasserstein discrepancy applicable to large-scale graphs while unifying partitioning and matching in a single framework.

Gromov-Wasserstein distance was originally defined for metric-measure spaces: it compares two spaces through the pairwise relations within each one, so points in the two spaces never need to be compared directly. Applied to graphs, it yields the Gromov-Wasserstein discrepancy, a pseudometric whose associated optimal transport gives a correspondence between the nodes of two graphs. That correspondence performs graph matching. When one of the graphs is a disconnected graph of isolated but self-connected nodes, the same transport indicates the clustering structure of the other graph and so performs graph partitioning. S-GWL extends both operations to multi-graph analysis.
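To make the objective concrete, the sketch below evaluates a square-loss Gromov-Wasserstein objective $\sum_{i,j,k,l} (C^s_{ij} - C^t_{kl})^2 T_{ik} T_{jl}$ for a given transport matrix $T$. It assumes each graph is represented by an adjacency-like matrix (`Cs`, `Ct`) and that the square loss is used; the function names are illustrative and not taken from the paper. The second function is an algebraically equivalent form that avoids the four-index tensor.

```python
import numpy as np


def gw_objective_bruteforce(Cs, Ct, T):
    """Direct four-index sum; only practical for very small graphs."""
    # diff[i, j, k, l] = Cs[i, j] - Ct[k, l]
    diff = Cs[:, :, None, None] - Ct[None, None, :, :]
    return float(np.einsum("ijkl,ik,jl->", diff ** 2, T, T))


def gw_objective(Cs, Ct, T):
    """Same value computed with matrix products only."""
    p = T.sum(axis=1)  # marginal over source nodes
    q = T.sum(axis=0)  # marginal over target nodes
    source_term = p @ (Cs ** 2) @ p
    target_term = q @ (Ct ** 2) @ q
    cross_term = np.sum(T * (Cs @ T @ Ct.T))
    return float(source_term + target_term - 2.0 * cross_term)
```

When the second graph is a relabeled copy of the first and $T$ places all mass on the true node correspondence, both functions return zero up to floating-point error, which is the sense in which the transport "matches" the graphs.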

Methodological Framework

The S-GWL method combines a recursive $K$-partition mechanism with a regularized proximal gradient algorithm, giving a time complexity of $\mathcal{O}(K(E+V) \log_K V)$ for graphs with $V$ nodes and $E$ edges. For fixed $K$, the cost grows linearly in $E+V$ up to a logarithmic factor, which is what makes large-scale graphs tractable.
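One common way to realize a regularized proximal gradient scheme for this kind of objective is a proximal point iteration with a KL term, where each outer step reduces to Sinkhorn-style scaling. The sketch below follows that pattern under stated assumptions: square loss, dense matrices, fixed node distributions `p` and `q`, and hypothetical parameter names (`gamma`, `outer_iters`, `inner_iters`). It is not the authors' implementation and omits the sparse-matrix handling that the stated complexity relies on.

```python
import numpy as np


def square_loss_cost(Cs, Ct, T, p, q):
    """Cost matrix of the square-loss GW objective at the current transport T."""
    source_part = np.outer((Cs ** 2) @ p, np.ones_like(q))
    target_part = np.outer(np.ones_like(p), (Ct ** 2) @ q)
    return source_part + target_part - 2.0 * Cs @ T @ Ct.T


def proximal_gw(Cs, Ct, p, q, gamma=0.5, outer_iters=20, inner_iters=50):
    """Proximal point iterations with a KL proximal term (illustrative sketch)."""
    T = np.outer(p, q)  # start from the independent coupling
    for _ in range(outer_iters):
        cost = square_loss_cost(Cs, Ct, T, p, q)
        # The KL proximal term multiplies the Gibbs kernel by the previous T.
        # Subtracting cost.min() only rescales the kernel, which the
        # scaling vectors below absorb.
        kernel = np.exp(-(cost - cost.min()) / gamma) * T
        b = np.ones_like(q)
        for _ in range(inner_iters):
            a = p / (kernel @ b)    # rescale rows toward marginal p
            b = q / (kernel.T @ a)  # rescale columns toward marginal q
        T = a[:, None] * kernel * b[None, :]
    return T
```

Calling `proximal_gw(Cs, Ct, p, q)` with two small adjacency matrices and uniform `p` and `q` returns a non-negative matrix whose columns sum to `q`. The rows sum to `p` only approximately, depending on how far the inner loop has converged.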

For multiple graphs, the method learns a Gromov-Wasserstein barycenter graph that plays the role of the disconnected graph, so the clustering is learned together with the barycenter. Applied recursively, this decomposes large graphs into aligned sub-graphs, so that matching and partitioning are carried out on much smaller sub-problems.
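The recursion itself can be sketched independently of the partitioning step. In the code below, `split` is a hypothetical callback that stands in for the Gromov-Wasserstein-based $K$-partition, and `even_split` is only a placeholder that cuts a node list into contiguous chunks. The point is the control flow: with roughly balanced splits, the recursion depth is about $\log_K V$, which is where the logarithmic factor in the complexity comes from.

```python
from typing import Callable, List, Sequence, Tuple


def recursive_k_partition(
    nodes: Sequence[int],
    k: int,
    leaf_size: int,
    split: Callable[[Sequence[int], int], List[List[int]]],
) -> Tuple[List[List[int]], int]:
    """Split `nodes` recursively until every group has at most `leaf_size` nodes.

    Returns the leaf groups and the recursion depth that was reached.
    """
    if len(nodes) <= leaf_size:
        return [list(nodes)], 0
    leaves: List[List[int]] = []
    depth = 0
    for part in split(nodes, k):
        if not part:
            continue
        if len(part) == len(nodes):
            # The splitter made no progress; stop to avoid infinite recursion.
            return [list(nodes)], 0
        sub_leaves, sub_depth = recursive_k_partition(part, k, leaf_size, split)
        leaves.extend(sub_leaves)
        depth = max(depth, sub_depth + 1)
    return leaves, depth


def even_split(nodes: Sequence[int], k: int) -> List[List[int]]:
    """Placeholder splitter (NOT the GW-based step): k contiguous chunks."""
    size = -(-len(nodes) // k)  # ceiling division
    return [list(nodes[i:i + size]) for i in range(0, len(nodes), size)]
```

For example, `recursive_k_partition(list(range(27)), 3, 1, even_split)` reaches single-node leaves after three levels, matching $\log_3 27 = 3$.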

Numerical Results and Implications

According to the reported experiments, S-GWL outperforms state-of-the-art alternatives in several settings while trading off accuracy against computational efficiency. In particular, it achieves higher node correctness in graph matching tasks and is more robust to noise in graph partitioning tasks. Because a single framework covers both tasks, it offers a common basis for further research and practical applications in network alignment, community detection, and related areas.
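Node correctness can be read as the fraction of nodes whose predicted match agrees with the ground truth. The sketch below assumes the predicted match for each source node is the target node receiving the most transport mass, and that the ground truth is given as one target index per source node. Both conventions, and the function name, are assumptions made for illustration.

```python
import numpy as np


def node_correctness(transport, true_match):
    """Fraction of source nodes whose highest-mass target equals the true target."""
    predicted = np.argmax(transport, axis=1)
    return float(np.mean(predicted == np.asarray(true_match)))
```

For a 3-by-3 transport matrix whose row-wise maxima fall in columns 0, 2, and 1, and a ground truth of `[0, 1, 2]`, only the first node is matched correctly, so the score is one third.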

Future Directions

Future research could explore setting hyperparameters adaptively and exploiting parallel processing or distributed systems for additional efficiency. Addressing the limitations posed by unmatched scalability between graphs and incorporating node-level features could also extend the method's applicability, particularly in bioinformatics for tasks such as protein-protein interaction network alignment.

In conclusion, S-GWL offers a unified approach to large-scale graph partitioning and matching, and it may serve as a foundation for later work on scalable graph analysis.
