MAGNN: Metapath Aggregated Graph Neural Network for Heterogeneous Graph Embedding (2002.01680v2)

Published 5 Feb 2020 in cs.SI and cs.LG

Abstract: A large number of real-world graphs or networks are inherently heterogeneous, involving a diversity of node types and relation types. Heterogeneous graph embedding is to embed rich structural and semantic information of a heterogeneous graph into low-dimensional node representations. Existing models usually define multiple metapaths in a heterogeneous graph to capture the composite relations and guide neighbor selection. However, these models either omit node content features, discard intermediate nodes along the metapath, or only consider one metapath. To address these three limitations, we propose a new model named Metapath Aggregated Graph Neural Network (MAGNN) to boost the final performance. Specifically, MAGNN employs three major components, i.e., the node content transformation to encapsulate input node attributes, the intra-metapath aggregation to incorporate intermediate semantic nodes, and the inter-metapath aggregation to combine messages from multiple metapaths. Extensive experiments on three real-world heterogeneous graph datasets for node classification, node clustering, and link prediction show that MAGNN achieves more accurate prediction results than state-of-the-art baselines.

Authors (4)

Xinyu Fu (11 papers)
Jiani Zhang (21 papers)
Ziqiao Meng (12 papers)
Irwin King (170 papers)

Citations (745)

Summary

Metapath Aggregated Graph Neural Network for Heterogeneous Graph Embedding

The paper "MAGNN: Metapath Aggregated Graph Neural Network for Heterogeneous Graph Embedding" proposes a graph neural network for embedding heterogeneous graphs, that is, graphs with multiple node types and relation types, which are common in real-world applications. The goal is to encode a graph's structural and semantic information into low-dimensional node vectors.

Methodology Overview

MAGNN targets three limitations found in existing heterogeneous graph embedding methods: omitting node content features, discarding intermediate nodes along a metapath, and relying on a single metapath. It addresses them with three core components:

Node Content Transformation: This component ensures that node attributes, which may have varying dimensions or lie in different feature spaces, are projected into a shared latent vector space. This transformation facilitates subsequent aggregation processes.
Intra-metapath Aggregation: This step encodes the metapath instances that connect a target node to its metapath-based neighbors. Unlike prior models that consider only the end nodes, MAGNN encodes every node along an instance, intermediate nodes included, and then uses an attention mechanism to weight the instances of a given metapath, which captures finer structural and semantic detail.
Inter-metapath Aggregation: Recognizing that a single metapath can be insufficient, MAGNN uses attention to weight the importance of each metapath and fuses the per-metapath results into a single node representation.
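The three components can be illustrated with a minimal, pure-Python sketch. It is a simplified illustration under stated assumptions, not the authors' implementation: the toy movie graph, the function names, the random weights, the mean-based instance encoder, and the single-head attention scoring are all hypothetical choices made for readability. In particular, the inter-metapath step here scores metapaths for one node at a time, which may differ from how the paper computes metapath importance.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_sum(weights, vectors):
    return [sum(w * v[i] for w, v in zip(weights, vectors))
            for i in range(len(vectors[0]))]

# 1. Node content transformation: one projection matrix per node type maps
#    features of different sizes into a shared hidden space.
def transform(features, node_type, proj):
    return {n: matvec(proj[node_type[n]], x) for n, x in features.items()}

# 2. Intra-metapath aggregation: encode every node along each instance
#    (here with a plain mean, an assumed encoder), then attend over instances.
def encode_instance(instance, h):
    vecs = [h[n] for n in instance]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def intra_metapath(target, instances, h, attn):
    encoded = [encode_instance(inst, h) for inst in instances]
    # Score each instance from the concatenation [target vector, instance vector].
    scores = [dot(attn, h[target] + e) for e in encoded]
    return weighted_sum(softmax(scores), encoded)

# 3. Inter-metapath aggregation: attention over the per-metapath summaries.
def inter_metapath(summaries, query):
    scores = [dot(query, [math.tanh(x) for x in s]) for s in summaries]
    return weighted_sum(softmax(scores), summaries)

# Hypothetical toy graph: movies (3 features), an actor (2), a director (4).
random.seed(0)
HIDDEN = 4

def rand_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

node_type = {"m1": "movie", "m2": "movie", "m3": "movie",
             "a1": "actor", "d1": "director"}
features = {"m1": [1.0, 0.0, 0.5], "m2": [0.2, 0.9, 0.1], "m3": [0.4, 0.4, 0.4],
            "a1": [1.0, 0.0], "d1": [0.3, 0.7, 0.1, 0.5]}
proj = {"movie": rand_matrix(HIDDEN, 3), "actor": rand_matrix(HIDDEN, 2),
        "director": rand_matrix(HIDDEN, 4)}
h = transform(features, node_type, proj)

# Metapath instances starting at target node m1 (Movie-Actor-Movie, Movie-Director-Movie).
instances = {"MAM": [("m1", "a1", "m2")],
             "MDM": [("m1", "d1", "m2"), ("m1", "d1", "m3")]}
attn = {p: [random.uniform(-1, 1) for _ in range(2 * HIDDEN)] for p in instances}
query = [random.uniform(-1, 1) for _ in range(HIDDEN)]

summaries = [intra_metapath("m1", instances[p], h, attn[p]) for p in instances]
embedding = inter_metapath(summaries, query)
print(embedding)
```

Because both attention steps use softmax weights, the final embedding is a convex combination of the per-metapath summaries, and each summary is a convex combination of its encoded instances. A trained model would learn `proj`, `attn`, and `query` rather than sampling them at random.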

Experimental Evaluation

The effectiveness of MAGNN is validated through extensive experiments on three datasets (IMDb, DBLP, and Last.fm) covering three tasks: node classification, node clustering, and link prediction. Notable findings include:

Node Classification and Clustering: On IMDb and DBLP datasets, MAGNN consistently outperforms traditional models (e.g., LINE, node2vec) and recent heterogeneous GNN models (e.g., HAN) across various training sizes.
Link Prediction: On the Last.fm dataset, MAGNN surpasses baselines such as metapath2vec and HAN by significant margins, a gain attributable to its use of multiple metapaths and intermediate nodes.
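To make the link prediction setting concrete, the sketch below shows one common way to turn node embeddings into link scores and evaluate them. This summary does not state the scoring function or the metric used, so both the sigmoid-of-dot-product score and the AUC computation are assumptions, and the user and artist embeddings are made-up toy values.

```python
import math

def link_score(u, v):
    """Assumed scoring rule: sigmoid of the dot product of two embeddings."""
    return 1.0 / (1.0 + math.exp(-sum(a * b for a, b in zip(u, v))))

def auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical 2-dimensional embeddings for two users and two artists.
emb = {"user1": [0.9, 0.1], "user2": [0.1, 0.8],
       "artistA": [1.0, 0.0], "artistB": [0.0, 1.0]}
pos_pairs = [("user1", "artistA"), ("user2", "artistB")]  # observed links
neg_pairs = [("user1", "artistB"), ("user2", "artistA")]  # sampled non-links

pos_scores = [link_score(emb[a], emb[b]) for a, b in pos_pairs]
neg_scores = [link_score(emb[a], emb[b]) for a, b in neg_pairs]
print(auc(pos_scores, neg_scores))
```

In this toy example every observed link scores higher than every sampled non-link, so the ranking is perfect; real evaluations use held-out edges and many sampled negatives.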

Implications and Future Work

MAGNN's architecture presents several advantages for theoretical and practical applications in heterogeneous graph analysis. By incorporating detailed node content and comprehensive metapath information into the embedding process, the model promises improvements in tasks requiring understanding of complex node interactions and relations.

Potential application areas include social network analysis and recommendation systems. Future work could explore adapting MAGNN to additional tasks such as rating prediction, or integrating it with knowledge graphs to enhance its predictive capabilities.

MAGNN's framework underscores the importance of node attributes and metapath contexts, offering a flexible tool for researchers working with heterogeneous graph data.
