Learning Deep Generative Models of Graphs (1803.03324v1)

Published 8 Mar 2018 in cs.LG and stat.ML

Abstract: Graphs are fundamental data structures which concisely capture the relational structure in many important real-world domains, such as knowledge graphs, physical and social interactions, language, and chemistry. Here we introduce a powerful new approach for learning generative models over graphs, which can capture both their structure and attributes. Our approach uses graph neural networks to express probabilistic dependencies among a graph's nodes and edges, and can, in principle, learn distributions over any arbitrary graph. In a series of experiments our results show that once trained, our models can generate good quality samples of both synthetic graphs as well as real molecular graphs, both unconditionally and conditioned on data. Compared to baselines that do not use graph-structured representations, our models often perform far better. We also explore key challenges of learning generative models of graphs, such as how to handle symmetries and ordering of elements during the graph generation process, and offer possible solutions. Our work is the first and most general approach for learning generative models over arbitrary graphs, and opens new directions for moving away from restrictions of vector- and sequence-like knowledge representations, toward more expressive and flexible relational data structures.

Authors (5)
  1. Yujia Li (54 papers)
  2. Oriol Vinyals (116 papers)
  3. Chris Dyer (91 papers)
  4. Razvan Pascanu (138 papers)
  5. Peter Battaglia (40 papers)
Citations (633)

Summary

Learning Deep Generative Models of Graphs

The paper "Learning Deep Generative Models of Graphs" introduces a novel approach for constructing generative models specifically tailored for graphs, employing graph neural networks (GNNs) to capture probabilistic dependencies across nodes and edges. This research bypasses the limitations of traditional vector- and sequence-based models, advocating for a shift towards more flexible relational structures.

Introduction and Motivation

Graphs inherently represent various complex systems in areas such as social networks, molecular chemistry, and knowledge graphs. Traditional models, like random graph models and graph grammars, are limited by rigid structural assumptions. This work presents a model that operates on arbitrary graphs without such constraints, offering greater expressiveness without hand-engineered structural restrictions.

Methodology

The authors propose a sequential graph generation process that incrementally constructs graphs by making decisions on node and edge additions. This process involves three primary steps, sketched in code after the list:

  1. Node Generation: Decide whether to add a new node (and of what type) or terminate generation.
  2. Edge Addition: Decide whether to add an edge connecting the newly added node to the existing graph.
  3. Node Selection: Choose which existing node the new edge connects to; steps 2 and 3 repeat until the model decides to stop adding edges for the current node.
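
Below is a minimal Python sketch of this loop. The decision functions (f_addnode, f_addedge, f_nodes) are hypothetical placeholders for the GNN-parameterized modules in the paper; their names and signatures are illustrative assumptions, not the authors' API.

```python
import random

def generate_graph(f_addnode, f_addedge, f_nodes, max_nodes=50):
    """Incrementally build a graph as (node_types, edge_list)."""
    nodes, edges = [], []
    while len(nodes) < max_nodes:
        # Step 1: decide whether to add a node (returning its type) or stop.
        node_type = f_addnode(nodes, edges)
        if node_type is None:
            break
        v = len(nodes)
        nodes.append(node_type)
        # Steps 2-3: keep adding edges from the new node until the model
        # says stop; each edge's endpoint is chosen from the existing nodes.
        while v > 0 and f_addedge(nodes, edges, v):
            u = f_nodes(nodes, edges, v)
            edges.append((u, v))
    return nodes, edges

# Toy instantiation with random choices in place of learned modules:
nodes, edges = generate_graph(
    f_addnode=lambda n, e: "node" if random.random() < 0.9 else None,
    f_addedge=lambda n, e, v: random.random() < 0.5,
    f_nodes=lambda n, e, v: random.randrange(v),  # pick an earlier node
)
print(len(nodes), "nodes,", len(edges), "edges")
```

In the trained model, each of these decisions is a learned distribution, so the product of the step probabilities gives the likelihood of an entire generation sequence.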

These steps leverage graph neural networks to update node embeddings and maintain a dynamic representation of the growing graph, so each decision conditions on the full graph state rather than a fixed-length encoding. Because GNN computations are invariant to node permutations (graph isomorphisms), the model adapts to diverse graph structures and properties.
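
As one concrete illustration, here is a simplified message-passing round in PyTorch, in the spirit of the gated graph networks the model builds on. This is a sketch under simplifying assumptions (a single shared message function, a GRU-style node update); the paper's architecture additionally aggregates node states into graph-level vectors to drive the generation decisions.

```python
import torch
import torch.nn as nn

class GraphProp(nn.Module):
    """Simplified rounds of message passing over an edge list."""
    def __init__(self, dim):
        super().__init__()
        self.msg = nn.Linear(2 * dim, dim)  # message from (sender, receiver) states
        self.upd = nn.GRUCell(dim, dim)     # gated update of node states

    def forward(self, h, edges, rounds=2):
        # h: (num_nodes, dim) node embeddings; edges: list of (u, v) pairs.
        for _ in range(rounds):
            agg = torch.zeros_like(h)
            for u, v in edges:
                # propagate in both directions for undirected edges
                agg[v] = agg[v] + self.msg(torch.cat([h[u], h[v]]))
                agg[u] = agg[u] + self.msg(torch.cat([h[v], h[u]]))
            h = self.upd(agg, h)
        return h

# Example: four nodes on a path graph, 8-dimensional embeddings.
prop = GraphProp(dim=8)
h = prop(torch.randn(4, 8), edges=[(0, 1), (1, 2), (2, 3)])
```

Because messages are summed over neighbors, relabeling the nodes merely permutes the rows of h without changing any node's embedding, which is the invariance the decision modules rely on.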

Experiments and Results

The proposed model is rigorously evaluated across multiple tasks:

  • Synthetic Graphs: The model generates graphs with prescribed properties such as cycles and trees, outperforming LSTM baselines at reproducing the desired topological features.
  • Molecular Graphs: On the ChEMBL dataset, the model generates a high fraction of valid and novel molecular structures. Compared with baselines trained on SMILES strings, the graph-based approach captures chemical structure directly rather than through a linearized string encoding (a metric sketch follows this list).
  • Order Sensitivity: The paper also examines how node and edge ordering affects model performance. Experiments suggest that training on random orderings yields more robust models than committing to a single canonical ordering, which can encourage overfitting to order-specific patterns.
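
The paper reports validity and novelty rates for generated molecules. As a hedged illustration of how such metrics are typically computed, given SMILES-serialized samples (the paper does not prescribe specific tooling; RDKit is an assumption here):

```python
from rdkit import Chem  # assumed cheminformatics toolkit, not specified by the paper

def molecule_metrics(sampled_smiles, training_smiles):
    """Return (fraction valid, fraction valid and novel) for generated SMILES."""
    # Canonicalize the training set so novelty is checked on canonical forms.
    train = {Chem.MolToSmiles(Chem.MolFromSmiles(s)) for s in training_smiles}
    n_valid = n_novel = 0
    for s in sampled_smiles:
        mol = Chem.MolFromSmiles(s)  # returns None for chemically invalid SMILES
        if mol is None:
            continue
        n_valid += 1
        if Chem.MolToSmiles(mol) not in train:
            n_novel += 1
    total = len(sampled_smiles)
    return n_valid / total, n_novel / total
```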

Implications and Future Work

The research has implications for both theory and practice:

  • Flexibility and Expressiveness: The ability to model graphs without severe structural assumptions allows for broader applicability across domains where relational data is prevalent.
  • Scalability and Stability: While promising, the approach faces challenges with scalability and training stability, since each generation step requires rounds of message passing over the growing graph and the decision sequences can become long.

Future directions may include:

  • Enhancing scalability via optimization techniques that reduce the computational overhead of graph propagation steps.
  • Exploring potential ordering mechanisms to intelligently guide the node and edge addition process, aligning more closely with natural graph topologies.
  • Leveraging the model's adaptability in other areas such as knowledge representation or network analysis.

In conclusion, the paper lays foundational work for advanced generative models that inherently respect and utilize graph structures, pushing the boundaries of current generative model applications in AI and machine learning.