Graph2Seq: Graph to Sequence Learning with Attention-based Neural Networks (1804.00823v4)

Published 3 Apr 2018 in cs.AI, cs.CL, cs.LG, and stat.ML

Abstract: The celebrated Sequence to Sequence learning (Seq2Seq) technique and its numerous variants achieve excellent performance on many tasks. However, many machine learning tasks have inputs naturally represented as graphs; existing Seq2Seq models face a significant challenge in achieving accurate conversion from graph form to the appropriate sequence. To address this challenge, we introduce a novel general end-to-end graph-to-sequence neural encoder-decoder model that maps an input graph to a sequence of vectors and uses an attention-based LSTM method to decode the target sequence from these vectors. Our method first generates the node and graph embeddings using an improved graph-based neural network with a novel aggregation strategy to incorporate edge direction information in the node embeddings. We further introduce an attention mechanism that aligns node embeddings and the decoding sequence to better cope with large graphs. Experimental results on bAbI, Shortest Path, and Natural Language Generation tasks demonstrate that our model achieves state-of-the-art performance and significantly outperforms existing graph neural networks, Seq2Seq, and Tree2Seq models; using the proposed bi-directional node embedding aggregation strategy, the model can converge rapidly to the optimal performance.

Citations (164)

Summary

  • The paper introduces Graph2Seq, a model that directly encodes graph-structured data into sequences using an attention-based LSTM decoder.
  • It employs bi-directional node embedding aggregation to effectively capture both incoming and outgoing edge information.
  • The model shows superior performance on large graphs and diverse tasks, demonstrating its robust and flexible design.

Overview of Graph2Seq Model

The advancement of neural network models, particularly in the area of sequence learning, has seen substantial progress in tasks ranging from machine translation to speech recognition. These models typically perform well when inputs are naturally organized as sequences. However, not all data comes in sequence form: graph-structured data is common, particularly when the relationships between data points are not linear or hierarchical but interconnected in complex ways, as in social networks or molecular structures.

From Graph to Sequence

The new Graph2Seq model introduced by researchers at IBM Research presents an innovative solution to this challenge. Their method is an end-to-end trained model that converts graph-structured inputs into sequence outputs. Unlike previous approaches that flatten graphs into trees or sequences and thereby lose structural information, Graph2Seq encodes the input graph directly: it generates a sequence of node vectors and uses an attention-based LSTM (Long Short-Term Memory) decoder to produce the target sequence.

This approach consists of a graph encoder that generates an embedding for each node by aggregating information from the node's own features and its neighborhood. To do so, the model employs bi-directional node embedding aggregation: each node aggregates information through two distinct aggregators, one over its incoming edges (backward neighbors) and one over its outgoing edges (forward neighbors).
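
The following is a minimal sketch of this bi-directional aggregation idea, assuming mean-pooling aggregators and randomly initialized projection matrices for illustration; the function and variable names are hypothetical, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def aggregate(node_reprs, neighbor_ids):
    """Mean-pool the current representations of a node's neighbors."""
    if not neighbor_ids:
        return np.zeros_like(next(iter(node_reprs.values())))
    return np.mean([node_reprs[j] for j in neighbor_ids], axis=0)

def bidirectional_embeddings(features, fwd_adj, bwd_adj, hops=2):
    """features: {node: feature vector}; fwd_adj / bwd_adj map each node
    to its forward (outgoing-edge) and backward (incoming-edge) neighbors."""
    dim = len(next(iter(features.values())))
    # Separate (hypothetical) projection matrices per direction and hop.
    W_fwd = [rng.standard_normal((2 * dim, dim)) for _ in range(hops)]
    W_bwd = [rng.standard_normal((2 * dim, dim)) for _ in range(hops)]

    h_fwd = dict(features)  # representations built from outgoing edges
    h_bwd = dict(features)  # representations built from incoming edges
    for k in range(hops):
        h_fwd = {
            v: np.tanh(np.concatenate([h_fwd[v], aggregate(h_fwd, fwd_adj.get(v, []))]) @ W_fwd[k])
            for v in features
        }
        h_bwd = {
            v: np.tanh(np.concatenate([h_bwd[v], aggregate(h_bwd, bwd_adj.get(v, []))]) @ W_bwd[k])
            for v in features
        }
    # Final node embedding: concatenation of the forward and backward views.
    return {v: np.concatenate([h_fwd[v], h_bwd[v]]) for v in features}
```

The key design choice mirrored here is that the two edge directions never share an aggregator, so a node's embedding reflects both the information flowing into it and the information flowing out of it.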

Coping with Large Graphs

When dealing with large graphs, Graph2Seq uses an attention mechanism over the node embeddings. This allows the decoder to focus on specific parts of the graph at each decoding step, making the model more robust as graph size grows. Such attention is crucial because compressing a detailed graph into a single fixed-length vector becomes increasingly lossy for larger inputs.
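
A minimal sketch of one such attention step, assuming simple dot-product scoring between the decoder state and the node embeddings; the scoring function and names are illustrative rather than the paper's exact formulation.

```python
import numpy as np

def attend(decoder_state, node_embeddings):
    """decoder_state: shape (d,); node_embeddings: shape (num_nodes, d).
    Returns the context vector and the attention weights over nodes."""
    scores = node_embeddings @ decoder_state   # one score per node
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                   # softmax over nodes
    context = weights @ node_embeddings        # weighted sum of node embeddings
    return context, weights

# Toy usage: 5 nodes, embedding size 4.
rng = np.random.default_rng(1)
nodes = rng.standard_normal((5, 4))
state = rng.standard_normal(4)
context, weights = attend(state, nodes)
print(weights.round(3), context.round(3))
```

At each decoding step the context vector is combined with the decoder state to predict the next output token, so the decoder can consult different nodes at different steps instead of relying on one fixed graph summary.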

Proving Its Effectiveness

The Graph2Seq model was tested through a series of experiments on synthetic and real-world tasks (bAbI, Shortest Path, and Natural Language Generation), and the results demonstrated its superiority over existing models. The attention mechanism yielded a significant performance boost, especially on larger graphs, where encoding the entire graph into a fixed-length vector becomes more challenging.

Given its flexibility, the researchers anticipate a broad range of applications for their Graph2Seq model, which could potentially bridge the gap between symbolic AI representations and sequence output tasks. The bi-directional node embedding approach, attention to large graphs, and general-purpose design position Graph2Seq as a strong contender in the space of neural network models for graph-structured data.