
Neural AMR: Sequence-to-Sequence Models for Parsing and Generation (1704.08381v3)

Published 26 Apr 2017 in cs.CL

Abstract: Sequence-to-sequence models have shown strong performance across a broad range of applications. However, their application to parsing and generating text using Abstract Meaning Representation (AMR) has been limited, due to the relatively limited amount of labeled data and the non-sequential nature of the AMR graphs. We present a novel training procedure that can lift this limitation using millions of unlabeled sentences and careful preprocessing of the AMR graphs. For AMR parsing, our model achieves competitive results of 62.1 SMATCH, the current best score reported without significant use of external semantic resources. For AMR generation, our model establishes a new state-of-the-art performance of BLEU 33.8. We present extensive ablative and qualitative analysis including strong evidence that sequence-based AMR models are robust against ordering variations of graph-to-sequence conversions.

Authors (5)
  1. Ioannis Konstas (40 papers)
  2. Srinivasan Iyer (20 papers)
  3. Mark Yatskar (38 papers)
  4. Yejin Choi (287 papers)
  5. Luke Zettlemoyer (225 papers)
Citations (295)

Summary

Sequence-to-Sequence Models for AMR Parsing and Generation

The paper "Neural AMR: Sequence-to-Sequence Models for Parsing and Generation" explores the application of sequence-to-sequence (seq2seq) models in Abstract Meaning Representation (AMR), a formalism to encode the semantics of natural language as directed graphs. The research addresses the challenges associated with parsing and generating text using AMR, chiefly the need for effective linearization and combating data sparsity due to limited annotated datasets.

Overview of Approach

The authors introduce a novel training procedure that capitalizes on a large-scale, unannotated corpus to enhance the performance of seq2seq models. Key steps involve bootstrapping a high-quality AMR parser using a self-training approach that processes millions of sentences from the Gigaword corpus, followed by pretraining an AMR generator on the resulting AMR graphs. This paired training paradigm is crucial in effectively exploiting the combination of unlabeled corpora and scarce annotated data.
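At a high level, the procedure can be read as a paired self-training loop: train a parser on the gold AMR corpus, parse millions of unlabeled Gigaword sentences with it, retrain on the automatic parses, and finally pretrain the generator on those automatic (graph, sentence) pairs before fine-tuning on gold. The Python sketch below abstracts the seq2seq machinery behind caller-supplied functions; the names `train_fn` and `fine_tune_fn`, the `.predict` interface, and the fixed number of rounds are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the paired self-training scheme, with the seq2seq models
# abstracted behind caller-supplied callables (illustrative assumptions, not
# the paper's code).

def bootstrap_parser_and_generator(gold_pairs, unlabeled_sents,
                                   train_fn, fine_tune_fn, rounds=2):
    """gold_pairs: list of (sentence, linearized_amr) gold annotations.
    unlabeled_sents: raw Gigaword-style sentences.
    train_fn(src, tgt) -> model exposing .predict(src_item).
    fine_tune_fn(model, src, tgt) -> model."""
    sents, amrs = map(list, zip(*gold_pairs))

    # 1. Train an initial parser (sentence -> linearized AMR) on gold data only.
    parser = train_fn(sents, amrs)

    auto_amrs = []
    for _ in range(rounds):
        # 2. Self-training: parse the unlabeled sentences with the current parser.
        auto_amrs = [parser.predict(s) for s in unlabeled_sents]
        # 3. Retrain on the automatic pairs, then fine-tune on the gold corpus.
        parser = train_fn(unlabeled_sents, auto_amrs)
        parser = fine_tune_fn(parser, sents, amrs)

    # 4. Pretrain the generator (AMR -> sentence) on the automatic graphs paired
    #    with their source sentences, then fine-tune it on the gold annotations.
    generator = train_fn(auto_amrs, unlabeled_sents)
    generator = fine_tune_fn(generator, amrs, sents)
    return parser, generator
```

The key property of this scheme is that both directions draw on the same pool of unlabeled text: the parser supplies silver-standard graphs, and the generator is pretrained on them before being fine-tuned on the small gold corpus.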

Strong Numerical Results

The seq2seq-based AMR parser achieves a SMATCH score of 62.1, while the generator attains a BLEU score of 33.8, illustrating competitive and state-of-the-art performances respectively. These results are significant given the limited reliance on external resources such as knowledge bases or dependency parsing tools, which are commonly utilized by other methods in this domain.
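For context on the parsing metric: SMATCH compares the predicted and gold graphs as sets of relation triples and reports the F1 of the best-matching overlap under an alignment of their variables. The sketch below is a simplified illustration that assumes the variable alignment is already fixed, so it shows only the precision/recall/F1 bookkeeping rather than the alignment search of the full metric; the example triples are hypothetical.

```python
# Simplified SMATCH-style scoring: F1 over the overlap of relation triples,
# assuming the variable alignment between the two graphs is already fixed.
# (The real metric searches over alignments, e.g. by hill-climbing.)

def triple_f1(predicted_triples, gold_triples):
    """Each triple is a (relation, source, target) tuple, e.g.
    ('instance', 'w', 'want-01') or ('ARG0', 'w', 'b')."""
    pred, gold = set(predicted_triples), set(gold_triples)
    matched = len(pred & gold)
    precision = matched / len(pred) if pred else 0.0
    recall = matched / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example: a parse of "The boy wants to go" that misses the ARG1 edge.
gold = [('instance', 'w', 'want-01'), ('instance', 'b', 'boy'),
        ('instance', 'g', 'go-01'), ('ARG0', 'w', 'b'),
        ('ARG1', 'w', 'g'), ('ARG0', 'g', 'b')]
pred = [t for t in gold if t != ('ARG1', 'w', 'g')]
print(round(triple_f1(pred, gold), 3))  # 0.909
```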

Parsing and Generation Methodology

The seq2seq model employs a stacked LSTM architecture with global attention, adapted to the AMR parsing and generation tasks. Noteworthy is the graph preprocessing strategy, which includes anonymization of named entities and transformation of AMR graphs into linear token sequences via a depth-first traversal. This preprocessing reduces sparsity in the data and improves training efficiency.
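A minimal sketch of this preprocessing follows. The toy graph encoding (a dict from variables to concept/edge lists), the handling of re-entrancies by repeating the variable, and placeholder names such as `name_0` are simplifications for illustration; the paper anonymizes whole named-entity subgraphs and additionally marks scopes, which is not reproduced here.

```python
# Sketch of the two preprocessing steps described above: depth-first
# linearization of an AMR graph into a token sequence, followed by a
# simplified anonymization pass over the resulting tokens.

def linearize(graph, root, visited=None):
    """Depth-first traversal emitting a bracketed token sequence; a re-entrant
    node is emitted as its bare variable the second time it is reached."""
    visited = set() if visited is None else visited
    if root in visited:
        return [root]                              # re-entrancy: reuse variable
    visited.add(root)
    concept, edges = graph[root]
    tokens = ['(', root, '/', concept]
    for relation, child in edges:
        if child not in graph:                     # constant leaf, e.g. a string
            tokens += [relation, child]
        else:
            tokens += [relation] + linearize(graph, child, visited)
    tokens.append(')')
    return tokens


def anonymize(tokens):
    """Replace quoted name literals with indexed placeholders, keeping a map
    so the surface forms can be restored after decoding."""
    mapping, out = {}, []
    for tok in tokens:
        if tok.startswith('"') and tok.endswith('"'):
            placeholder = f'name_{len(mapping)}'
            mapping[placeholder] = tok.strip('"')
            tok = placeholder
        out.append(tok)
    return out, mapping


# Toy AMR for "Obama wants to go"; node 'p' is re-entered as the ARG0 of go-01.
amr = {
    'w': ('want-01', [(':ARG0', 'p'), (':ARG1', 'g')]),
    'p': ('person', [(':name', 'n')]),
    'n': ('name', [(':op1', '"Obama"')]),
    'g': ('go-01', [(':ARG0', 'p')]),
}
tokens, names = anonymize(linearize(amr, 'w'))
print(' '.join(tokens))
# ( w / want-01 :ARG0 ( p / person :name ( n / name :op1 name_0 ) )
#   :ARG1 ( g / go-01 :ARG0 p ) )
print(names)  # {'name_0': 'Obama'}
```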

Comparative Analysis and Ablation Studies

The research includes extensive analysis comparing different linearization and anonymization techniques, revealing that sequence-based AMR models are inherently robust to variations in graph-to-sequence conversion. An ablation study underscores the importance of preprocessing steps, such as scope marking and anonymization, in achieving high performance.

Implications and Future Directions

Practically, the contributions of this paper hold significant potential for applications in machine translation, summarization, and more. Theoretically, it underscores the viability of seq2seq frameworks for graph-based semantic tasks, traditionally dominated by graph-specific algorithms. Future research can expand on this methodology to other semantic frameworks, potentially even extending cross-lingual capabilities and advancing semantics-based machine translation.

In conclusion, the paper presents a rigorous and efficient approach for handling AMR parsing and generation using seq2seq models. The methods and results signify a valuable stride forward in NLP, opening pathways for more generalized applications across different semantic formalisms.