Recurrent Neural Network Grammars (1602.07776v4)

Published 25 Feb 2016 in cs.CL and cs.NE

Abstract: We introduce recurrent neural network grammars, probabilistic models of sentences with explicit phrase structure. We explain efficient inference procedures that allow application to both parsing and language modeling. Experiments show that they provide better parsing in English than any single previously published supervised generative model and better language modeling than state-of-the-art sequential RNNs in English and Chinese.

Citations (521)

Summary

  • The paper presents a novel model that integrates RNNs with explicit phrase-structure grammar to directly generate parse trees.
  • It employs a top-down, transition-based parsing algorithm that efficiently captures hierarchical syntactic structures.
  • The RNNG achieves state-of-the-art parsing and competitive language modeling performance on both English and Chinese datasets.

Recurrent Neural Network Grammars: An Overview

The paper introduces Recurrent Neural Network Grammars (RNNGs), a novel probabilistic model designed to efficiently generate the hierarchical syntactic structure of sentences. The RNNG framework addresses a limitation inherent in traditional sequential models, which treat sentences as flat strings and overlook the nested structural relationships central to natural language processing.

Key Features of RNNGs

RNNGs integrate the strengths of Recurrent Neural Networks (RNNs) with explicit phrase-structure modeling. Unlike conventional RNN-based language models, which process sentences as flat sequences, RNNGs employ a top-down generation strategy that closely resembles the recursive derivations of probabilistic context-free grammars. RNNGs use a transition-based algorithm, maintaining operational efficiency while enriching the structural representation.

The grammatical foundation is a formal definition of an RNNG as a triple consisting of a set of nonterminal symbols, a set of terminal symbols, and a collection of neural network parameters. This definition dispenses with an explicit rule set: the rules are characterized implicitly through the network's parameters, which score each step of the derivation.
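Stated in symbols (a compact restatement of the paper's definition, with N for the nonterminals, Σ for the terminals, and Θ for the network parameters):

```latex
% An RNNG is a triple: nonterminals, terminals, and neural-network parameters.
% No rule set is enumerated; each derivation step is scored by the network,
% so the production rules are implicit in \Theta.
(N, \Sigma, \Theta), \qquad N \cap \Sigma = \emptyset .
```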

Model Architecture

The model is built around a stack-based algorithm with three principal transitions: opening a nonterminal, generating (or, in the discriminative variant, shifting) a terminal, and reducing a completed constituent. These transitions allow the model to generate parse trees directly while encoding syntactic decisions using RNNs. The stack, the buffer of terminal symbols, and the action history are each encoded with recurrent networks (the stack with a stack LSTM), allowing the RNNG to condition each decision on the entire derivation so far. A minimal sketch of the transition system follows.
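The sketch below (Python, non-neural, purely illustrative: the data structures and helper names are this summary's own, and only the action names NT, GEN, and REDUCE come from the paper) shows how the three transitions assemble a phrase-structure tree on a stack:

```python
# Illustrative sketch of the RNNG generative transition system (no neural scoring):
# NT(X) opens a constituent, GEN(x) emits a word, REDUCE closes the most recent
# open constituent. Representation and example derivation are for illustration only.

def run_transitions(actions):
    """Apply a sequence of (action, arg) pairs and return the finished tree."""
    stack = []
    for action, arg in actions:
        if action == "NT":            # open a new constituent, e.g. NT(NP)
            stack.append(("OPEN", arg, []))
        elif action == "GEN":         # generate a terminal word
            stack.append(("WORD", arg))
        elif action == "REDUCE":      # close the most recent open constituent
            children = []
            while stack and stack[-1][0] != "OPEN":
                children.append(stack.pop())
            _, label, _ = stack.pop()
            stack.append(("TREE", label, list(reversed(children))))
    assert len(stack) == 1, "derivation must end with a single complete tree"
    return stack[0]

def to_brackets(node):
    """Render the stack representation as a bracketed parse string."""
    if node[0] == "WORD":
        return node[1]
    _, label, children = node
    return "(" + label + " " + " ".join(to_brackets(c) for c in children) + ")"

# Example derivation for "The hungry cat meows ."
actions = [
    ("NT", "S"), ("NT", "NP"), ("GEN", "The"), ("GEN", "hungry"),
    ("GEN", "cat"), ("REDUCE", None), ("NT", "VP"), ("GEN", "meows"),
    ("REDUCE", None), ("GEN", "."), ("REDUCE", None),
]
print(to_brackets(run_transitions(actions)))
# (S (NP The hungry cat) (VP meows) .)
```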

An integral part of the RNNG mechanism is the syntactic composition function, based on bidirectional LSTMs. When a constituent is closed, this function computes a single embedding for the new constituent from the embeddings of its label and children, so that information from completed subtrees propagates into subsequent decisions, improving both parsing accuracy and language modeling performance.
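As a rough illustration (a PyTorch sketch, not the authors' implementation; the dimensions, the tanh projection, and the exact treatment of the label embedding are assumptions), the composition step can look like this:

```python
# Hedged sketch of the REDUCE-time composition function: a bidirectional LSTM
# reads the label embedding followed by the child embeddings, and its two final
# hidden states are combined into one vector for the new constituent.
import torch
import torch.nn as nn

class Composition(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.bilstm = nn.LSTM(dim, dim, bidirectional=True, batch_first=True)
        self.proj = nn.Linear(2 * dim, dim)

    def forward(self, label_emb, child_embs):
        # Sequence: open-nonterminal label, then the children left to right.
        seq = torch.stack([label_emb] + child_embs).unsqueeze(0)  # (1, n+1, dim)
        _, (h_n, _) = self.bilstm(seq)        # h_n: (2, 1, dim), one state per direction
        fwd, bwd = h_n[0, 0], h_n[1, 0]
        return torch.tanh(self.proj(torch.cat([fwd, bwd])))       # (dim,)

# Usage: compose an NP vector from three word vectors.
dim = 128
comp = Composition(dim)
np_vec = comp(torch.randn(dim), [torch.randn(dim) for _ in range(3)])
print(np_vec.shape)  # torch.Size([128])
```

In the model, the resulting constituent vector replaces the open nonterminal and its children on the stack, so later decisions see a single summary of the completed subtree.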

Experimental Results

The paper presents strong results for RNNGs in both parsing and language modeling on English and Chinese datasets:

  • Parsing Performance: On the Penn Treebank dataset, the generative RNNG model achieves superior F-scores compared to prior single supervised generative models. A corrected implementation further enhances its parsing accuracy, setting a new benchmark in phrase-structure parsing. Similarly, in the Chinese Treebank, RNNG surpasses previous state-of-the-art results, highlighting its linguistic adaptability across languages.
  • Language Modeling: RNNGs achieve perplexities that outperform strong sequential LSTM baselines, reinforcing the benefits of incorporating hierarchical syntax within neural language models.

Implications and Future Directions

The paper suggests that RNNGs' ability to jointly model tree structures and word sequences can advance both theoretical and practical AI developments. Leveraging top-down grammatical structure in broader applications, such as machine translation, and improving the efficiency of syntactic processing are particularly promising directions.

For inference, the paper introduces an importance sampling scheme that approximates otherwise computationally expensive marginal probabilities over trees, using the discriminatively trained variant of the model as a proposal distribution. This shows how a discriminative parser can assist a generative model in estimating sentence likelihoods.
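Written out, this is the standard importance-sampling identity, with q(y | x) the discriminative proposal over trees and M the number of sampled trees (restated here for clarity):

```latex
% Importance-sampling estimate of the marginal sentence probability p(x),
% with the discriminative parser q(y | x) as the proposal distribution.
p(x) \;=\; \sum_{y} p(x, y)
      \;=\; \mathbb{E}_{y \sim q(\cdot \mid x)}\!\left[\frac{p(x, y)}{q(y \mid x)}\right]
      \;\approx\; \frac{1}{M} \sum_{i=1}^{M} \frac{p(x, y^{(i)})}{q\!\left(y^{(i)} \mid x\right)},
\qquad y^{(i)} \sim q(\cdot \mid x).
```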

Looking forward, RNNGs open avenues for exploration in unsupervised learning, syntax-informed sentence processing models, and extensions to other structured prediction tasks. By facilitating nuanced understanding and generation of language, RNNGs significantly contribute to the evolution of syntactically aware neural network models.