Transition-Based Dependency Parsing with Stack Long Short-Term Memory (1505.08075v1)

Published 29 May 2015 in cs.CL, cs.LG, and cs.NE

Abstract: We propose a technique for learning representations of parser states in transition-based dependency parsers. Our primary innovation is a new control structure for sequence-to-sequence neural networks---the stack LSTM. Like the conventional stack data structures used in transition-based parsing, elements can be pushed to or popped from the top of the stack in constant time, but, in addition, an LSTM maintains a continuous space embedding of the stack contents. This lets us formulate an efficient parsing model that captures three facets of a parser's state: (i) unbounded look-ahead into the buffer of incoming words, (ii) the complete history of actions taken by the parser, and (iii) the complete contents of the stack of partially built tree fragments, including their internal structures. Standard backpropagation techniques are used for training and yield state-of-the-art parsing performance.

Citations (798)

Summary

  • The paper introduces stack LSTMs that integrate unbounded look-ahead, complete parser history, and partial tree fragments for efficient state updates.
  • It combines representations of the input buffer, stack, and action history to predict parser actions while maintaining linear time complexity.
  • Empirical results show significant improvements with a 93.1% UAS on English and 87.2% UAS on Chinese, outperforming previous neural network approaches.

Transition-Based Dependency Parsing with Stack Long Short-Term Memory

In their paper titled "Transition-Based Dependency Parsing with Stack Long Short-Term Memory," Chris Dyer, Miguel Ballesteros, Wang Ling, Austin Matthews, and Noah A. Smith introduce a novel approach to learning representations of parser states in transition-based dependency parsers. The technique rests on stack Long Short-Term Memory (LSTM) networks, a new control structure for sequence-to-sequence neural networks that maintains a continuous-space embedding of the stack contents while supporting constant-time push and pop updates.

Key Contributions

The core contribution of this paper is the introduction of stack LSTMs, which extend conventional LSTMs with stack operations. This allows the model to maintain and process the following three facets of a parser’s state (a minimal code sketch follows the list):

  1. Unbounded Look-Ahead: The buffer of incoming words is encoded in full, so the parser can condition on the entire remaining input.
  2. Complete Parser History: Every action taken so far is retained, allowing decisions to be informed by the full derivation history.
  3. Stack of Partially Built Tree Fragments: The stack of partial syntactic structures, including embeddings of their internal structure, is summarized and updated efficiently.
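
To make the push and pop mechanics concrete, here is a minimal sketch of a stack LSTM in PyTorch. It is a hypothetical reconstruction, not the authors' released code: the class and method names (StackLSTM, push, pop, embedding) are illustrative, and it omits the composition function the full parser applies to subtrees before pushing them.

```python
import torch
import torch.nn as nn

class StackLSTM(nn.Module):
    """Minimal stack LSTM sketch (illustrative, not the authors' code).

    An LSTMCell is augmented with an explicit stack of (h, c) states.
    push() advances the LSTM from the state currently on top;
    pop() discards the top state in constant time, so the summary of the
    remaining stack is immediately available again.
    """

    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        self.cell = nn.LSTMCell(input_dim, hidden_dim)
        # index 0 holds the "empty stack" state (zeros here for simplicity)
        self.states = [(torch.zeros(1, hidden_dim), torch.zeros(1, hidden_dim))]

    def push(self, x):
        # x: (1, input_dim) embedding of the element being pushed
        h, c = self.cell(x, self.states[-1])
        self.states.append((h, c))

    def pop(self):
        # O(1): drop the top (h, c) pair; the previous summary is now on top
        return self.states.pop()

    def embedding(self):
        # continuous-space summary of the current stack contents
        return self.states[-1][0]
```

In the parser, one such structure summarizes the buffer (words are popped from it as they are shifted), one the stack of partially built trees, and one the action history, which only ever grows.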

Parsing Model

The authors extend the line of work applying neural networks to transition-based dependency parsing. Whereas prior efforts relied on narrow, local views of the parser state, their model integrates comprehensive state information: three stack LSTMs encode the input buffer, the stack of partial trees, and the action history, and their summaries are combined into a single representation that drives parsing decisions.

The parsing process begins by initializing the input buffer, stack, and action history to predefined states. At each step, the parser combines the three summaries into a composite state representation, predicts the next transition, and updates the affected structures, keeping both parsing and training time linear in sentence length (a sketch of this combination step follows).
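
Concretely, the composite representation is formed by concatenating the three summaries, applying an affine transform followed by a rectified nonlinearity, and scoring the transitions that are legal in the current configuration with a softmax. The PyTorch module below is a rough, hypothetical rendering of that step; the names (ParserStateSummary, valid_mask) and dimensions are illustrative rather than taken from the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ParserStateSummary(nn.Module):
    """Hypothetical sketch of the state-combination and action-prediction step."""

    def __init__(self, summary_dim, hidden_dim, num_actions):
        super().__init__()
        # concatenation of the stack, buffer, and action-history summaries
        self.combine = nn.Linear(3 * summary_dim, hidden_dim)
        self.action_scores = nn.Linear(hidden_dim, num_actions)

    def forward(self, s_t, b_t, a_t, valid_mask):
        # composite parser state: rectified affine transform of [s_t; b_t; a_t]
        p_t = F.relu(self.combine(torch.cat([s_t, b_t, a_t], dim=-1)))
        scores = self.action_scores(p_t)
        # only transitions that are legal in the current configuration compete
        scores = scores.masked_fill(~valid_mask, float("-inf"))
        return F.softmax(scores, dim=-1)
```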

Empirical Results

The paper presents state-of-the-art results on Chinese and English dependency parsing tasks, demonstrating the effectiveness of stack LSTMs in capturing the complexities of parsing states. Specifically:

  • English: Achieved an Unlabeled Attachment Score (UAS) of 93.1% and Labeled Attachment Score (LAS) of 90.9% on the test set.
  • Chinese: Achieved a UAS of 87.2% and LAS of 85.7% on the test set.

These outcomes highlight the model’s superiority over prior neural network approaches for transition-based parsing, with substantial performance gains attributed to the global, comprehensive view of the parser state enabled by stack LSTMs.

Implications and Future Directions

The integration of stack LSTMs into dependency parsing introduces a robust mechanism for dealing with the dependencies inherent in sequential data. The architecture's capacity to encapsulate extensive state information suggests broader applications in hierarchical and structured data processing, and its constant-time handling of dynamic stack updates points to potential uses in other sequential decision-making problems, ranging from program synthesis to interactive AI systems.

Future work could explore unsupervised parsing by leveraging the stack LSTM's state representation capabilities. Additionally, investigating the application of stack LSTM architectures in multi-dimensional sequence modeling and other NLP tasks could lead to novel approaches for complex language understanding problems. The promising results reported here set a strong foundation for ongoing research into more adaptive and context-aware neural network models.