An Empirical Investigation into the Utility of Supervised Syntactic Parsing for Language Understanding Tasks
The paper "Is Supervised Syntactic Parsing Beneficial for Language Understanding Tasks? An Empirical Investigation" conducts a rigorous examination of the prevailing assumption in the field of NLP that supervised syntactic parsing is integral to semantic language understanding (LU). The authors scrutinize this assumption by measuring the impact of explicit syntactic knowledge on pretrained transformer networks' performance in various LU tasks.
Background and Motivation
Historically, NLP systems have leveraged supervised syntactic parsing as a critical component of language understanding. Parsing provides a structural analysis of sentences, which was long believed to be a prerequisite for semantic understanding. However, the advent of large-scale neural models, particularly transformer architectures pretrained with language modeling (LM) objectives, challenges this belief. Such models, including BERT, RoBERTa, and XLM-R, achieve impressive results across a multitude of LU tasks without any exposure to explicit syntactic structures.
Methodology
The authors employ a comprehensive experimental setup based on intermediate parsing training (IPT): a pretrained transformer, equipped with a biaffine parsing head, is fine-tuned on Universal Dependencies (UD) treebanks so that explicit syntactic knowledge is injected into its parameters. The syntactically informed transformers are subsequently fine-tuned on downstream LU tasks.
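The IPT stage relies on a biaffine dependency-parsing head in the style of Dozat and Manning (2017), as named in the paper. Below is a minimal PyTorch sketch of such an arc-scoring head; the dimensions, initialization, and the omission of relation-label scoring are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn

class BiaffineArcHead(nn.Module):
    """Minimal biaffine arc-scoring head (a sketch, assuming typical hyperparameters)."""

    def __init__(self, enc_dim: int = 768, arc_dim: int = 512):
        super().__init__()
        # Separate projections for tokens acting as dependents vs. heads.
        self.dep_mlp = nn.Sequential(nn.Linear(enc_dim, arc_dim), nn.ReLU())
        self.head_mlp = nn.Sequential(nn.Linear(enc_dim, arc_dim), nn.ReLU())
        # Biaffine weight scoring every (dependent, head) token pair; +1 for a bias feature.
        self.U = nn.Parameter(torch.empty(arc_dim + 1, arc_dim))
        nn.init.xavier_uniform_(self.U)

    def forward(self, enc_states: torch.Tensor) -> torch.Tensor:
        # enc_states: (batch, seq_len, enc_dim) contextual states from the pretrained transformer.
        dep = self.dep_mlp(enc_states)                      # (B, T, arc_dim)
        head = self.head_mlp(enc_states)                    # (B, T, arc_dim)
        ones = dep.new_ones(dep.shape[:-1] + (1,))
        dep = torch.cat([dep, ones], dim=-1)                # (B, T, arc_dim + 1)
        # arc_scores[b, i, j] = score of token j being the syntactic head of token i.
        arc_scores = dep @ self.U @ head.transpose(1, 2)    # (B, T, T)
        return arc_scores
```

During IPT, scores like these are trained with a cross-entropy loss against gold UD head indices, with gradients flowing back into the transformer encoder so that the encoder itself absorbs the syntactic signal.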
The research encompasses both monolingual and zero-shot language transfer experiments. Monolingual experiments utilize English-specific transformers and treebanks, while zero-shot transfer experiments involve multilingual models, incorporating additional parsing training in target languages where no task-specific training data is available.
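To make the order of training stages concrete, the following schematic sketches the zero-shot transfer protocol. The train_parsing, train_task, and evaluate callables are hypothetical placeholders; the paper does not expose its training code at this level of detail.

```python
from typing import Callable, Optional

def zero_shot_setting(model,
                      en_treebank, tgt_treebank: Optional[object],
                      en_task_train, tgt_task_test,
                      train_parsing: Callable, train_task: Callable, evaluate: Callable):
    """Order of operations in the zero-shot transfer setting (schematic sketch)."""
    model = train_parsing(model, en_treebank)       # IPT on the English UD treebank
    if tgt_treebank is not None:
        model = train_parsing(model, tgt_treebank)  # optional additional IPT in the target language
    model = train_task(model, en_task_train)        # downstream fine-tuning on English data only
    return evaluate(model, tgt_task_test)           # zero-shot evaluation on target-language test data
```

The monolingual setting follows the same pipeline with English data throughout and no target-language parsing step.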
Findings
The results reveal that supervised syntactic parsing has, at best, a limited effect on downstream LU performance after IPT. The authors report the following observations:
- Monolingual transformers, after exposure to IPT, display minimal improvements in LU performance compared to their baseline counterparts.
- Zero-shot language transfer experiments also show inconsistent and negligible gains, even after additional parsing training.
- Interestingly, some zero-shot transfer tasks do see minor performance gains after IPT. However, these gains are largely attributable to additional exposure to target-language data during parsing training rather than to the acquired syntactic knowledge itself.
Analysis and Implications
Through an examination of changes in representation-space topology using linear centered kernel alignment (l-CKA), the authors show that explicit syntactic knowledge does alter the transformers' representation spaces. However, the type of syntactic information obtained through UD parsing does not correspond closely to the structural knowledge that benefits semantic LU tasks.
This observation casts doubt on the efficacy of supervised syntactic parsing for enhancing high-level language understanding. The authors posit that the syntactic signal supplied by parsing is largely redundant with the structural information that large pretrained transformers already capture implicitly, which calls into question the necessity of parsing for modern semantic LU applications.
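For reference, linear CKA between two sets of representations can be computed as below. This is a minimal sketch following the standard formulation of Kornblith et al. (2019); the use of NumPy, sentence-level representations, and the encode helper in the usage comment are illustrative assumptions, not the paper's exact analysis code.

```python
import numpy as np

def linear_cka(X: np.ndarray, Y: np.ndarray) -> float:
    """Linear centered kernel alignment between two representation matrices.

    X, Y: arrays of shape (n_examples, dim_x) and (n_examples, dim_y),
    e.g. representations of the same sentences from two transformer variants.
    Returns a value in [0, 1]; higher means more similar representation spaces.
    """
    # Center each feature (column) across examples.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    # Similarity of the two implicit linear kernels, normalized by their self-similarities.
    numerator = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    denominator = np.linalg.norm(X.T @ X, ord="fro") * np.linalg.norm(Y.T @ Y, ord="fro")
    return float(numerator / denominator)

# Hypothetical usage: compare representations before and after IPT.
# reps_base = encode(sentences, base_model)
# reps_ipt = encode(sentences, ipt_model)
# print(linear_cka(reps_base, reps_ipt))
```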
Future Perspectives
This empirical investigation lays the groundwork for further discourse on the integration of formal syntactic knowledge into large neural models and invites further inquiry into whether explicit syntactic structure is necessary for semantic LU. The paper encourages a reevaluation of the inductive biases built into LU systems, particularly in scenarios with abundant language data.
In conclusion, while supervised syntactic parsing may not improve LU performance given today's capable transformer models, it remains valuable within computational linguistics and in low-resource language settings. Future research should aim to determine how much formal syntax can contribute to LU performance and to explore alternative approaches for infusing syntactic linguistic information into neural architectures.