Neural Language Models as Psycholinguistic Subjects: Representations of Syntactic State (1903.03260v1)

Published 8 Mar 2019 in cs.CL

Abstract: We deploy the methods of controlled psycholinguistic experimentation to shed light on the extent to which the behavior of neural network language models reflects incremental representations of syntactic state. To do so, we examine model behavior on artificial sentences containing a variety of syntactically complex structures. We test four models: two publicly available LSTM sequence models of English (Jozefowicz et al., 2016; Gulordava et al., 2018) trained on large datasets; an RNNG (Dyer et al., 2016) trained on a small, parsed dataset; and an LSTM trained on the same small corpus as the RNNG. We find evidence that the LSTMs trained on large datasets represent syntactic state over large spans of text in a way that is comparable to the RNNG, while the LSTM trained on the small dataset does not or does so only weakly.

Authors (6)
  1. Richard Futrell (29 papers)
  2. Ethan Wilcox (24 papers)
  3. Takashi Morita (12 papers)
  4. Peng Qian (39 papers)
  5. Miguel Ballesteros (70 papers)
  6. Roger Levy (43 papers)
Citations (181)

Summary

  • The paper demonstrates that neural language models, especially large LSTMs and the RNNG, reliably encode long-span syntactic states.
  • The study uses psycholinguistic experiments to reveal models’ sensitivity to NP/Z ambiguities and garden path effects through verb cue processing.
  • The findings suggest that extensive training data and explicit syntactic cues markedly improve models’ abilities to predict complex sentence structures.

Neural Language Models as Psycholinguistic Subjects: Representations of Syntactic State

The paper "Neural Language Models as Psycholinguistic Subjects: Representations of Syntactic State" investigates the extent to which neural language models internally represent syntactic state, a question traditionally studied in psycholinguistics. The authors adapt controlled psycholinguistic experimentation to interrogate these models, using word-by-word surprisal (how unexpected each word is given its left context) as the behavioral measure of the syntactic expectations a model maintains. They examine four models: two large LSTM language models trained on extensive datasets, a Recurrent Neural Network Grammar (RNNG) trained on a small parsed corpus, and an LSTM trained on the same small corpus, probing how each handles syntactically complex structures.
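
The evaluation logic can be made concrete with a short sketch. The snippet below is not the authors' code; it uses GPT-2 from the Hugging Face `transformers` library as a stand-in for the paper's LSTM language models, and simply computes per-token surprisal for a sentence, the quantity the experiments compare across conditions.

```python
# A minimal sketch (not the authors' code) of the surprisal-based evaluation
# described above. GPT-2 from the Hugging Face `transformers` library stands
# in for the paper's LSTM language models; the method is the same: score each
# word's surprisal given its left context.

import math
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def token_surprisals(sentence: str):
    """Return (token, surprisal in bits) for every token after the first."""
    input_ids = tokenizer(sentence, return_tensors="pt")["input_ids"]
    with torch.no_grad():
        logits = model(input_ids).logits            # (1, seq_len, vocab)
    # Log-probability of each actual next token given its prefix.
    log_probs = torch.log_softmax(logits[:, :-1, :], dim=-1)
    next_ids = input_ids[:, 1:]
    token_logps = log_probs.gather(-1, next_ids.unsqueeze(-1)).squeeze(-1)
    surprisals = (-token_logps / math.log(2.0)).squeeze(0)
    tokens = tokenizer.convert_ids_to_tokens(next_ids.squeeze(0).tolist())
    return list(zip(tokens, surprisals.tolist()))

# Example: word-by-word surprisal for a classic reduced-relative garden path.
for tok, s in token_surprisals("The horse raced past the barn fell."):
    print(f"{tok:>12s}  {s:6.2f} bits")
```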

Experimentation and Results

The paper targets the syntactic processing of subordinate clauses and garden-path effects, two pivotal aspects of syntactic complexity, to determine whether these language models build representations akin to those of symbolic grammar-based models. For subordinate clauses, the experiments show that the LSTMs trained on larger datasets (JRNN, Jozefowicz et al., 2016; GRNN, Gulordava et al., 2018) and the syntactically supervised RNNG maintain syntactic-state representations across long spans of text. By contrast, TinyLSTM, trained on the small corpus, reflects syntactic state only weakly, if at all.
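
To illustrate the subordinate-clause manipulation, the hypothetical sketch below reuses the token_surprisals() helper from the previous snippet: if a model tracks the expectation set up by a subordinator such as "As", the matrix clause that follows the comma should be less surprising than the same continuation without the subordinator. The sentence items here are made up for illustration, not the paper's actual stimuli.

```python
# Illustrative sketch of the subordinate-clause manipulation, reusing the
# hypothetical token_surprisals() helper defined earlier. Items are
# illustrative only, not the paper's materials.

def region_surprisal(sentence: str, region: str) -> float:
    """Sum surprisal (bits) over the tokens of a sentence-final region."""
    pairs = token_surprisals(sentence)
    n_region = len(tokenizer(" " + region)["input_ids"])
    return sum(s for _, s in pairs[-n_region:])

matrix = "the author left the room."
with_subordinator    = "As the editor reviewed the long manuscript, " + matrix
without_subordinator = "The editor reviewed the long manuscript, " + matrix

# If the model tracks the open subordinate clause, the matrix clause should be
# less surprising when the subordinator "As" licenses it.
effect = (region_surprisal(without_subordinator, matrix)
          - region_surprisal(with_subordinator, matrix))
print(f"Matrix-clause expectation effect: {effect:.2f} bits")
```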

Furthermore, experiments on NP/Z ambiguity and main-verb/reduced-relative (MV/RR) ambiguity show that the models are susceptible to garden-path effects, indicating that they form internal syntactic expectations. Notably, GRNN and JRNN are sensitive to verb transitivity in NP/Z ambiguities, while the RNNG and the larger LSTMs pick up on verb-morphology cues in MV/RR ambiguities, albeit imperfectly.
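
The garden-path comparison works the same way. The hypothetical sketch below contrasts an NP/Z-ambiguous sentence with its comma-disambiguated counterpart and measures surprisal at the disambiguating verb; a larger value in the ambiguous condition is the model analogue of a human garden-path effect. Again, the items are illustrative rather than drawn from the paper's stimuli.

```python
# Illustrative sketch of an NP/Z garden-path comparison, again reusing the
# hypothetical token_surprisals() helper. Items are illustrative only.

def word_surprisal(sentence: str, word: str) -> float:
    """Surprisal (bits) at `word`, assuming it maps to a single GPT-2 token."""
    for tok, s in token_surprisals(sentence):
        if tok.lstrip("Ġ") == word:
            return s
    raise ValueError(f"{word!r} was not found as a single token")

ambiguous     = "When the dog scratched the vet removed the muzzle."
disambiguated = "When the dog scratched, the vet removed the muzzle."

# Higher surprisal at the disambiguating verb in the ambiguous condition is
# the model analogue of a human garden-path effect.
gp_effect = (word_surprisal(ambiguous, "removed")
             - word_surprisal(disambiguated, "removed"))
print(f"Garden-path effect at 'removed': {gp_effect:.2f} bits")
```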

Implications and Future Directions

From a practical standpoint, the findings highlight the importance of large training sets and explicit syntactic representations for achieving robust syntactic generalization in neural models. Theoretically, they link data scale and syntactic supervision to the capture of nuanced syntactic phenomena. Future research could explore hybrid models that combine symbolic and neural techniques, potentially advancing the state of the art in language comprehension and processing.

The paper also invites further scrutiny of how syntactic state is maintained over longer spans of text and of the factors that degrade these representations in neural models. Future work might integrate syntactic cues more effectively or leverage insights from psycholinguistics to refine language models' internal mechanisms. Overall, the work underscores both the complexity of syntactic processing in neural models and the progress being made in understanding it through large datasets and controlled psycholinguistic methodology.