Probabilistic Predictions of People Perusing: Evaluating Metrics of Language Model Performance for Psycholinguistic Modeling (2009.03954v1)

Published 8 Sep 2020 in cs.CL and cs.NE

Abstract: By positing a relationship between naturalistic reading times and information-theoretic surprisal, surprisal theory (Hale, 2001; Levy, 2008) provides a natural interface between language models and psycholinguistic models. This paper re-evaluates a claim due to Goodkind and Bicknell (2018) that a language model's ability to model reading times is a linear function of its perplexity. By extending Goodkind and Bicknell's analysis to modern neural architectures, we show that the proposed relation does not always hold for Long Short-Term Memory networks, Transformers, and pre-trained models. We introduce an alternate measure of language modeling performance called predictability norm correlation based on Cloze probabilities measured from human subjects. Our new metric yields a more robust relationship between language model quality and psycholinguistic modeling performance that allows for comparison between models with different training configurations.
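
The quantities named in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions of surprisal (negative log probability) and perplexity (exponentiated mean surprisal). The function `predictability_norm_correlation` is a hypothetical reading of the proposed metric as a Pearson correlation between model surprisals and surprisals derived from Cloze probabilities; the abstract does not give the exact formula, so that part is an assumption, and all function names and numbers below are illustrative only.

```python
import math

def surprisal(prob):
    """Surprisal of a word in bits, given its probability in context."""
    return -math.log2(prob)

def perplexity(probs):
    """Perplexity as 2 raised to the mean per-word surprisal (in bits)."""
    mean_surprisal = sum(surprisal(p) for p in probs) / len(probs)
    return 2 ** mean_surprisal

def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

def predictability_norm_correlation(model_probs, cloze_probs):
    """ASSUMPTION: correlate model surprisal with Cloze-derived surprisal.

    This is a hypothetical reading of the metric, not the paper's definition.
    """
    model_s = [surprisal(p) for p in model_probs]
    cloze_s = [surprisal(p) for p in cloze_probs]
    return pearson(model_s, cloze_s)

# Made-up probabilities for four words, for illustration only.
model_probs = [0.5, 0.25, 0.125, 0.5]
cloze_probs = [0.6, 0.3, 0.1, 0.4]
print(perplexity(model_probs))
print(predictability_norm_correlation(model_probs, cloze_probs))
```

Under this reading, perplexity scores a model against a text corpus, while the correlation scores it against human predictability judgments, which is why the two can rank models differently.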

Authors (5)

Yiding Hao (10 papers)
Simon Mendelsohn (3 papers)
Rachel Sterneck (6 papers)
Randi Martinez (1 paper)
Robert Frank (23 papers)

Citations (41)
