Language models emulate certain cognitive profiles: An investigation of how predictability measures interact with individual differences

Published 7 Jun 2024 in cs.CL | (2406.04988v2)

Abstract: To date, most investigations on surprisal and entropy effects in reading have been conducted on the group level, disregarding individual differences. In this work, we revisit the predictive power of surprisal and entropy measures estimated from a range of LMs on data of human reading times as a measure of processing effort by incorporating information of language users' cognitive capacities. To do so, we assess the predictive power of surprisal and entropy estimated from generative LMs on reading data obtained from individuals who also completed a wide range of psychometric tests. Specifically, we investigate if modulating surprisal and entropy relative to cognitive scores increases prediction accuracy of reading times, and we examine whether LMs exhibit systematic biases in the prediction of reading times for cognitively high- or low-performing groups, revealing what type of psycholinguistic subject a given LM emulates. Our study finds that in most cases, incorporating cognitive capacities increases predictive power of surprisal and entropy on reading times, and that generally, high performance in the psychometric tests is associated with lower sensitivity to predictability effects. Finally, our results suggest that the analyzed LMs emulate readers with lower verbal intelligence, suggesting that for a given target group (i.e., individuals with high verbal intelligence), these LMs provide less accurate predictability estimates.

Authors (3)

Summary

The paper introduces a novel integration of psychometric scores with LM predictability metrics to improve accuracy in reading time predictions.
It fits linear mixed models to the InDiCo corpus, using surprisal and entropy from several pretrained LMs, to capture individual differences in language processing.
Findings show that high-capacity individuals exhibit reduced surprisal effects and that LMs may preferentially emulate specific cognitive profiles.

LLMs and Their Interaction with Individual Cognitive Differences in Predictability Measures

This study examines how predictability metrics estimated from LMs, namely surprisal and entropy, relate to human reading behavior once individual cognitive differences are taken into account. Previous work on surprisal and entropy effects has mostly relied on group-level analyses, disregarding the variability among individual readers. This paper instead incorporates data on individual cognitive capacities, offering new insight into the predictive power of LMs as models of human language processing.

Methodology

The researchers integrate psychometric data (covering verbal and non-verbal working memory, cognitive control, intelligence, and reading fluency) into linear mixed models to assess how these variables interact with predictability measures derived from LMs. The study uses the Individual Differences Corpus (InDiCo), which supplies both reading times and psychometric data, and employs five pretrained LMs: GPT-2 base and large, Llama 2 7B and 13B, and Mixtral.
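As a minimal sketch of the two predictability measures, the snippet below computes surprisal and entropy from a next-word probability distribution. The toy distribution, the function names, and the choice of base-2 logarithms (bits) are assumptions made for illustration; in the study these quantities are estimated from the pretrained LMs, and this summary does not describe the extraction pipeline.

```python
import math


def surprisal(probs, target):
    """Surprisal (in bits) of `target`: -log2 of its probability in context."""
    return -math.log2(probs[target])


def entropy(probs):
    """Shannon entropy (in bits) of the whole next-word distribution."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)


# Hypothetical next-word distribution for some context (not from the paper).
next_word = {"dog": 0.5, "cat": 0.25, "bird": 0.25}

print(surprisal(next_word, "dog"))  # the likelier word is less surprising
print(surprisal(next_word, "cat"))
print(entropy(next_word))           # uncertainty before the word is seen
```

Surprisal depends on the word that actually occurs, whereas entropy summarizes how uncertain the model is about the upcoming word before it is observed.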

The study tests three hypotheses:

Incorporating cognitive capacities increases the predictive power of surprisal and entropy.
Individuals rely on predictive processing to different degrees depending on their cognitive capacities, with high-capacity individuals expected to show reduced surprisal and entropy effects.
LMs may systematically emulate certain cognitive profiles better than others, which affects their accuracy in predicting reading times.
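This summary does not give the exact model specification. One plausible form of a linear mixed model with a predictability-by-capacity interaction is sketched below. All symbols are introduced here for illustration: $\mathrm{RT}_{ij}$ is the reading time of participant $j$ on word $i$, $s_{ij}$ is the surprisal (or entropy) of that word, $c_j$ is the participant's psychometric score, and $u_j$ and $w_i$ are random intercepts for participant and word.

```latex
\mathrm{RT}_{ij} = \beta_0 + \beta_1 s_{ij} + \beta_2 c_j
                 + \beta_3 \, (s_{ij} \cdot c_j) + u_j + w_i + \varepsilon_{ij}
```

Under this reading, the first hypothesis asks whether including the $\beta_3$ term improves prediction of reading times, and the second predicts $\beta_3 < 0$, since the effective slope of surprisal for participant $j$ is $\beta_1 + \beta_3 c_j$.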

Key Findings

Predictive Power Enhancement: The interaction between cognitive scores and predictability measures significantly improves reading time predictions across nearly all psychometric tests and models used. Tests of reading fluency and working memory span yield the largest gains.
Reduced Surprisal Effects in High-Capacity Individuals: Negative coefficients on the interaction terms for surprisal across most psychometric measures indicate that individuals with higher cognitive scores exhibit smaller surprisal effects, suggesting that proficient readers may be less disrupted by unexpected words.
Cognitive Profile Emulation: The analysis indicates that LMs better emulate readers with lower verbal intelligence in terms of surprisal, and readers with higher working memory capacity in terms of entropy.
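To make the interpretation of a negative interaction coefficient concrete, the sketch below computes the effective surprisal slope for readers with different standardized cognitive scores. The coefficient values and names are hypothetical and chosen only for illustration; they are not estimates reported in the paper.

```python
def surprisal_slope(beta_surprisal, beta_interaction, score_z):
    """Effective effect of surprisal on reading time for a reader whose
    standardized cognitive score is `score_z`, in a model that includes a
    surprisal-by-score interaction term."""
    return beta_surprisal + beta_interaction * score_z


# Hypothetical coefficients (assumed for illustration, not from the paper).
BETA_SURPRISAL = 0.10     # main effect of surprisal
BETA_INTERACTION = -0.03  # negative surprisal-by-score interaction

low_scorer = surprisal_slope(BETA_SURPRISAL, BETA_INTERACTION, -1.0)
high_scorer = surprisal_slope(BETA_SURPRISAL, BETA_INTERACTION, 1.0)

print(f"slope at -1 SD: {low_scorer:.2f}")
print(f"slope at +1 SD: {high_scorer:.2f}")
```

With a negative interaction coefficient, the slope shrinks as the score rises, which corresponds to the smaller surprisal effects described for high-capacity readers.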

Implications and Future Directions

This paper underscores the importance of considering individual cognitive differences when evaluating LMs as models of human language processing. Accounting for these differences could refine how predictive models are developed, potentially leading to more personalized language tools that accommodate diverse reader profiles.

Further research is warranted to examine the cognitive mechanisms associated with predictability in language processing in more detail. Such efforts may entail larger samples or more fine-grained psychometric assessments. In practical terms, these findings could inform the design of targeted educational technologies and adaptive reading tools, making them more effective across varied learner populations. Additionally, addressing the systematic biases of LMs might bring their predictions closer to the reading behavior of specific target groups, improving the inclusivity and accuracy of language-based applications.
