Modelling Language (2404.09579v1)

Published 15 Apr 2024 in cs.CL and cs.AI

Abstract: This paper argues that LLMs have a valuable scientific role to play in serving as scientific models of a language. Linguistic study should not only be concerned with the cognitive processes behind linguistic competence, but also with language understood as an external, social entity. Once this is recognized, the value of LLMs as scientific models becomes clear. This paper defends this position against a number of arguments to the effect that LLMs provide no linguistic insight. It also draws upon recent work in philosophy of science to show how LLMs could serve as scientific models.

Citations (4)

Summary

  • The paper presents a novel argument that LLMs effectively model language as a public E-language, broadening traditional linguistic inquiry.
  • It demonstrates that rigorous evaluation and XAI methods uncover key syntactic and semantic features correlating with human judgments.
  • Grindrod rebuts criticisms by showing that iterative LLM training and evaluation yield models that generalize beyond their training corpora, paving the way for comprehensive social linguistic studies.

The Role of LLMs in Linguistic Inquiry

The paper "Modelling Language" by Jumbly Grindrod, published by the University of Reading, presents a comprehensive argument that LLMs can play a scientific role in the understanding of language as an external, social entity, rather than purely as a cognitive phenomenon. Grindrod's chief assertion is that linguistic paper should encompass language as a public phenomenon—an E-language, as termed by Chomsky (1986).

Insights from Prior Arguments

Grindrod begins by addressing existing viewpoints on whether LLMs can contribute to linguistic inquiry. Computational linguists like Baroni (2022) and Piantadosi (2023) argue affirmatively, suggesting that LLMs, through their predictive and explanatory capabilities, can offer valuable insights. Conversely, critics such as Chomsky, Roberts, and Watumull (2023) and Dupre (2021) argue that the linguistic insights to be drawn from LLMs are inherently limited by their design and training paradigms, which differ significantly from human linguistic processes.

LLMs as Models of E-Languages

Grindrod posits that while LLMs may not serve as direct theories of linguistic competence, they hold potential as models of E-languages. This conceptual shift detaches LLMs from the expectation of mirroring cognitive linguistic processes and repositions them as practical instruments for studying language as used in communal settings.

Empirical Evidence Supporting LLMs

Grindrod reinforces this proposition by citing multiple studies indicating that LLMs successfully track linguistic features:

  • Manning et al. (2020) demonstrated that syntactic dependency relations could be extracted from LLMs.
  • Nair et al. (2020) found that LLMs' contextualized embeddings were predictive of human semantic relatedness judgments.
  • Shain et al. (2024) indicated that LLM-derived word probabilities matched human surprisal judgments (a sketch of this surprisal computation follows the list).
  • Schrimpf et al. (2020) showed internal LLM activations correlated with neural and behavioral responses to text, capturing significant variance in neural data.
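
To make the surprisal result concrete, here is a minimal sketch of how per-token surprisal can be read off a causal LLM's next-token probabilities. It assumes the HuggingFace transformers library and uses gpt2 purely as a stand-in; the models and human data used by Shain et al. are not reproduced here.

```python
# Minimal sketch: per-token surprisal from a causal language model.
# Assumes the HuggingFace `transformers` library; "gpt2" is a stand-in,
# not the model used in the cited study.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def token_surprisals(text: str):
    """Return (token, surprisal in bits) for every token after the first."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits            # (1, seq_len, vocab_size)
    # Log-probability assigned to each actual token, given its preceding context.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    next_ids = ids[0, 1:]
    chosen = log_probs[torch.arange(next_ids.size(0)), next_ids]
    surprisal_bits = -chosen / torch.log(torch.tensor(2.0))  # nats -> bits
    tokens = tokenizer.convert_ids_to_tokens(next_ids.tolist())
    return list(zip(tokens, surprisal_bits.tolist()))

for tok, s in token_surprisals("The cat sat on the mat."):
    print(f"{tok:>10}  {s:6.2f} bits")
```

In the psycholinguistic literature, values like these are typically compared with human reading-time data, where higher-surprisal words tend to take longer to read.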

Philosophical Underpinnings: Language as a Public Entity

A substantial portion of the paper argues against Chomsky's exclusive focus on I-language (internal language), advocating instead for the recognition of E-language (external language) as a legitimate object of study. This advocacy is grounded in a pluralist ontology that allows for multiple legitimate objects of linguistic study, encompassing mental, social, and abstract language objects.

LLMs and Conventional Linguistic Practices

Grindrod asserts that training LLMs on vast corpora of linguistic data allows them to embody linguistic conventions, positioning them as effective models for studying E-languages. This perspective aligns with Lewis's (1983) view of language as a set of social conventions and suggests that LLMs provide an empirical basis for tracking these linguistic regularities.

Evaluation and XAI Techniques

Grindrod dissects the training, fine-tuning, and evaluation processes of LLMs, emphasizing that extensive evaluation tasks (such as those in the GLUE benchmark) ensure that LLMs generalize beyond their training data, capturing substantive linguistic features. Recent advancements in explainable AI (XAI) techniques further bolster this argument, illustrating how specific features and behaviors of LLMs can be attributed to their internal configurations.
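
As an illustration of the probing-style analyses behind such claims, the sketch below trains a simple diagnostic classifier to recover a linguistic label from one layer's hidden states; if even a linear probe succeeds, the feature is at least linearly decodable from the model's internal representations. The model name, layer index, and tiny hand-labelled examples are illustrative assumptions, not details taken from the paper or the GLUE benchmark.

```python
# Minimal sketch of a probing (diagnostic) classifier over LLM hidden states.
# Assumes HuggingFace `transformers` and scikit-learn; "gpt2", the layer index,
# and the toy labels are placeholders, not the setups of the studies cited above.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True)
model.eval()

LAYER = 6  # arbitrary illustrative choice of layer to probe

def token_vectors(sentence: str):
    """Return one hidden-state vector per token from the chosen layer."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).hidden_states[LAYER][0]   # (seq_len, dim)
    return hidden

# Tiny hand-labelled data: 1 = the word is a verb, 0 = it is not.
examples = [("The dog likes the ball", [0, 0, 1, 0, 0]),
            ("She reads old books",    [0, 1, 0, 0])]

X, y = [], []
for sentence, labels in examples:
    vectors = token_vectors(sentence)
    # The toy sentences are chosen so each word maps to exactly one token.
    assert len(vectors) == len(labels), "example assumes one token per word"
    for vec, label in zip(vectors, labels):
        X.append(vec.numpy())
        y.append(label)

probe = LogisticRegression(max_iter=1000).fit(X, y)
print("probe accuracy on its own toy data:", probe.score(X, y))
```

A real probing study would of course use held-out data and proper annotation; the point here is only the mechanics of attributing a linguistic feature to a model's internal states.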

Responding to Criticisms

A key objection highlighted is the notion that LLMs are merely models of their training corpora. Grindrod counters this by emphasizing the iterative process of LLM training and evaluation, which extends beyond mere pattern replication from training data to capturing general linguistic competencies.

Conclusion and Future Directions

Grindrod's paper ultimately makes a compelling case for reconceptualizing the role of LLMs in linguistic studies. By treating LLMs as models of E-languages, the paper opens new avenues for linguistic inquiry that mesh computational methodologies with traditional linguistic theory. This paradigm shift not only broadens the scope of linguistic research but also potentially leads to novel insights into language as a social system, thus inviting further exploration and refinement of LLMs in this context. Such a reinterpretation ensures that ongoing advancements in computational linguistics are aligned with broader linguistic objectives, creating a richer, multi-faceted understanding of language.

The implications of these findings extend beyond theoretical linguistics, inviting new methodologies and frameworks for practical applications in AI, cognitive science, and social linguistics. As LLMs continue to evolve, their role as scientific proxies for linguistic phenomena will undoubtedly expand, potentially ushering in new paradigms in the study of language itself.
