Papers
Topics
Authors
Recent
Search
2000 character limit reached

Theoretical and Methodological Framework for Studying Texts Produced by Large Language Models

Published 29 Aug 2024 in cs.CL | (2408.16740v1)

Abstract: This paper addresses the conceptual, methodological and technical challenges in studying LLMs and the texts they produce from a quantitative linguistics perspective. It builds on a theoretical framework that distinguishes between the LLM as a substrate and the entities the model simulates. The paper advocates for a strictly non-anthropomorphic approach to models while cautiously applying methodologies used in studying human linguistic behavior to the simulated entities. While natural language processing researchers focus on the models themselves, their architecture, evaluation, and methods for improving performance, we as quantitative linguists should strive to build a robust theory concerning the characteristics of texts produced by LLMs, how they differ from human-produced texts, and the properties of simulated entities. Additionally, we should explore the potential of LLMs as an instrument for studying human culture, of which language is an integral part.

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Authors (1)

Collections

Sign up for free to add this paper to one or more collections.