AI as Humanity's Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text (2410.04265v1)

Published 5 Oct 2024 in cs.CL

Abstract: Creativity has long been considered one of the most difficult aspects of human intelligence for AI to mimic. However, the rise of LLMs, like ChatGPT, has raised questions about whether AI can match or even surpass human creativity. We present CREATIVITY INDEX as the first step to quantify the linguistic creativity of a text by reconstructing it from existing text snippets on the web. CREATIVITY INDEX is motivated by the hypothesis that the seemingly remarkable creativity of LLMs may be attributable in large part to the creativity of human-written texts on the web. To compute CREATIVITY INDEX efficiently, we introduce DJ SEARCH, a novel dynamic programming algorithm that can search verbatim and near-verbatim matches of text snippets from a given document against the web. Experiments reveal that the CREATIVITY INDEX of professional human authors is on average 66.2% higher than that of LLMs, and that alignment reduces the CREATIVITY INDEX of LLMs by an average of 30.1%. In addition, we find that distinguished authors like Hemingway exhibit measurably higher CREATIVITY INDEX compared to other human writers. Finally, we demonstrate that CREATIVITY INDEX can be used as a surprisingly effective criterion for zero-shot machine text detection, surpassing the strongest existing zero-shot system, DetectGPT, by a significant margin of 30.2%, and even outperforming the strongest supervised system, GhostBuster, in five out of six domains.

Analyzing Linguistic Creativity in LLMs: A Review of the Proposed Creativity Index

The paper "AI as Humanity's Salieri" introduces a novel metric, the Creativity Index, for evaluating the linguistic creativity of LLMs relative to human authors. It examines how well LLMs replicate human creativity and posits that much of what is perceived as machine creativity can be attributed to the vast corpus of human-authored text available on the web. Its core motivation is to quantify creativity by measuring how much of a given text, whether machine-generated or human-written, can be reconstructed from existing web text.

Overview and Key Contributions

The Creativity Index aims to objectively assess creativity by determining how easily a given text can be reconstructed from snippets of existing text on the web: the more of a text that can be pieced together this way, the lower its score. The reconstruction uses both verbatim and semantically similar (near-verbatim) snippets, and is operationalized through a new algorithm called DJ Search. This dynamic programming algorithm efficiently identifies verbatim and near-verbatim matches in very large corpora, which makes computing the Creativity Index tractable. The core finding is that texts from professional human authors have a Creativity Index that is, on average, 66.2% higher than that of LLM outputs.
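To make the reconstruction idea concrete, here is a minimal sketch in Python. It is not the paper's implementation: it handles verbatim matches only, uses a tiny in-memory list of documents in place of the web, and the function names (`l_uniqueness`, `creativity_index_sketch`) as well as the choice to sum the per-length scores over a range of minimum snippet lengths are illustrative assumptions. The sketch marks every word that falls inside some snippet of at least `L` words that also appears in the reference corpus, then reports the fraction of words left uncovered.

```python
from typing import List, Sequence, Set, Tuple


def ngrams(tokens: Sequence[str], n: int) -> Set[Tuple[str, ...]]:
    """Return the set of all contiguous n-grams in a token sequence."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def l_uniqueness(text: Sequence[str], corpus: List[Sequence[str]], L: int) -> float:
    """Fraction of words in `text` NOT covered by any verbatim snippet of
    length >= L that also occurs in `corpus` (hypothetical toy definition).

    Any matching snippet longer than L is made of matching L-grams, so
    checking L-grams alone is enough for verbatim coverage.
    """
    if not text:
        raise ValueError("text must contain at least one token")
    corpus_ngrams: Set[Tuple[str, ...]] = set()
    for doc in corpus:
        corpus_ngrams |= ngrams(doc, L)

    covered = [False] * len(text)
    for i in range(len(text) - L + 1):
        if tuple(text[i:i + L]) in corpus_ngrams:
            for k in range(i, i + L):
                covered[k] = True
    return 1.0 - sum(covered) / len(text)


def creativity_index_sketch(text: Sequence[str], corpus: List[Sequence[str]],
                            min_L: int, max_L: int) -> float:
    """Assumed aggregate: sum of l_uniqueness over L in [min_L, max_L]."""
    return sum(l_uniqueness(text, corpus, L) for L in range(min_L, max_L + 1))


if __name__ == "__main__":
    text = "the quick brown fox jumps over the lazy dog".split()
    corpus = ["the quick brown fox sat".split(),
              "over the lazy dog slept".split()]
    for L in (3, 4, 5):
        print(L, round(l_uniqueness(text, corpus, L), 3))
```

In this toy input only the word "jumps" is uncovered for short snippet lengths, so the uniqueness is low; once the required snippet length exceeds what the corpus shares with the text, nothing is covered and the uniqueness rises to 1.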

A key methodological advancement of this research is the DJ Search algorithm. Unlike brute-force methods, DJ Search employs a two-pointer technique to minimize the computational demand and enable scalable creativity assessment for extensive text datasets. Furthermore, the evaluation of linguistic creativity in both machine-generated and human-written texts is enriched by considering semantic similarities, measured through techniques like Word Mover's Distance.
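The two-pointer idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' DJ Search: the matching predicate `is_match` is a hypothetical stand-in (here a verbatim substring check against a toy corpus), and the sketch assumes that if a span matches, every shorter span obtained by dropping its first word also matches. That holds for verbatim matching; a near-verbatim predicate based on an embedding distance such as Word Mover's Distance could be plugged in, but may not satisfy the assumption exactly.

```python
from typing import Callable, List, Sequence, Tuple


def maximal_matched_spans(
    tokens: Sequence[str],
    is_match: Callable[[Sequence[str]], bool],
    min_len: int,
) -> List[Tuple[int, int]]:
    """Find matched spans tokens[start:end] of at least `min_len` words.

    Two pointers sweep left to right: on a match the end pointer advances
    (try a longer span); on a miss the start pointer advances. Neither
    pointer moves backward, so the predicate is called O(len(tokens)) times
    instead of once per possible (start, end) pair.
    """
    spans: List[Tuple[int, int]] = []
    n = len(tokens)
    start, end = 0, min_len
    while end <= n:
        if is_match(tokens[start:end]):
            if spans and spans[-1][0] == start:
                spans[-1] = (start, end)   # extend the span for this start
            else:
                spans.append((start, end))
            end += 1
        else:
            start += 1
            if end - start < min_len:      # keep the window at least min_len wide
                end = start + min_len
    return spans


def make_verbatim_matcher(corpus: List[str]) -> Callable[[Sequence[str]], bool]:
    """Toy predicate: does the span occur word-for-word in any corpus document?"""
    padded = [" " + doc + " " for doc in corpus]

    def is_match(span: Sequence[str]) -> bool:
        needle = " " + " ".join(span) + " "
        return any(needle in doc for doc in padded)

    return is_match


if __name__ == "__main__":
    tokens = "the quick brown fox jumps over the lazy dog".split()
    matcher = make_verbatim_matcher(["the quick brown fox sat",
                                     "over the lazy dog slept"])
    print(maximal_matched_spans(tokens, matcher, min_len=3))
```

For the toy input, the sweep reports the two longest matched spans, "the quick brown fox" and "over the lazy dog", as half-open index pairs.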

Numerical Results and Insights

The experiments reveal that:

The Creativity Index of professional human authors is on average 66.2% higher than that of LLMs, supporting the hypothesis that current LLMs rely heavily on existing human text rather than generating entirely novel constructions.
Reinforcement Learning from Human Feedback (RLHF), commonly used to align LLM outputs with human preferences, reduces the Creativity Index of LLM-generated texts by an average of 30.1%. This suggests that alignment pushes models toward predictable, conventional styles at the cost of originality.
The Creativity Index also serves as an effective zero-shot criterion for machine text detection. The paper reports that it surpasses the strongest existing zero-shot system, DetectGPT, by a margin of 30.2%, and outperforms the strongest supervised system, GhostBuster, in five out of six domains.
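The detection finding rests on a simple mechanism: if machine text tends to score lower than human text, the score itself can act as a detector with no training. The sketch below illustrates that mechanism only. The scores are made-up toy values, not results from the paper, and the helper names `auroc` and `flag_as_machine` are hypothetical. It computes a threshold-free ranking quality (AUROC) by counting how often a human text outscores a machine text, and shows a thresholded decision rule.

```python
from typing import Sequence


def auroc(human_scores: Sequence[float], machine_scores: Sequence[float]) -> float:
    """Probability that a randomly chosen human text scores higher than a
    randomly chosen machine text (ties count as half)."""
    if not human_scores or not machine_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for h in human_scores:
        for m in machine_scores:
            if h > m:
                wins += 1.0
            elif h == m:
                wins += 0.5
    return wins / (len(human_scores) * len(machine_scores))


def flag_as_machine(score: float, threshold: float) -> bool:
    """Zero-shot rule: a text scoring below the threshold is flagged as machine-written."""
    return score < threshold


if __name__ == "__main__":
    # Toy, hypothetical creativity scores for illustration only.
    human = [0.8, 0.6, 0.7]
    machine = [0.5, 0.65, 0.3]
    print("AUROC:", auroc(human, machine))
    print("flagged:", [flag_as_machine(s, threshold=0.55) for s in machine])
```

An AUROC of 1.0 would mean every human text outscores every machine text, while 0.5 would mean the score carries no signal for detection.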

Theoretical and Practical Implications

Theoretically, this work sharpens our understanding of AI's limitations in creative tasks, suggesting that, by this measure, originality remains a distinctly human strength. Practically, the findings have implications for fields that rely on textual creativity, such as literature, journalism, and content creation. By providing a scalable metric like the Creativity Index, the authors offer a tool that could guide the development and assessment of text-based generative models.

Furthermore, this research opens new avenues for future AI development by challenging developers to enhance model training techniques beyond mere mimicry of existing datasets. It also urges more profound inquiries into nuanced measures of novelty and invention, potentially steering AI towards innovative generation strategies.

Speculation on Future Developments in AI

As AI systems increasingly integrate into content creation processes, understanding their creative limitations and potential becomes crucial. Future iterations of LLMs might focus on integrating more diverse datasets and adopting training methodologies that emphasize genuine creativity rather than skilled imitation. Additionally, the development of robust evaluation metrics, such as the Creativity Index, will likely play a critical role in ensuring AI-generated content can meet high standards of originality and cultural value.

In conclusion, the paper makes substantial contributions toward understanding machine creativity. By operationalizing the concept of creativity through the Creativity Index, it provides a clear framework for evaluating and enhancing the role of LLMs in creative domains, setting the stage for future innovations in AI-driven creativity.

Authors (11)

Ximing Lu (52 papers)
Melanie Sclar (12 papers)
Skyler Hallinan (11 papers)
Niloofar Mireshghallah (24 papers)
Jiacheng Liu (67 papers)
Seungju Han (33 papers)
Allyson Ettinger (29 papers)
Liwei Jiang (53 papers)
Khyathi Chandu (17 papers)
Nouha Dziri (39 papers)
Yejin Choi (287 papers)

Tweets

https://twitter.com/GXiming/status/1859773291742798208

https://twitter.com/nouhadziri/status/1860032319735890172

https://twitter.com/GXiming/status/1871263176622555368

https://twitter.com/jd_pressman/status/1861134206392508441

https://twitter.com/GXiming/status/1929973028101992458

https://twitter.com/niloofar_mire/status/1850915358187876373