
When AI Eats Itself: On the Caveats of AI Autophagy (2405.09597v3)

Published 15 May 2024 in cs.LG and cs.AI

Abstract: Generative AI technologies and large models are producing realistic outputs across various domains, such as images, text, speech, and music. Creating these advanced generative models requires significant resources, particularly large and high-quality datasets. To minimise training expenses, many algorithm developers use data created by the models themselves as a cost-effective training solution. However, not all synthetic data effectively improve model performance, necessitating a strategic balance in the use of real versus synthetic data to optimise outcomes. Currently, the previously well-controlled integration of real and synthetic data is becoming uncontrollable. The widespread and unregulated dissemination of synthetic data online leads to the contamination of datasets traditionally compiled through web scraping, which are now mixed with unlabelled synthetic data. This trend, known as the AI autophagy phenomenon, suggests a future where generative AI systems may increasingly consume their own outputs without discernment, raising concerns about model performance, reliability, and ethical implications. What will happen if generative AI continuously consumes itself without discernment? What measures can we take to mitigate the potential adverse effects? To address these research questions, this study examines the existing literature, delving into the consequences of AI autophagy, analysing the associated risks, and exploring strategies to mitigate its impact. Our aim is to provide a comprehensive perspective on this phenomenon, advocating for a balanced approach that promotes the sustainable development of generative AI technologies in the era of large models.
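
The toy sketch below is not from the paper; it is a minimal illustration of the self-consumption loop the abstract describes, using a single Gaussian as a stand-in for a generative model. Each "generation" is fitted only to samples drawn from the previous generation's model, with no fresh real data, and the fitted spread collapses over many generations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Generation 0 trains on "real" data drawn from the true distribution N(0, 1).
data = rng.normal(loc=0.0, scale=1.0, size=50)

for generation in range(201):
    # Fit a toy generative model (a single Gaussian) to the current training set.
    mu, sigma = data.mean(), data.std()
    if generation % 20 == 0:
        print(f"generation {generation:3d}: mu={mu:+.3f}, sigma={sigma:.3f}")

    # Autophagy step: the next generation trains only on synthetic samples
    # drawn from the model just fitted, with no real data mixed back in.
    data = rng.normal(loc=mu, scale=sigma, size=50)
```

Running this, sigma drifts downward and ends far below 1.0, a toy analogue of the loss of diversity and fidelity discussed in the paper; mixing real data back in at each step slows or prevents the collapse.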

Authors (11)
  1. Xiaodan Xing (35 papers)
  2. Fadong Shi (2 papers)
  3. Jiahao Huang (93 papers)
  4. Yinzhe Wu (30 papers)
  5. Yang Nan (40 papers)
  6. Sheng Zhang (212 papers)
  7. Yingying Fang (20 papers)
  8. Mike Roberts (9 papers)
  9. Carola-Bibiane Schönlieb (276 papers)
  10. Javier Del Ser (100 papers)
  11. Guang Yang (422 papers)