Suspected Undeclared Use of Artificial Intelligence in the Academic Literature: An Analysis of the Academ-AI Dataset (2411.15218v1)

Published 20 Nov 2024 in cs.DL, cs.AI, and cs.CY

Abstract: Since generative AI tools such as OpenAI's ChatGPT became widely available, researchers have used them in the writing process. The consensus of the academic publishing community is that such usage must be declared in the published article. Academ-AI documents examples of suspected undeclared AI usage in the academic literature, discernible primarily due to the appearance in research papers of idiosyncratic verbiage characteristic of LLM-based chatbots. This analysis of the first 500 examples collected reveals that the problem is widespread, penetrating the journals and conference proceedings of highly respected publishers. Undeclared AI seems to appear in journals with higher citation metrics and higher article processing charges (APCs), precisely those outlets that should theoretically have the resources and expertise to avoid such oversights. An extremely small minority of cases are corrected post publication, and the corrections are often insufficient to rectify the problem. The 500 examples analyzed here likely represent a small fraction of the undeclared AI present in the academic literature, much of which may be undetectable. Publishers must enforce their policies against undeclared AI usage in cases that are detectable; this is the best defense currently available to the academic publishing community against the proliferation of undisclosed AI.

Summary

  • The paper reveals that undeclared AI use in scholarly texts is detected through specific linguistic markers and patterns.
  • The study finds that journals with higher citation metrics and APCs are disproportionately impacted by suspected AI-generated content.
  • The paper emphasizes the need for stronger editorial reviews and advanced AI detection methods to safeguard research integrity.

Analysis of Suspected Undeclared AI in Academic Publications

The paper, titled "Suspected Undeclared Use of Artificial Intelligence in the Academic Literature" by Alex Glynn, critically examines the prevalence of AI-generated content in academic publications that has not been explicitly declared, contrary to academic ethics guidelines. The analysis is built on the Academ-AI dataset, a collection of 500 instances of suspected undeclared AI-generated text.

The central issue addressed is the incorporation of generative AI models, notably LLMs like OpenAI's ChatGPT, into the academic publishing process without proper disclosure. The academic community has established guidelines that prohibit listing AI as an author and mandate the disclosure of AI usage in the authoring process. Despite such recommendations, the paper provides evidence suggesting that numerous publications, including those from reputable outlets, have contained undeclared AI-generated content, potentially undermining research integrity.

Key Findings

Glynn's investigation identifies publications through textual patterns characteristic of AI-generated content. These markers include first-person singular pronouns, disclaimers about knowledge cutoffs, and telltale phrases such as "Certainly, here…" or "Regenerate response." Such text stands out because of LLMs' conversational tone, their lack of access to real-time data, and their habit of referring users to external sources.
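
Detection of this kind amounts to screening text for a curated list of marker phrases. The paper does not publish its matching procedure, so the following is only a minimal illustrative sketch; the phrase list and sentence-splitting logic here are assumptions, not the Academ-AI methodology:

```python
import re

# Illustrative marker phrases only; the actual Academ-AI criteria are
# curated manually and are broader than this list (an assumption here).
MARKER_PHRASES = [
    r"as an ai language model",
    r"certainly,? here",
    r"regenerate response",
    r"my knowledge cutoff",
    r"i cannot access real-time",
    r"as of my last (knowledge )?update",
]

MARKER_RE = re.compile("|".join(MARKER_PHRASES), re.IGNORECASE)

def flag_suspect_passages(text: str) -> list[str]:
    """Return sentences containing chatbot-style marker phrases."""
    # Naive sentence split; adequate for a first screening pass.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    return [s for s in sentences if MARKER_RE.search(s)]

if __name__ == "__main__":
    sample = ("Certainly, here is an overview of the results. "
              "The experiment used 40 participants.")
    print(flag_suspect_passages(sample))
```

A screening pass like this surfaces only the most conspicuous cases; AI-generated text that has been edited would evade it, which is consistent with the paper's caution that the 500 examples likely represent a small fraction of the undeclared AI in the literature.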

From the 500-document dataset, Glynn finds that publications with higher citation metrics and higher article processing charges (APCs) are disproportionately affected: among journals listed in the Scimago Journal Rank (SJR) database and indexed in the Directory of Open Access Journals (DOAJ), those containing suspected AI-generated content exhibit higher citation indicators and higher APCs than those without.
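
The underlying comparison is a two-sample test on journal-level metrics. The paper's exact statistical procedure is not specified in this summary, so the following is a hedged sketch of one reasonable approach (a one-sided Mann-Whitney U test, chosen because APC distributions are typically skewed), using hypothetical numbers rather than the paper's data:

```python
from scipy.stats import mannwhitneyu

# Hypothetical APCs in USD; NOT data from the paper.
apc_flagged = [2900, 3200, 2500, 3800, 3100]    # journals with suspected AI content
apc_unflagged = [1200, 800, 2000, 1500, 950]    # comparison journals

# One-sided test: are APCs higher among flagged journals?
stat, p_value = mannwhitneyu(apc_flagged, apc_unflagged, alternative="greater")
print(f"U = {stat}, p = {p_value:.4f}")
```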

Implications and Future Directions

The research underscores the necessity of rigorous editorial review and adherence to AI-usage policies to maintain scientific credibility. The paper suggests that current detection and review processes need reinforcement, especially given the difficulty of distinguishing human-written from AI-generated text, and that detection methods will need to keep pace with continually improving text-generation models.

Practically, the findings suggest that requiring explicit AI-usage declarations could mitigate risks associated with AI-generated inaccuracies or hallucinations. The tendency of LLMs to confabulate information underscores the risks of undisclosed AI use in scientific writing and reaffirms the shared responsibility of authors and editors to sustain research integrity.

Moving forward, the practical and theoretical exploration of AI's role in scholarly work could be deepened by investigating the extent of "dark AI": AI-generated content that has been edited to elude detection and slips through peer review. Better training for authors and revised peer-review protocols may also prove instrumental.

In conclusion, Glynn's analysis spotlights undisclosed reliance on AI in academic writing and its implications for academic integrity. As AI capabilities grow, so does the shared responsibility of the academic ecosystem to strengthen the policies and detection methods governing AI's contributions to the research literature.