Divergent N-Gram Analysis for Detecting GPT-Generated Text
The paper "DNA-GPT: Divergent N-Gram Analysis for Training-Free Detection of GPT-Generated Text" presents a novel methodology for distinguishing between human-written and machine-generated text produced by LLMs. This innovative approach—Divergent N-Gram Analysis (DNA-GPT)—is designed to circumvent the limitations inherent in traditional training-based text detection methods, particularly as machine-generated text becomes more sophisticated and prevalent.
Overview and Methodology
The authors introduce DNA-GPT, a zero-shot, training-free detection strategy that leverages inherent differences in structure and continuity between human-written and AI-generated text. Unlike training-based approaches that require large labeled datasets and frequent retraining, DNA-GPT uses N-gram analysis to measure how closely a candidate text matches what the model itself would generate, exploiting the likelihood-maximizing decoding behavior of LLMs such as GPT-3.5 and GPT-4.
DNA-GPT operates by truncating a candidate text partway through, prompting the LLM to regenerate the remainder several times, and comparing those regenerations with the original continuation using N-gram similarity measures. The approach exploits the observation that, given a common prefix, the model's regenerations overlap heavily with the original continuation when that continuation was machine-generated, and far less when it was written by a human.
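To make the procedure concrete, here is a minimal Python sketch of the black-box variant. The `regenerate` callable is a hypothetical hook standing in for any LLM completion API, and the truncation ratio, number of regenerations, N-gram orders, and overlap score are illustrative simplifications rather than the paper's exact BScore.

```python
from collections import Counter
from typing import Callable, List

def ngrams(tokens: List[str], n: int) -> Counter:
    # All contiguous n-grams in the token list, with multiplicity.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def overlap_score(reference: str, candidate: str, n_low: int = 4, n_high: int = 8) -> float:
    # Fraction of the candidate's n-grams also found in the reference, averaged
    # over several n-gram orders (a simplified stand-in for the paper's scoring).
    ref_toks, cand_toks = reference.split(), candidate.split()
    scores = []
    for n in range(n_low, n_high + 1):
        ref_ng, cand_ng = ngrams(ref_toks, n), ngrams(cand_toks, n)
        total = sum(cand_ng.values())
        if total == 0:
            continue
        matched = sum(min(count, ref_ng[gram]) for gram, count in cand_ng.items())
        scores.append(matched / total)
    return sum(scores) / len(scores) if scores else 0.0

def dna_gpt_score(text: str, regenerate: Callable[[str], str],
                  truncate_ratio: float = 0.5, k: int = 10) -> float:
    # Truncate the text, ask the model to finish the prefix k times, and measure
    # how much the regenerations overlap with the original ending. Higher scores
    # suggest the original ending was machine-generated.
    words = text.split()
    cut = int(len(words) * truncate_ratio)
    prefix, original_tail = " ".join(words[:cut]), " ".join(words[cut:])
    regenerations = [regenerate(prefix) for _ in range(k)]
    return sum(overlap_score(original_tail, r) for r in regenerations) / k
```

In practice, a text would be flagged as machine-generated when its score exceeds a threshold calibrated on held-out human-written samples.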
Experimental Results
The authors provide extensive experimental validation of DNA-GPT against state-of-the-art LLMs, including OpenAI's flagship models and open-source models such as GPT-NeoX-20B and LLaMA-13B, evaluated on four English datasets and one German dataset. Notably, DNA-GPT consistently outperforms established detectors such as OpenAI's own supervised classifier and GPTZero, demonstrating robustness across domains with minimal false positives.
Numerically, DNA-GPT achieves high AUROC and strong true positive rates at a fixed 1% false positive rate, significantly outperforming competing detectors. Its effectiveness even when the original generation prompt is unknown further underscores the flexibility and robustness of the method.
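For readers less familiar with these metrics, the sketch below shows how AUROC and the true positive rate at a fixed 1% false positive rate can be computed from detector scores; the labels and scores are synthetic placeholders, not results from the paper.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

def tpr_at_fpr(labels: np.ndarray, scores: np.ndarray, target_fpr: float = 0.01) -> float:
    # Largest true positive rate achievable while keeping FPR at or below target_fpr.
    fpr, tpr, _ = roc_curve(labels, scores)
    feasible = tpr[fpr <= target_fpr]
    return float(feasible.max()) if feasible.size else 0.0

# Toy placeholder data: 1 = machine-generated, 0 = human-written;
# in practice the scores would come from a detector such as dna_gpt_score above.
labels = np.array([1, 1, 1, 0, 0, 0])
scores = np.array([0.9, 0.7, 0.4, 0.3, 0.2, 0.1])
print("AUROC:", roc_auc_score(labels, scores))
print("TPR @ 1% FPR:", tpr_at_fpr(labels, scores))
```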
Practical and Theoretical Implications
The implications of DNA-GPT are manifold. Practically, it offers a reliable tool for institutions concerned with plagiarism and authenticity verification in academic and professional settings. Its model-sourcing capability goes beyond flagging text as AI-generated to suggest which underlying model most likely produced it, aiding transparency and accountability.
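One plausible way to frame model sourcing, sketched below as a hypothetical extension of the `dna_gpt_score` function above, is to score the text against regenerations from several candidate models and report the best match. The candidate registry and its regeneration hooks are assumptions for illustration, not the paper's exact procedure.

```python
from typing import Callable, Dict

def source_model(text: str, candidates: Dict[str, Callable[[str], str]]) -> str:
    # candidates maps a model name to a regenerate(prefix) -> continuation callable;
    # the model whose regenerations best match the text's ending is the likely source.
    scores = {name: dna_gpt_score(text, regen) for name, regen in candidates.items()}
    return max(scores, key=scores.get)
```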
Theoretically, the paper contributes valuable insights into the text generation process and the intrinsic disparities between human-written and LLM-generated content. Such insights could inform future developments in enhancing the interpretability and explainability of AI systems.
Speculation and Future Developments
Looking ahead, DNA-GPT's combination of accuracy and explainability is promising for non-English applications and for contexts where ethical compliance is paramount. There is intriguing potential for integrating this methodology into broader AI governance frameworks, allowing stakeholders to verify and validate AI contributions with confidence.
Moreover, as LLMs continue to evolve and adapt, DNA-GPT's training-free paradigm may offer a sustainable approach to continuous detection amidst rapid AI advancements. Future work may explore further optimization in computational efficiency and broader applicability in diverse text genres and languages.
In summary, DNA-GPT represents a significant stride in detection methodology, addressing present challenges in verifying the authenticity of text in the age of AI and supporting transparency in AI-driven communication.