An Evaluation of Ghostbuster: Detecting AI-Generated Text
The paper "Ghostbuster: Detecting Text Ghostwritten by LLMs" introduces an approach to identifying AI-generated text across a range of domains. This summary critically evaluates the paper's methodology, results, and implications for the field of AI-generated text detection.
The primary contribution of the paper is Ghostbuster, a detection system designed to identify whether text was generated by LLMs such as ChatGPT. The core innovation is a feature extraction and classification process that does not require access to the token probabilities of the target model. Instead, Ghostbuster obtains probabilities from a series of weaker language models: a unigram model, a trigram model, and early GPT-3 variants (ada and davinci, without instruction tuning). It then runs a structured search over combinations of vector and scalar functions applied to these per-token probabilities, and the selected feature combinations are used to train a linear classifier.
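The pipeline described above can be sketched as follows. The operation sets, function names, and toy inputs below are illustrative assumptions, not the paper's actual implementation; in Ghostbuster the inputs would be per-token probabilities produced by the weak models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Vector ops combine two per-token probability streams element-wise.
VECTOR_OPS = {
    "sub": lambda a, b: a - b,
    "div": lambda a, b: a / np.maximum(b, 1e-9),  # guard against divide-by-zero
}

# Scalar ops collapse a combined stream into a single feature value.
SCALAR_OPS = {
    "mean": np.mean,
    "max": np.max,
    "var": np.var,
}

def extract_features(p_model1, p_model2):
    """Enumerate candidate features by pairing every vector op with every
    scalar op; Ghostbuster searches over such compositions and feeds the
    selected features to a linear classifier."""
    return {
        f"{s_name}({v_name})": float(s_op(v_op(p_model1, p_model2)))
        for v_name, v_op in VECTOR_OPS.items()
        for s_name, s_op in SCALAR_OPS.items()
    }

# Toy document: 50 token probabilities from two hypothetical weak models.
p_unigram = rng.uniform(0.01, 1.0, size=50)
p_trigram = rng.uniform(0.01, 1.0, size=50)
features = extract_features(p_unigram, p_trigram)
print(len(features))  # 6 features from this tiny operation set
```

With a corpus of labeled documents, each document's feature dictionary would become one row of a design matrix for a standard linear classifier such as logistic regression.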
The standout numerical result reported in the paper is Ghostbuster's F1 score of 99.0 in detecting AI-generated text across the tested datasets, significantly surpassing alternatives such as DetectGPT and GPTZero. Ghostbuster also generalizes better across domains (e.g., creative writing, news, student essays), improving on the best preexisting models by 7.5 F1. Furthermore, its performance remains robust under varied prompting strategies and across different generating models, achieving an F1 score of 92.2 on Claude-generated text.
The insights gained from this system have broader theoretical implications. Ghostbuster's design demonstrates the capacity of structured feature exploration and weaker model probabilities to improve the generalization of AI-detection systems. This highlights an important dimension in crafting detection models that can operate effectively without proprietary knowledge of the target LLM's architecture or inner mechanics.
Practically, Ghostbuster could be a valuable tool for educators and journalists who need to verify the authenticity of text, given the increasing prevalence of AI models that generate human-like prose. Its potential applications range from detecting AI-generated content in academic settings to verifying the originality of news articles. Yet the paper also points out limitations and ethical considerations in deploying such systems, particularly the risk of misclassifying text written by non-native English speakers, which warrants cautious application and further development of these technologies.
Speculatively, future developments in AI-generated text detection could focus on making models like Ghostbuster more robust to adversarial changes and shorter texts. Additionally, the pursuit of methods that enhance interpretability and reduce false positives without compromising detection accuracy could be crucial areas for subsequent research.
In conclusion, this paper contributes significantly to the landscape of AI detection methods. Ghostbuster's approach to feature selection and classification sets a benchmark for future systems aiming to detect text generated by black-box models like ChatGPT. While challenges remain, especially in handling edited or paraphrased text and writing by non-native English speakers, the paper provides a strong methodological foundation for improving the reliability and applicability of AI-detection systems.