
Machine-Generated Text Detection using Deep Learning (2311.15425v1)

Published 26 Nov 2023 in cs.CL

Abstract: Our research addresses the challenge of distinguishing text produced by LLMs from human-written text, a capability that matters for many applications. Amid ongoing debate about whether such a detector is attainable, we present evidence supporting its feasibility. We evaluated our models on multiple datasets, including Twitter Sentiment, Football Commentary, Project Gutenberg, PubMedQA, and SQuAD, confirming the efficacy of the enhanced detection approaches. These datasets were sampled under carefully designed constraints covering a wide range of cases, laying a foundation for future research. We evaluate GPT-3.5-Turbo-generated text against detectors including an SVM, RoBERTa-base, and RoBERTa-large. The findings show that detection performance depends predominantly on the sequence length of the sentence.

Introduction

The proliferation of LLMs such as GPT-3.5 Turbo has revolutionized various industries by automating content creation. However, this also raises concerns about distinguishing between human and machine-generated text. The integrity of digital communication relies on our ability to make such distinctions.

Related Work

Previous efforts in machine-generated text detection have centered on identifying general characteristics of AI-generated content. Some researchers have proposed watermarking techniques to embed detectable signals in LLM output. Others focused on stylometric detection to differentiate AI-produced tweets by analyzing linguistic features. Additionally, tools like GPTZero utilize metrics such as perplexity and burstiness to identify machine-generated text.
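
To make the perplexity signal concrete, the sketch below scores a sentence with an off-the-shelf language model. GPT-2 and the cutoff value are illustrative assumptions, not GPTZero's actual model or threshold.

# Minimal sketch of a perplexity-based detection signal.
# GPT-2 and the threshold below are illustrative stand-ins; GPTZero's
# internals are not public.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def perplexity(text: str) -> float:
    # Average next-token cross-entropy under the model, exponentiated.
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        loss = model(**enc, labels=enc["input_ids"]).loss
    return torch.exp(loss).item()

ppl = perplexity("The quick brown fox jumps over the lazy dog.")
# Machine-generated text tends to look more predictable (lower perplexity).
print(f"perplexity = {ppl:.1f}", "likely machine" if ppl < 40 else "likely human")

The intuition is that machine text sits close to a language model's own preferences, so low perplexity and low burstiness (roughly, little variation in perplexity across sentences) are treated as machine signals.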

Proposed Approach

The paper proposes a comprehensive approach to discriminating between human- and LLM-generated sentences. The researchers build a dataset drawing on a broad spectrum of domains, then train classifiers on it: an SVM baseline alongside fine-tuned RoBERTa-Base and RoBERTa-Large models. The aim is to identify subtle language patterns peculiar to AI-generated content. A rough sketch of how such a paired dataset could be assembled is shown below.
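
In this sketch, the prompt, sampling settings, and example sentences are assumptions for illustration, not the paper's actual generation setup.

# Illustrative sketch: pair human sentences with GPT-3.5-Turbo counterparts
# and label them for training. Prompt and settings are assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def machine_version(sentence: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user",
                   "content": f"Write one sentence conveying the same idea as: {sentence}"}],
        temperature=0.7,
    )
    return resp.choices[0].message.content.strip()

human_sentences = [
    "The committee postponed the vote until next week.",
    "Rain is expected across the region by early evening.",
]

# Label 0 = human-written, 1 = machine-generated.
dataset = [(s, 0) for s in human_sentences]
dataset += [(machine_version(s), 1) for s in human_sentences]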

Model Description

The SVM uses a radial basis function kernel over TF-IDF feature representations to classify text. It provides a solid baseline but is surpassed by the RoBERTa-based architectures on more complex inputs. RoBERTa-Base and RoBERTa-Large, augmented with additional classification layers, leverage their deeper understanding of language context to outperform the SVM, especially as sentence length increases.
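
A minimal sketch of that SVM baseline, using scikit-learn, appears below; the toy data and hyperparameters are illustrative assumptions, not the paper's tuned settings.

# Sketch of the SVM baseline: TF-IDF features fed to an RBF-kernel SVM.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

texts = [
    "The committee postponed the vote until next week.",             # human
    "The committee has decided to postpone the vote to next week.",  # machine
]
labels = [0, 1]  # 0 = human, 1 = machine

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), sublinear_tf=True),
    SVC(kernel="rbf", C=1.0),
)
clf.fit(texts, labels)
# decision_function yields a real-valued score, which AUROC can use directly.
print(clf.decision_function(["Another sentence to classify."]))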

Experimental Setup

The paper partitions the dataset into sentence-length ranges to analyze model performance in finer detail. It records the Area Under the Receiver Operating Characteristic curve (AUROC) for each model on each range, providing insight into how each model copes with varying textual complexity.
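
A sketch of that length-binned evaluation is given below; the word-count bin edges are assumptions, since the paper's exact ranges are not reproduced here.

# Group test sentences by length and report AUROC per bin.
import numpy as np
from sklearn.metrics import roc_auc_score

def auroc_by_length(sentences, labels, scores, bins=(0, 10, 20, 40, 80, float("inf"))):
    # labels: 1 = machine-generated; scores: detector score for the machine class.
    lengths = np.array([len(s.split()) for s in sentences])
    labels, scores = np.asarray(labels), np.asarray(scores)
    results = {}
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (lengths >= lo) & (lengths < hi)
        # AUROC is only defined when both classes appear in the bin.
        if mask.any() and len(set(labels[mask])) == 2:
            results[f"{lo}-{hi} words"] = roc_auc_score(labels[mask], scores[mask])
    return results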

Results and Discussion

The experiments demonstrate that the RoBERTa models are particularly effective, with RoBERTa-Large dominating on longer, more complex sentences. The SVM performs respectably but lacks the representational power of the RoBERTa models. The findings underscore the effectiveness of the proposed models in determining text origin.

Conclusion and Future Work

The paper validates the possibility of distinguishing between human and ChatGPT-generated text. For future advancements, incorporating a broader dataset and more LLMs may enrich detection capabilities. Exploring other methodologies, such as zero-shot or one-shot learning systems, could also yield more resource-efficient classifiers.

In summary, this investigation into machine-generated text detection advances our understanding of how deep learning can be leveraged to uphold the authenticity of digital communication in the face of increasingly sophisticated LLMs.

References (12)
  1. Language models are few-shot learners. Advances in Neural Information Processing Systems, 33:1877–1901.
  2. On the possibilities of AI-generated text detection. arXiv preprint arXiv:2304.04736.
  3. Machine-generated text: A comprehensive survey of threat models and detection methods. IEEE Access.
  4. Bijoyan Das and Sarit Chakraborty. 2018. An improved text sentiment classification model using TF-IDF and next word negation. arXiv preprint arXiv:1806.06407.
  5. GPTZero. 2023. GPTZero website.
  6. How close is ChatGPT to human experts? Comparison corpus, evaluation, and detection. arXiv preprint arXiv:2301.07597.
  7. A watermark for large language models. arXiv preprint arXiv:2301.10226.
  8. Stylometric detection of AI-generated text in Twitter timelines. arXiv preprint arXiv:2303.03697.
  9. DetectGPT: Zero-shot machine-generated text detection using probability curvature. arXiv preprint arXiv:2301.11305.
  10. Language models are unsupervised multitask learners. OpenAI Blog, 1(8):9.
  11. Exploring the limits of transfer learning with a unified text-to-text transformer. The Journal of Machine Learning Research, 21(1):5485–5551.
  12. Release strategies and the social impacts of language models. arXiv preprint arXiv:1908.09203.
Authors (3)
  1. Raghav Gaggar
  2. Ashish Bhagchandani
  3. Harsh Oza