
SeqXGPT: Sentence-Level AI-Generated Text Detection (2310.08903v2)

Published 13 Oct 2023 in cs.CL

Abstract: Widely applied LLMs can generate human-like content, raising concerns about the abuse of LLMs. Therefore, it is important to build strong AI-generated text (AIGT) detectors. Current works only consider document-level AIGT detection; therefore, in this paper, we first introduce a sentence-level detection challenge by synthesizing a dataset that contains documents polished with LLMs, that is, documents containing sentences written by humans and sentences modified by LLMs. Then we propose Sequence X (Check) GPT (SeqXGPT), a novel method that utilizes log probability lists from white-box LLMs as features for sentence-level AIGT detection. These features are composed like waves in speech processing and cannot be studied by LLMs. Therefore, we build SeqXGPT based on convolution and self-attention networks. We test it in both sentence- and document-level detection challenges. Experimental results show that previous methods struggle with sentence-level AIGT detection, while our method not only significantly surpasses baseline methods in both sentence- and document-level detection challenges but also exhibits strong generalization capabilities.

References (39)
  1. wav2vec 2.0: A framework for self-supervised learning of speech representations. Advances in neural information processing systems, 33:12449–12460.
  2. Training a helpful and harmless assistant with reinforcement learning from human feedback. arXiv preprint arXiv:2204.05862.
  3. Real or fake? Learning to discriminate machine from human generated text. arXiv preprint arXiv:1906.03351.
  4. GPT-NeoX-20B: An open-source autoregressive language model. arXiv preprint arXiv:2204.06745.
  5. Language models are few-shot learners. Advances in neural information processing systems, 33:1877–1901.
  6. PaLM: Scaling language modeling with pathways. arXiv preprint arXiv:2204.02311.
  7. A discourse-aware attention model for abstractive summarization of long documents. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers), pages 615–621, New Orleans, Louisiana. Association for Computational Linguistics.
  8. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
  9. GLTR: Statistical detection and visualization of generated text. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pages 111–116, Florence, Italy. Association for Computational Linguistics.
  10. Bidirectional LSTM-CRF models for sequence tagging. arXiv preprint arXiv:1508.01991.
  11. Automatic detection of generated text is easiest when humans are fooled. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 1808–1822, Online. Association for Computational Linguistics.
  12. Is BERT really robust? Natural language attack on text classification and entailment. arXiv preprint arXiv:1907.11932.
  13. TriviaQA: A large scale distantly supervised challenge dataset for reading comprehension. arXiv preprint arXiv:1705.03551.
  14. A watermark for large language models. arXiv preprint arXiv:2301.10226.
  15. Weight poisoning attacks on pretrained models. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 2793–2806, Online. Association for Computational Linguistics.
  16. BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv preprint arXiv:1910.13461.
  17. BERT-Attack: Adversarial attack against BERT using BERT. arXiv preprint arXiv:2004.09984.
  18. Backdoor attacks on pre-trained models by layerwise weight poisoning. In Conference on Empirical Methods in Natural Language Processing.
  19. Origin tracing and detecting of LLMs. arXiv preprint arXiv:2304.14072.
  20. RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692.
  21. Learning word vectors for sentiment analysis. In Proceedings of the 49th annual meeting of the association for computational linguistics: Human language technologies, pages 142–150.
  22. DetectGPT: Zero-shot machine-generated text detection using probability curvature. arXiv preprint arXiv:2301.11305.
  23. Don’t give me the details, just the summary! Topic-aware convolutional neural networks for extreme summarization. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium.
  24. OpenAI. 2022. ChatGPT: Optimizing language models for dialogue. http://web.archive.org/web/20230109000707/https://openai.com/blog/chatgpt/. Accessed: 2023-01-10.
  25. OpenAI. 2023. GPT-4 technical report. arXiv preprint arXiv:2303.08774.
  26. Language models are unsupervised multitask learners. OpenAI blog, 1(8):9.
  27. Exploring the limits of transfer learning with a unified text-to-text transformer. The Journal of Machine Learning Research, 21(1):5485–5551.
  28. SQuAD: 100,000+ questions for machine comprehension of text. arXiv preprint arXiv:1606.05250.
  29. BLOOM: A 176B-parameter open-access multilingual language model. arXiv preprint arXiv:2211.05100.
  30. Release strategies and the social impacts of language models. arXiv preprint arXiv:1908.09203.
  31. Release strategies and the social impacts of language models. ArXiv, abs/1908.09203.
  32. LLaMA: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971.
  33. Authorship attribution for neural text generation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 8384–8395, Online. Association for Computational Linguistics.
  34. Attention is all you need. In Advances in neural information processing systems, pages 5998–6008.
  35. Ben Wang and Aran Komatsuzaki. 2021. GPT-J-6B: A 6 Billion Parameter Autoregressive Language Model. https://github.com/kingoflolz/mesh-transformer-jax.
  36. Pengyu Wang and Zhichen Ren. 2022. The uncertainty-based retrieval framework for ancient Chinese CWS and POS. In Proceedings of the Second Workshop on Language Technologies for Historical and Ancient Languages, pages 164–168.
  37. SpeechGPT: Empowering large language models with intrinsic cross-modal conversational abilities.
  38. DUB: Discrete unit back-translation for speech translation. In Findings of ACL.
  39. OPT: Open pre-trained transformer language models. arXiv preprint arXiv:2205.01068.

Summary

  • The paper introduces a novel detection paradigm that leverages word-wise log probabilities processed by CNNs and self-attention for precise sentence-level AI text detection.
  • It proposes the SeqXGPT-Bench dataset featuring mixed human and AI-generated sentences to enable fine-grained evaluation across multiple detection tasks.
  • Experiments demonstrate that SeqXGPT outperforms baseline methods such as DetectGPT, generalizes well to out-of-distribution data, and reliably identifies which sentences in a document are AI-generated.

SeqXGPT: Sentence-Level AI-Generated Text Detection

The paper "SeqXGPT: Sentence-Level AI-Generated Text Detection" presents a novel approach to addressing the challenge of detecting AI-generated text at the sentence level. With the wide application of LLMs capable of generating human-like text, the paper highlights the growing concern about potential misuse and emphasizes the need for robust detection mechanisms.

Overview

The authors introduce a sentence-level detection paradigm in response to the limitations of existing document-level methods. They propose a new dataset, SeqXGPT-Bench, containing documents that mix human-written and AI-generated sentences to enable this fine-grained evaluation. The paper investigates several detection tasks, including particular-model binary, mixed-model binary, and mixed-model multiclass AIGT detection, and stresses that sentence-level detection is difficult because individual sentences provide far less text than full documents.
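
To make the task setup concrete, the sketch below shows one way a mixed-authorship instance and its collapse into the mixed-model binary task could be represented. The field names, label strings, and example sentences are illustrative assumptions, not the released SeqXGPT-Bench schema.

```python
# Illustrative only: the field names, label strings, and sentences below are
# assumptions for demonstration, not the released SeqXGPT-Bench schema.
example_document = {
    "doc_id": "demo-0001",
    "sentences": [
        {"text": "The committee met on Tuesday to review the draft.", "label": "human"},
        {"text": "The revised proposal streamlines approvals and clarifies reporting lines.", "label": "gpt-3.5-turbo"},
        {"text": "Funding decisions are expected next quarter.", "label": "human"},
    ],
}

def to_binary_labels(doc):
    """Collapse per-sentence provenance labels into the mixed-model binary task:
    1 = AI-generated or AI-polished sentence, 0 = human-written sentence."""
    return [0 if s["label"] == "human" else 1 for s in doc["sentences"]]

print(to_binary_labels(example_document))  # -> [0, 1, 0]
```

Keeping the model name as the per-sentence label also supports the multiclass task, where the detector must name which model produced each sentence rather than only flag AI involvement.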

Methodology

SeqXGPT is presented as a dedicated solution for sentence-level detection, leveraging word-wise log probability lists extracted from white-box LLMs as foundational features. These features are processed with convolutional and self-attention networks, much as wave-like temporal features are handled in speech processing. By framing text provenance analysis as sequence labeling, SeqXGPT predicts the origin of each sentence more accurately than existing approaches.
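
As a rough illustration of this pipeline, the sketch below extracts per-token log probabilities from a small white-box model and feeds them through a convolution plus self-attention stack that emits a label per position. It is a minimal sketch, not the authors' released implementation: GPT-2 stands in for the white-box LLMs actually used, token-level (rather than word-aligned) features are used for brevity, and the dimensions and two-label scheme are assumptions.

```python
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModelForCausalLM

# --- Feature extraction: per-token log probabilities from a white-box LM ---
# GPT-2 is only a stand-in for the white-box models used in the paper.
tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()

@torch.no_grad()
def token_logprobs(text: str) -> torch.Tensor:
    """Return the log probability the LM assigns to each token given its prefix."""
    ids = tok(text, return_tensors="pt").input_ids        # (1, T)
    logits = lm(ids).logits                                # (1, T, vocab)
    logps = torch.log_softmax(logits[:, :-1], dim=-1)      # predict token t+1 from prefix
    return logps.gather(-1, ids[:, 1:, None]).squeeze(-1).squeeze(0)  # (T-1,)

# --- Sequence labeler: convolution + self-attention over the log-prob "wave" ---
class LogProbTagger(nn.Module):
    def __init__(self, n_labels: int, hidden: int = 64):
        super().__init__()
        self.conv = nn.Conv1d(1, hidden, kernel_size=5, padding=2)
        self.attn = nn.TransformerEncoderLayer(d_model=hidden, nhead=4, batch_first=True)
        self.head = nn.Linear(hidden, n_labels)            # per-position label logits

    def forward(self, logprobs: torch.Tensor) -> torch.Tensor:
        x = logprobs[None, None, :]                        # (1, 1, T)
        x = torch.relu(self.conv(x)).transpose(1, 2)       # (1, T, hidden)
        x = self.attn(x)                                   # self-attention over the sequence
        return self.head(x)                                # (1, T, n_labels)

feats = token_logprobs("Sentence-level detection labels every sentence in a document.")
tagger = LogProbTagger(n_labels=2)
print(tagger(feats).shape)                                 # (1, number_of_tokens, 2)
```

In the paper, features from several white-box models are aligned at the word level and combined, and labels follow a sequence-tagging scheme over sentences; the single-model, two-label setup here is only meant to show the data flow.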

Experimental Findings

Extensive experiments demonstrate SeqXGPT's efficacy over previous methods. The approach surpasses baselines, including log-probability-based methods and perturbation-based methods such as DetectGPT, particularly on the sentence-level detection challenge. SeqXGPT also performs well in document-level detection and maintains strong generalization on out-of-distribution datasets. Ablation studies underscore that combining convolutional layers with self-attention is important for detection performance.
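
For contrast with the baselines mentioned above, a log-probability baseline at sentence granularity can be as simple as thresholding each sentence's mean per-token log probability under a scoring model. The threshold value and toy numbers below are illustrative, not the paper's baseline configuration.

```python
from statistics import mean

def classify_sentences(sentence_logprobs, threshold=-3.0):
    """Label each sentence as AI-generated (1) or human-written (0) by
    thresholding its mean per-token log probability: model-generated text
    tends to sit in higher-probability regions of the scoring LM.
    The threshold is illustrative and would be tuned on held-out data."""
    return [1 if mean(lps) > threshold else 0 for lps in sentence_logprobs]

# Toy per-token log probabilities for three sentences of one document.
demo = [
    [-4.1, -3.8, -5.0, -4.4],  # lower probability: looks human-written
    [-1.2, -0.9, -1.5, -1.1],  # higher probability: looks model-generated
    [-3.9, -4.6, -3.2, -4.8],
]
print(classify_sentences(demo))  # -> [0, 1, 0]
```

Such per-sentence summary statistics are exactly what SeqXGPT replaces with a learned model over the full log-probability sequence, which helps explain its advantage on short inputs.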

Implications and Future Directions

The paper's contribution is a significant step toward more precise AI-generated text detection, with the potential to enhance security and trust in LLM applications. Because the method relies heavily on structural features derived from log probabilities, further research could explore integrating semantic features to improve detection when AI-generated sentences closely mimic human writing. Moreover, the effect of different instructional settings on AI-generated text remains to be investigated; studying it could improve the adaptability and robustness of SeqXGPT in real-world settings where AI-generated and human-authored content coexist.

Conclusion

"SeqXGPT: Sentence-Level AI-Generated Text Detection" addresses an emerging need for refined detection capabilities amidst increasingly sophisticated AI text generators. By achieving finer granularity in text detection, it sets a foundation for future exploration into the amalgamation of semantic and structural features in identifying AI contributions to human-composed documents, ensuring that detection systems remain ahead in an evolving landscape of AI-assisted content creation.
