SeqXGPT: Sentence-Level AI-Generated Text Detection (2310.08903v2)
Abstract: Widely deployed LLMs can generate human-like content, raising concerns about their abuse. It is therefore important to build strong AI-generated text (AIGT) detectors. Existing work considers only document-level AIGT detection, so in this paper we first introduce a sentence-level detection challenge by synthesizing a dataset of documents polished with LLMs, i.e., documents that contain both sentences written by humans and sentences modified by LLMs. We then propose **Seq**uence **X** (Check) **GPT**, a novel method that uses lists of token log probabilities from white-box LLMs as features for sentence-level AIGT detection. These features are composed like *waves* in speech processing and cannot be studied directly by LLMs themselves, so we build SeqXGPT on convolution and self-attention networks. We evaluate it on both sentence- and document-level detection challenges. Experimental results show that previous methods struggle with sentence-level AIGT detection, while our method not only significantly surpasses the baselines on both sentence- and document-level detection but also exhibits strong generalization capabilities.
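Below is a minimal, hypothetical sketch of the idea described in the abstract, not the authors' released implementation: it extracts a per-token log-probability "wave" from one white-box LLM (GPT-2 is assumed here as the scoring model) and feeds it to a small convolution + self-attention network that emits a human/AI label for every token. The layer sizes, the use of a standard Transformer encoder, and the `token_log_probs`/`WaveDetector` names are illustrative assumptions; SeqXGPT itself draws waves from several white-box LLMs.

```python
# Sketch: per-token log-prob "wave" from a white-box LM -> conv + self-attention token classifier.
import torch
import torch.nn as nn
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

@torch.no_grad()
def token_log_probs(text: str) -> torch.Tensor:
    """Log probability the LM assigns to each token of `text` (the 'wave' feature)."""
    ids = tokenizer(text, return_tensors="pt").input_ids            # (1, T)
    log_probs = lm(ids).logits.log_softmax(dim=-1)                  # (1, T, V)
    # The probability of token t is read from the model's prediction at position t-1.
    return log_probs[0, :-1].gather(1, ids[0, 1:, None]).squeeze(-1)  # (T-1,)

class WaveDetector(nn.Module):
    """Convolutional front-end + Transformer encoder over the wave; one label per token."""
    def __init__(self, n_llms: int = 1, d_model: int = 64, n_classes: int = 2):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_llms, d_model, kernel_size=5, padding=2), nn.GELU(),
            nn.Conv1d(d_model, d_model, kernel_size=3, padding=1), nn.GELU(),
        )
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, waves: torch.Tensor) -> torch.Tensor:
        # waves: (batch, n_llms, seq_len) -> per-token class logits (batch, seq_len, n_classes)
        h = self.conv(waves).transpose(1, 2)
        return self.head(self.encoder(h))

wave = token_log_probs("The quick brown fox jumps over the lazy dog.")
logits = WaveDetector()(wave[None, None, :])  # (1, seq_len, 2) human/AI logits per token
print(logits.shape)
```

Per-token predictions would then have to be pooled into sentence-level decisions, for instance by majority vote over each sentence's tokens, which is how a sentence-level label could be read off this kind of token classifier.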