
Multiscale Positive-Unlabeled Detection of AI-Generated Texts (2305.18149v4)

Published 29 May 2023 in cs.CL and cs.AI

Abstract: Recent releases of LLMs, e.g. ChatGPT, are astonishing at generating human-like texts, but they may impact the authenticity of texts. Previous works proposed methods to detect these AI-generated texts, including simple ML classifiers, pretrained-model-based zero-shot methods, and finetuned language classification models. However, mainstream detectors always fail on short texts, like SMSes, Tweets, and reviews. In this paper, a Multiscale Positive-Unlabeled (MPU) training framework is proposed to address the difficulty of short-text detection without sacrificing long-texts. Firstly, we acknowledge the human-resemblance property of short machine texts, and rephrase AI text detection as a partial Positive-Unlabeled (PU) problem by regarding these short machine texts as partially "unlabeled". Then in this PU context, we propose the length-sensitive Multiscale PU Loss, where a recurrent model in abstraction is used to estimate positive priors of scale-variant corpora. Additionally, we introduce a Text Multiscaling module to enrich training corpora. Experiments show that our MPU method augments detection performance on long AI-generated texts, and significantly improves short-text detection of LLM detectors. LLMs trained with MPU could outcompete existing detectors on various short-text and long-text detection benchmarks. The codes are available at https://github.com/mindspore-lab/mindone/tree/master/examples/detect_chatgpt and https://github.com/YuchuanTian/AIGC_text_detector.

AI-Generated Text Detection Using Multiscale Positive-Unlabeled Learning

The paper "Multiscale Positive-Unlabeled Detection of AI-Generated Texts" offers a novel approach to address the considerable challenges faced in detecting AI-generated texts, particularly short texts. It presents the Multiscale Positive-Unlabeled (MPU) training framework, an innovative methodology designed to enhance the detection performance on short texts while also maintaining efficacy for longer ones. This approach is essential due to the increasing sophistication of LLMs such as GPT-4, which can generate human-like text that complicates the task of distinguishing it from human-authored content.

Problem Context

AI-generated texts can be misleading, especially when used in unethical or illegal contexts. While existing methods, like simple classifiers or finetuned models, perform reasonably well for longer texts, they frequently fail on shorter texts such as tweets or SMS messages. These short texts are ubiquitous in today's digital communication landscape, prompting the need for improved detection methods.

Approach and Methodology

This research distinguishes itself by reframing AI text detection as a Positive-Unlabeled (PU) problem where short AI-generated texts are treated as "unlabeled" due to their high resemblance to human texts. The proposed MPU training framework leverages a Multiscale PU loss that adjusts based on the length of the text, allowing it to address discrepancies in text detection across varying lengths. Specifically:

  • Multiscale PU Loss: This is a length-sensitive loss function that estimates positive priors differently for texts of varying lengths, using a recurrent model in abstraction. The recurrent model is designed to capture human-likeness in texts progressively, based on token-level signals.
  • Text Multiscaling Module: This module augments the dataset by generating multiple length variations of training texts through random sentence deletion. This step is key to ensuring that the model is exposed to texts of all lengths during training.
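As an illustration, the sentence-deletion augmentation could be sketched as follows. This is a minimal sketch, not the paper's exact implementation: the function name, the 0.5 keep probability, and the naive period-based sentence split are all assumptions made here for brevity.

```python
import random

def text_multiscale(text, num_variants=3, seed=None):
    """Generate shorter variants of a training text by randomly deleting
    sentences, so the detector sees examples at many lengths.
    Illustrative sketch only; the paper's module may differ in detail."""
    rng = random.Random(seed)
    # Naive sentence split on periods; a real pipeline would use a proper
    # sentence tokenizer.
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    variants = []
    for _ in range(num_variants):
        # Keep each sentence independently with probability 0.5 (assumed),
        # but always retain at least one sentence.
        kept = [s for s in sentences if rng.random() > 0.5]
        if not kept:
            kept = [rng.choice(sentences)]
        variants.append(". ".join(kept) + ".")
    return variants
```

Each call yields several shorter variants of the same sample, so a single long training text contributes examples at multiple scales.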

These components combine to significantly improve the detection of short AI-generated texts without compromising the performance on longer ones.
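Under stated assumptions, the length-sensitive risk could be sketched as below. The skeleton is the non-negative PU risk estimator of Kiryo et al. (2017, reference 20); the linear ramp for the positive prior `pi` is purely illustrative and stands in for the paper's recurrent-model prior estimate, which this sketch does not reproduce.

```python
import math

def sigmoid_loss(z):
    # Numerically stable -log(sigmoid(z)), i.e. softplus(-z):
    # the per-sample loss for predicting "positive" with logit z.
    return math.log1p(math.exp(-z)) if z > 0 else -z + math.log1p(math.exp(z))

def multiscale_pu_loss(logits, labels, lengths, max_len=512):
    """Sketch of a length-sensitive non-negative PU risk.
    labels: 1 = labeled positive (human text), 0 = unlabeled.
    The prior pi(length) below is an assumed linear ramp, not the
    paper's recurrent-model estimate."""
    # Assumed prior: longer texts are more confidently "positive".
    pi = [0.2 + 0.6 * min(l / max_len, 1.0) for l in lengths]
    pos = [i for i, y in enumerate(labels) if y == 1]
    unl = [i for i, y in enumerate(labels) if y == 0]
    # Risk on labeled positives, weighted by each sample's prior.
    risk_p = sum(pi[i] * sigmoid_loss(logits[i]) for i in pos) / len(pos)
    # Negative-class risk on unlabeled data, with the positives'
    # contribution subtracted and clamped at zero (the nnPU correction).
    risk_n = (sum(sigmoid_loss(-logits[i]) for i in unl) / len(unl)
              - sum(pi[i] * sigmoid_loss(-logits[i]) for i in pos) / len(pos))
    return risk_p + max(risk_n, 0.0)
```

The clamp keeps the estimated negative risk from going below zero when the unlabeled set happens to contain many positives, which is exactly the regime short machine texts create.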

Results

The effectiveness of the MPU method was validated through experiments on datasets such as TweepFake and HC3, spanning English and Chinese. The MPU method outperformed leading baselines in detecting AI-generated texts and compared favorably with newer zero-shot approaches such as DetectGPT. Notably, on short-text benchmarks such as HC3-English-Sentence, MPU substantially improved the F1 score, demonstrating stronger detection of short texts.

Implications and Future Directions

This research holds meaningful implications for the future of AI text detection and the broader field of AI ethics. By enhancing the accuracy of detectors for shorter texts, the MPU framework provides an advanced tool for combating misinformation and protecting against social engineering attacks using AI-generated content. The work also suggests avenues for further exploration, such as refining the instantiation of length-sensitive priors or extending the framework to other languages, modalities, or semi-structured data.

The introduction of a framework that caters to the nuanced task of multiscale text detection also raises questions about the potential application of comparable PU learning strategies in other domains within AI, where data labeling challenges parallel those in text detection. Future research may explore unsupervised and semi-supervised learning paradigms, further refining detection capabilities in rapidly evolving contexts.

In conclusion, the paper delineates a practical and theoretically informed methodology that pushes forward the capabilities of AI-generated text detection, aligning well with the demands of modern digital media environments. As LLMs become more sophisticated, frameworks like MPU will be crucial in maintaining the integrity and authenticity of digital communication.

References (41)
  1. Generating sentiment-preserving fake online reviews using neural language models and their human- and machine-based detection. In Leonard Barolli, Flora Amato, Francesco Moscato, Tomoya Enokido, and Makoto Takizawa, editors, Advanced Information Networking and Applications - Proceedings of the 34th International Conference on Advanced Information Networking and Applications, AINA-2020, Caserta, Italy, 15-17 April, volume 1151 of Advances in Intelligent Systems and Computing, pages 1341–1354. Springer, 2020. doi: 10.1007/978-3-030-44041-1_114. URL https://doi.org/10.1007/978-3-030-44041-1_114.
  2. Learning from positive and unlabeled data: A survey. Machine Learning, 109:719–760, 2020.
  3. Positive-unlabeled convolutional neural networks for particle picking in cryo-electron micrographs. Nature methods, 16(11):1153–1160, 2019.
  4. Language models are few-shot learners. CoRR, abs/2005.14165, 2020. URL https://arxiv.org/abs/2005.14165.
  5. Self-pu: Self boosted and calibrated positive-unlabeled training. In International Conference on Machine Learning, pages 1510–1519. PMLR, 2020.
  6. Adversarial robustness of neural-statistical features in detection of generative transformers. In International Joint Conference on Neural Networks, IJCNN 2022, Padua, Italy, July 18-23, 2022, pages 1–8. IEEE, 2022. doi: 10.1109/IJCNN55064.2022.9892269. URL https://doi.org/10.1109/IJCNN55064.2022.9892269.
  7. Revisiting pre-trained models for Chinese natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings, pages 657–668, Online, November 2020. Association for Computational Linguistics. URL https://www.aclweb.org/anthology/2020.findings-emnlp.58.
  8. BERT: pre-training of deep bidirectional transformers for language understanding. CoRR, abs/1810.04805, 2018. URL http://arxiv.org/abs/1810.04805.
  9. Analysis of learning from positive and unlabeled data. Advances in neural information processing systems, 27, 2014.
  10. Tweepfake: about detecting deepfake tweets. CoRR, abs/2008.00036, 2020. URL https://arxiv.org/abs/2008.00036.
  11. FudanNLPLab. Sniffer. Website, 2023. sniffer.fastnlp.top.
  12. GLTR: statistical detection and visualization of generated text. In Marta R. Costa-jussà and Enrique Alfonseca, editors, Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019, Florence, Italy, July 28 - August 2, 2019, Volume 3: System Demonstrations, pages 111–116. Association for Computational Linguistics, 2019. doi: 10.18653/v1/p19-3019. URL https://doi.org/10.18653/v1/p19-3019.
  13. How close is chatgpt to human experts? comparison corpus, evaluation, and detection. CoRR, abs/2301.07597, 2023. doi: 10.48550/arXiv.2301.07597. URL https://doi.org/10.48550/arXiv.2301.07597.
  14. Learning from positive and unlabeled data with arbitrary positive shift. Advances in Neural Information Processing Systems, 33:13088–13099, 2020.
  15. Instance-dependent pu learning by bayesian optimal relabeling. arXiv preprint arXiv:1808.02180, 2018.
  16. Pu learning for matrix completion. In International conference on machine learning, pages 2445–2453. PMLR, 2015.
  17. Positive and unlabeled learning in categorical data. Neurocomputing, 196:113–124, 2016.
  18. Opinion mining using ensemble text hidden markov models for text classification. Expert Syst. Appl., 94:218–227, 2018. doi: 10.1016/j.eswa.2017.07.019. URL https://doi.org/10.1016/j.eswa.2017.07.019.
  19. Learning from positive and unlabeled data with a selection bias. In International Conference on Learning Representations, 2019. URL https://openreview.net/forum?id=rJzLciCqKm.
  20. Positive-unlabeled learning with non-negative risk estimator. Advances in neural information processing systems, 30, 2017.
  21. Stylometric detection of ai-generated text in twitter timelines. CoRR, abs/2303.03697, 2023. doi: 10.48550/arXiv.2303.03697. URL https://doi.org/10.48550/arXiv.2303.03697.
  22. Building text classifiers using positive and unlabeled examples. In Third IEEE international conference on data mining, pages 179–186. IEEE, 2003.
  23. Coco: Coherence-enhanced machine-generated text detection under data limitation with contrastive learning. CoRR, abs/2212.10341, 2022. doi: 10.48550/arXiv.2212.10341. URL https://doi.org/10.48550/arXiv.2212.10341.
  24. Roberta: A robustly optimized BERT pretraining approach. CoRR, abs/1907.11692, 2019. URL http://arxiv.org/abs/1907.11692.
  25. Recurrent neural network based language model. In Interspeech, volume 2, pages 1045–1048. Makuhari, 2010.
  26. Detectgpt: Zero-shot machine-generated text detection using probability curvature. CoRR, abs/2301.11305, 2023. doi: 10.48550/arXiv.2301.11305. URL https://doi.org/10.48550/arXiv.2301.11305.
  27. Chatgpt or human? detect and explain. explaining decisions of machine learning model for detecting short chatgpt-generated text. CoRR, abs/2301.13852, 2023. doi: 10.48550/arXiv.2301.13852. URL https://doi.org/10.48550/arXiv.2301.13852.
  28. OpenAI. Introducing chatgpt. Website, 2022. https://openai.com/blog/chatgpt.
  29. OpenAI. Gpt-4 technical report, 2023a.
  30. OpenAI. Ai text classifier - openai api. Website, January 2023b. https://platform.openai.com/ai-text-classifier.
  31. Distantly supervised named entity recognition using positive-unlabeled learning. arXiv preprint arXiv:1906.01378, 2019.
  32. Language models are unsupervised multitask learners. 2019.
  33. Laplacian unit-hyperplane learning from positive and unlabeled examples. Information Sciences, 314:152–168, 2015.
  34. Release strategies and the social impacts of language models. CoRR, abs/1908.09203, 2019. URL http://arxiv.org/abs/1908.09203.
  35. Positive-unlabeled learning from imbalanced data. In IJCAI, pages 2995–3001, 2021.
  36. LSTM neural networks for language modeling. In INTERSPEECH 2012, 13th Annual Conference of the International Speech Communication Association, Portland, Oregon, USA, September 9-13, 2012, pages 194–197. ISCA, 2012. URL http://www.isca-speech.org/archive/interspeech_2012/i12_0194.html.
  37. Positive-unlabeled learning with adversarial data augmentation for knowledge graph completion. arXiv preprint arXiv:2205.00904, 2022.
  38. Edward Tian. Gptzero. Website, 2022. https://gptzero.me/faq.
  39. EDA: easy data augmentation techniques for boosting performance on text classification tasks. CoRR, abs/1901.11196, 2019. URL http://arxiv.org/abs/1901.11196.
  40. Positive-unlabeled compression on the cloud. In Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d’Alché-Buc, Emily B. Fox, and Roman Garnett, editors, Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC, Canada, pages 2561–2570, 2019. URL https://proceedings.neurips.cc/paper/2019/hash/ac796a52db3f16bbdb6557d3d89d1c5a-Abstract.html.
  41. Defending against neural fake news. In Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d’Alché-Buc, Emily B. Fox, and Roman Garnett, editors, Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC, Canada, pages 9051–9062, 2019. URL https://proceedings.neurips.cc/paper/2019/hash/3e9f0fc9b2f89e043bc6233994dfcf76-Abstract.html.
Authors (8)
  1. Yuchuan Tian
  2. Hanting Chen
  3. Xutao Wang
  4. Zheyuan Bai
  5. Qinghua Zhang
  6. Ruifeng Li
  7. Chao Xu
  8. Yunhe Wang
Citations (34)