M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection (2305.14902v2)

Published 24 May 2023 in cs.CL

Abstract: LLMs have demonstrated remarkable capability to generate fluent responses to a wide variety of user queries. However, this has also raised concerns about the potential misuse of such texts in journalism, education, and academia. In this study, we strive to create automated systems that can detect machine-generated texts and pinpoint potential misuse. We first introduce a large-scale benchmark M4, which is a multi-generator, multi-domain, and multi-lingual corpus for machine-generated text detection. Through an extensive empirical study of this dataset, we show that it is challenging for detectors to generalize well on instances from unseen domains or LLMs. In such cases, detectors tend to misclassify machine-generated text as human-written. These results show that the problem is far from solved and that there is a lot of room for improvement. We believe that our dataset will enable future research towards more robust approaches to this pressing societal problem. The dataset is available at https://github.com/mbzuai-nlp/M4.

Authors (15)
  1. Yuxia Wang (41 papers)
  2. Jonibek Mansurov (14 papers)
  3. Petar Ivanov (4 papers)
  4. Jinyan Su (20 papers)
  5. Artem Shelmanov (29 papers)
  6. Akim Tsvigun (12 papers)
  7. Chenxi Whitehouse (17 papers)
  8. Osama Mohammed Afzal (9 papers)
  9. Tarek Mahmoud (7 papers)
  10. Alham Fikri Aji (94 papers)
  11. Preslav Nakov (253 papers)
  12. Toru Sasaki (18 papers)
  13. Thomas Arnold (13 papers)
  14. Nizar Habash (66 papers)
  15. Iryna Gurevych (264 papers)
Citations (95)