Are Neural Language Models Good Plagiarists? A Benchmark for Neural Paraphrase Detection (2103.12450v5)

Published 23 Mar 2021 in cs.CL, cs.AI, and cs.DL

Abstract: The rise of LLMs such as BERT allows for high-quality text paraphrasing. This poses a problem for academic integrity, as it is difficult to differentiate between original and machine-generated content. We propose a benchmark consisting of articles paraphrased with recent LLMs that rely on the Transformer architecture. Our contribution fosters future research on paraphrase detection systems by offering a large collection of aligned original and paraphrased documents, a study of its structure, and classification experiments with state-of-the-art systems. We make our findings publicly available.

Authors (4)
  1. Jan Philip Wahle (31 papers)
  2. Terry Ruas (46 papers)
  3. Norman Meuschke (21 papers)
  4. Bela Gipp (98 papers)
Citations (31)