Are Neural Language Models Good Plagiarists? A Benchmark for Neural Paraphrase Detection (2103.12450v5)
Published 23 Mar 2021 in cs.CL, cs.AI, and cs.DL
Abstract: The rise of LLMs such as BERT allows for high-quality text paraphrasing. This poses a problem for academic integrity, as it is difficult to differentiate between original and machine-generated content. We propose a benchmark consisting of articles paraphrased by recent LLMs based on the Transformer architecture. Our contribution fosters future research on paraphrase detection systems: it offers a large collection of aligned original and paraphrased documents, a study of their structure, and classification experiments with state-of-the-art systems. We make our findings publicly available.
- Jan Philip Wahle (31 papers)
- Terry Ruas (46 papers)
- Norman Meuschke (21 papers)
- Bela Gipp (98 papers)