Sharif-MGTD at SemEval-2024 Task 8: A Transformer-Based Approach to Detect Machine Generated Text (2407.11774v1)

Published 16 Jul 2024 in cs.CL and cs.AI

Abstract: Detecting Machine-Generated Text (MGT) has emerged as a significant area of study within Natural Language Processing. While LLMs generate text, they often leave discernible traces, which can be scrutinized using either traditional feature-based methods or more advanced neural LLMs. In this research, we explore the effectiveness of fine-tuning a RoBERTa-base transformer, a powerful neural architecture, to address MGT detection as a binary classification task. Focusing specifically on Subtask A (Monolingual-English) within the SemEval-2024 competition framework, our proposed system achieves an accuracy of 78.9% on the test dataset, positioning us at 57th among participants. Our study addresses this challenge while considering the limited hardware resources, resulting in a system that excels at identifying human-written texts but encounters challenges in accurately discerning MGTs.
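The abstract reports a single overall accuracy alongside the observation that the system handles human-written text better than machine-generated text. A minimal sketch of how per-class recall can expose that kind of asymmetry is given below. The label convention (0 for human, 1 for machine), the function name `per_class_recall`, and the toy labels are all assumptions made for illustration; they are not the paper's data or results.

```python
from collections import Counter

# Assumed label convention for this sketch (not taken from the paper).
HUMAN, MACHINE = 0, 1


def per_class_recall(y_true, y_pred):
    """Return (accuracy, {label: recall}) for two equal-length label lists."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("label lists must not be empty")

    totals = Counter(y_true)  # number of examples per true class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct per class

    accuracy = sum(hits.values()) / len(y_true)
    recall = {label: hits[label] / totals[label] for label in totals}
    return accuracy, recall


# Toy data only: every human text is recognised, two machine texts are missed.
y_true = [HUMAN] * 5 + [MACHINE] * 5
y_pred = [HUMAN] * 5 + [MACHINE, MACHINE, MACHINE, HUMAN, HUMAN]

accuracy, recall = per_class_recall(y_true, y_pred)
print(f"accuracy={accuracy:.2f}",
      f"human recall={recall[HUMAN]:.2f}",
      f"machine recall={recall[MACHINE]:.2f}")
```

On balanced toy data like this, a respectable overall accuracy can coexist with a noticeably lower recall on the machine-generated class, which is the pattern the abstract describes in words.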

Authors (7)
  1. Seyedeh Fatemeh Ebrahimi (3 papers)
  2. Karim Akhavan Azari (2 papers)
  3. Amirmasoud Iravani (2 papers)
  4. Arian Qazvini (1 paper)
  5. Pouya Sadeghi (6 papers)
  6. Zeinab Sadat Taghavi (8 papers)
  7. Hossein Sameti (19 papers)
Citations (1)