
SPOT: Text Source Prediction from Originality Score Thresholding (2405.20505v1)

Published 30 May 2024 in cs.CL and cs.LG

Abstract: The wide acceptance of LLMs has unlocked new applications and social risks. Popular countermeasures aim at detecting misinformation and usually involve domain-specific models trained to recognize the relevance of any information. Instead of evaluating the validity of the information, we propose to investigate LLM-generated text from the perspective of trust. In this study, we define trust as the ability to know whether an input text was generated by an LLM or a human. To do so, we design SPOT, an efficient method that classifies the source of any standalone text input based on an originality score. This score is derived from the prediction of a given LLM to detect other LLMs. We empirically demonstrate the robustness of the method to the architecture, training data, evaluation data, task, and compression of modern LLMs.
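
The thresholding step named in the title can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the originality score is computed, so the score is treated here as a precomputed float, and the convention that a higher score means "more likely human" is an assumption. The helper names `classify_source` and `pick_threshold` are hypothetical.

```python
from typing import Optional, Sequence


def classify_source(originality_score: float, threshold: float) -> str:
    """Label a text from its (precomputed) originality score.

    Assumption: scores at or above the threshold are treated as human-written.
    """
    return "human" if originality_score >= threshold else "llm"


def pick_threshold(scores: Sequence[float], labels: Sequence[str]) -> Optional[float]:
    """Choose the threshold that maximizes accuracy on labeled calibration data.

    Candidate thresholds are the observed scores; ties keep the smallest one.
    Returns None when no calibration data is given.
    """
    best_threshold, best_accuracy = None, -1.0
    for candidate in sorted(set(scores)):
        correct = sum(
            classify_source(score, candidate) == label
            for score, label in zip(scores, labels)
        )
        accuracy = correct / len(scores)
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = candidate, accuracy
    return best_threshold


# Toy calibration data (made up for illustration, not taken from the paper).
calibration_scores = [0.1, 0.2, 0.8, 0.9]
calibration_labels = ["llm", "llm", "human", "human"]
threshold = pick_threshold(calibration_scores, calibration_labels)
prediction = classify_source(0.85, threshold)
```

In practice the threshold would be calibrated on held-out texts of known origin; the accuracy criterion used here is only one possible choice.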

Authors (2)
  1. Edouard Yvinec (19 papers)
  2. Gabriel Kasser (1 paper)