Differentiate ChatGPT-generated and Human-written Medical Texts (2304.11567v1)

Published 23 Apr 2023 in cs.CL and cs.AI

Abstract:

Background: LLMs such as ChatGPT are capable of generating grammatically perfect and human-like text content, and a large number of ChatGPT-generated texts have appeared on the Internet. However, medical texts such as clinical notes and diagnoses require rigorous validation, and erroneous medical content generated by ChatGPT could potentially lead to disinformation that poses significant harm to healthcare and the general public.

Objective: This research is among the first studies on responsible and ethical AIGC (Artificial Intelligence Generated Content) in medicine. We focus on analyzing the differences between medical texts written by human experts and generated by ChatGPT, and designing machine learning workflows to effectively detect and differentiate medical texts generated by ChatGPT.

Methods: We first construct a suite of datasets containing medical texts written by human experts and generated by ChatGPT. In the next step, we analyze the linguistic features of these two types of content and uncover differences in vocabulary, part-of-speech, dependency, sentiment, perplexity, etc. Finally, we design and implement machine learning methods to detect medical text generated by ChatGPT.

Results: Medical texts written by humans are more concrete, more diverse, and typically contain more useful information, while medical texts generated by ChatGPT pay more attention to fluency and logic, and usually express general terminologies rather than effective information specific to the context of the problem. A BERT-based model can effectively detect medical texts generated by ChatGPT, and the F1 exceeds 95%.
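
The abstract reports detector quality as an F1 score. As a rough illustration of what that metric measures, the sketch below computes F1 for a binary detector in plain Python. The label encoding (1 = ChatGPT-generated, 0 = human-written), the function name `f1_score`, and the toy labels are all assumptions made for this example; they are not the paper's code or data.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Return F1 for the positive class.

    Assumed encoding for this sketch: 1 = ChatGPT-generated, 0 = human-written.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    # True positives: generated texts correctly flagged as generated.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: human texts wrongly flagged as generated.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: generated texts the detector missed.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    if tp == 0:
        # Precision or recall is zero (or undefined), so F1 is taken as 0.
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy labels invented for illustration only; not results from the paper.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

Because F1 is the harmonic mean of precision and recall, a detector only scores highly when it both catches most generated texts and rarely mislabels human-written ones.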

Authors (11)
  1. Wenxiong Liao (9 papers)
  2. Zhengliang Liu (91 papers)
  3. Haixing Dai (39 papers)
  4. Shaochen Xu (16 papers)
  5. Zihao Wu (100 papers)
  6. Yiyang Zhang (23 papers)
  7. Xiaoke Huang (16 papers)
  8. Dajiang Zhu (68 papers)
  9. Hongmin Cai (18 papers)
  10. Tianming Liu (161 papers)
  11. Xiang Li (1002 papers)
Citations (48)