
Investigating Large Language Models and Control Mechanisms to Improve Text Readability of Biomedical Abstracts (2309.13202v2)

Published 22 Sep 2023 in cs.CL and cs.AI

Abstract: Biomedical literature often uses complex language and inaccessible professional terminology, which is why simplification plays an important role in improving public health literacy. Applying NLP models to automate such tasks allows for quick and direct accessibility for lay readers. In this work, we investigate the ability of state-of-the-art LLMs on the task of biomedical abstract simplification, using the publicly available dataset for plain language adaptation of biomedical abstracts (PLABA). The methods applied include domain fine-tuning and prompt-based learning (PBL) on: 1) encoder-decoder models (T5, SciFive, and BART), 2) decoder-only GPT models (GPT-3.5 and GPT-4) from OpenAI and BioGPT, and 3) control-token mechanisms on BART-based models. We used a range of automatic evaluation metrics, including BLEU, ROUGE, SARI, and BERTscore, and also conducted human evaluations. BART-Large with Control Token (BART-L-w-CT) mechanisms reported the highest SARI score of 46.54, and T5-base reported the highest BERTscore of 72.62. In human evaluation, BART-L-w-CTs achieved a better simplicity score than T5-Base (2.9 vs. 2.2), while T5-Base achieved a better meaning preservation score than BART-L-w-CTs (3.1 vs. 2.6). We also categorised the system outputs with examples, hoping this will shed some light for future research on this task. Our code, fine-tuned models, and data splits are available at https://github.com/HECTA-UoM/PLABA-MU

Keywords: LLMs, Text Simplification, Biomedical NLP, Control Mechanisms, Health Informatics

Authors (6)
  1. Zihao Li (161 papers)
  2. Samuel Belkadi (9 papers)
  3. Nicolo Micheletti (6 papers)
  4. Lifeng Han (37 papers)
  5. Matthew Shardlow (20 papers)
  6. Goran Nenadic (49 papers)
Citations (2)
