MEDVOC: Vocabulary Adaptation for Fine-tuning Pre-trained Language Models on Medical Text Summarization (2405.04163v2)
Abstract: This work presents a dynamic vocabulary adaptation strategy, MEDVOC, for fine-tuning pre-trained language models (PLMs) like BertSumAbs, BART, and PEGASUS for improved medical text summarization. In contrast to existing domain adaptation approaches in summarization, MEDVOC treats vocabulary as an optimizable parameter and optimizes the PLM vocabulary based on fragment score conditioned only on the downstream task's reference summaries. Unlike previous works on vocabulary adaptation (limited to classification tasks), optimizing the vocabulary for summarization tasks requires an extremely costly intermediate fine-tuning step on large summarization datasets. To that end, our novel fragment score-based hyperparameter search reduces this fine-tuning time drastically -- from 450 days to less than 2 days on average. Furthermore, while previous works on vocabulary adaptation are typically tied to a single PLM, MEDVOC is designed to be deployable across multiple PLMs (with varying model vocabulary sizes, pre-training objectives, and model sizes), bridging the limited vocabulary overlap between the biomedical literature domain and PLMs. MEDVOC outperforms baselines by 15.74% in terms of Rouge-L in the zero-shot setting and shows gains of 17.29% under high out-of-vocabulary (OOV) concentrations. Our human evaluation shows that MEDVOC generates more faithful medical summaries (88% compared to 59% for baselines). We make the codebase publicly available at https://github.com/gb-kgp/MEDVOC.
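The fragment score-driven vocabulary search described in the abstract can be illustrated with a short sketch. The snippet below is not the authors' implementation (see the linked repository for that). It assumes, purely for illustration, that the fragment score is approximated as the average number of subword pieces per word in the target-domain reference summaries, that candidate vocabularies are built by adding the most frequent heavily fragmented domain words, and that `facebook/bart-base` stands in for one of the PLMs; the function names and candidate sizes are hypothetical.

```python
# Minimal sketch: pick a vocabulary-extension size by minimizing a
# fragment-score-style metric on reference summaries, instead of running
# an intermediate fine-tuning step per candidate. Assumptions are noted
# in the lead-in; this is not the paper's exact procedure.
from collections import Counter
from transformers import AutoTokenizer

def fragment_score(tokenizer, summaries):
    """Average number of subword pieces per whitespace word over reference summaries."""
    pieces, words = 0, 0
    for text in summaries:
        for word in text.split():
            pieces += len(tokenizer.tokenize(word))
            words += 1
    return pieces / max(words, 1)

def search_vocab_size(base_model, summaries, candidate_sizes=(1000, 5000, 10000)):
    """Choose how many domain tokens to add by minimizing the fragment score."""
    base = AutoTokenizer.from_pretrained(base_model)
    # Rank candidate tokens: whole words that the base tokenizer fragments.
    freq = Counter(w for text in summaries for w in text.split()
                   if len(base.tokenize(w)) > 1)
    ranked = [w for w, _ in freq.most_common()]

    best_size, best_score = 0, fragment_score(base, summaries)
    for size in candidate_sizes:
        tok = AutoTokenizer.from_pretrained(base_model)
        tok.add_tokens(ranked[:size])  # extend the vocabulary with domain words
        score = fragment_score(tok, summaries)
        if score < best_score:
            best_size, best_score = size, score
    return best_size, best_score

if __name__ == "__main__":
    refs = ["Metformin reduces hepatic gluconeogenesis in type 2 diabetes mellitus."]
    print(search_vocab_size("facebook/bart-base", refs))
```

In an actual fine-tuning pipeline, the model's embedding matrix would also need to be resized to the extended vocabulary (e.g., `model.resize_token_embeddings(len(tok))` in Hugging Face Transformers) before training on the downstream summarization data.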
- Improving the factual accuracy of abstractive clinical text summarization using multi-objective optimization. In 2022 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), pages 1615–1618. IEEE, 2022.
- Another look at the data sparsity problem. In Text, Speech and Dialogue: 9th International Conference, TSD 2006, Brno, Czech Republic, September 11-15, 2006. Proceedings 9, pages 327–334. Springer, 2006.
- On the summarization of consumer health questions. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 2228–2234, July 2019.
- Rethinking why intermediate-task fine-tuning works. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 706–713, November 2021.
- MEDITRON-70B: Scaling medical pretraining for large language models. arXiv preprint arXiv:2311.16079, 2023.
- BDKG at MEDIQA 2021: System report for the radiology report summarization task. In Proceedings of the 20th Workshop on Biomedical Language Processing, pages 103–111, June 2021.
- Taming pre-trained language models with n-gram representations for low-resource domain adaptation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 3336–3349, August 2021.
- Improving zero and few-shot abstractive summarization with intermediate fine-tuning and data augmentation. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 704–717, June 2021.
- SummEval: Re-evaluating summarization evaluation. Transactions of the Association for Computational Linguistics, 9:391–409, 2021.
- Domain-specific language model pretraining for biomedical natural language processing. ACM Transactions on Computing for Healthcare, 3(1), October 2021.
- damo_nlp at MEDIQA 2021: Knowledge-based preprocessing and coverage-oriented reranking for medical question summarization. In Proceedings of the 20th Workshop on Biomedical Language Processing, pages 112–118, June 2021.
- AVocaDo: Strategy for adapting vocabulary to downstream domain. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 4692–4700, November 2021.
- PubMedQA: A dataset for biomedical research question answering. In Proceedings of the EMNLP-IJCNLP 2019, pages 2567–2577, November 2019.
- Attention-based clinical note summarization. In Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing, SAC ’22, page 813–820, 2022.
- Vocabulary modifications for domain-adaptive pretraining of clinical language models. In Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2022) - HEALTHINF, pages 180–188. INSTICC, 2022.
- Domain adaptation with pre-trained transformers for query-focused abstractive text summarization. Computational Linguistics, 48(2):279–320, June 2022.
- BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics, 36(4):1234–1240, September 2019.
- BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 7871–7880, 2020.
- Chin-Yew Lin. ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out, pages 74–81, 2004.
- Text summarization with pretrained encoders. In Proceedings of EMNLP-IJCNLP, pages 3730–3740, 2019.
- Task-adaptive tokenization: Enhancing long-form text generation efficacy in mental health and beyond. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 15264–15281, 2023.
- Development of a corpus for evidence based medicine summarisation. In Proceedings of the Australasian Language Technology Association Workshop, pages 86–94, 2011.
- Pre-training transformers on Indian legal text, 2022.
- Sentence encoders on STILTs: Supplementary training on intermediate labeled-data tasks, 2019.
- Language models are unsupervised multitask learners. OpenAI blog, 1(8):9, 2019.
- How good is your tokenizer? on the monolingual performance of multilingual language models. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 3118–3135, August 2021.
- Rare words: A major problem for contextualized embeddings and how to fix it by attentive mimicking. Proceedings of the AAAI Conference on Artificial Intelligence, 34(05):8766–8774, Apr. 2020.
- Get to the point: Summarization with pointer-generator networks. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Volume 1: Long Papers, pages 1073–1083, 2017.
- QuickUMLS: a fast, unsupervised approach for medical concept extraction. In MedIR workshop, SIGIR, pages 1–4, 2016.
- Intermediate domain finetuning for weakly supervised domain-adaptive clinical NER. In The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, pages 320–325, July 2023.
- exBERT: Extending pre-trained models with domain-specific vocabulary under constrained training resources. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 1433–1439, November 2020.
- An overview of the BioASQ large-scale biomedical semantic indexing and question answering competition. BMC Bioinformatics, 16:138, 2015.
- Google’s neural machine translation system: Bridging the gap between human and machine translation, 2016.
- PMC-LLaMA: Towards building open-source language models for medicine, 2023.
- Pre-trained language models with domain knowledge for biomedical extractive summarization. Knowledge-Based Systems, 252:109460, 2022.
- FactReranker: Fact-guided reranker for faithful radiology report summarization. arXiv preprint arXiv:2303.08335, 2023.
- A survey for biomedical text summarization: From pre-trained to large language models, 2023.
- Vocabulary learning via optimal transport for neural machine translation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 7361–7373, August 2021.
- Retrieval-augmented domain adaptation of language models. In Proceedings of the 8th Workshop on Representation Learning for NLP (RepL4NLP 2023), pages 54–64, July 2023.
- CHQ-Summ: A dataset for consumer healthcare question summarization. arXiv preprint arXiv:2206.06581, 2022.
- BioBART: Pretraining and evaluation of a biomedical generative language model. In Proceedings of the 21st Workshop on Biomedical Language Processing, pages 97–109, May 2022.
- PEGASUS: Pre-training with extracted gap-sentences for abstractive summarization. In Proceedings of the 37th International Conference on Machine Learning, volume 119 of Proceedings of Machine Learning Research, pages 11328–11339, 13–18 Jul 2020.
- BERTScore: Evaluating text generation with BERT. In 8th International Conference on Learning Representations, ICLR, 2020.
- Leveraging pretrained models for automatic summarization of doctor-patient conversations. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 3693–3712, November 2021.
- BiomedGPT: A unified and generalist biomedical generative pre-trained transformer for vision, language, and multimodal tasks. arXiv preprint arXiv:2305.17100, 2023.
- FaMeSumm: Investigating and improving faithfulness of medical summarization. arXiv preprint arXiv:2311.02271, 2023.
- Parameter-efficient fine-tuning with layer pruning on free-text sequence-to-sequence modeling, 2023.