Improving Vietnamese-English Medical Machine Translation (2403.19161v1)

Published 28 Mar 2024 in cs.CL

Abstract: Machine translation for Vietnamese-English in the medical domain is still an under-explored research area. In this paper, we introduce MedEV -- a high-quality Vietnamese-English parallel dataset constructed specifically for the medical domain, comprising approximately 360K sentence pairs. We conduct extensive experiments comparing Google Translate, ChatGPT (gpt-3.5-turbo), state-of-the-art Vietnamese-English neural machine translation models and pre-trained bilingual/multilingual sequence-to-sequence models on our new MedEV dataset. Experimental results show that the best performance is achieved by fine-tuning "vinai-translate" for each translation direction. We publicly release our dataset to promote further research.

Authors (5)

Nhu Vo
Dat Quoc Nguyen
Dung D. Le
Massimo Piccardi
Wray Buntine
