Domain Adaptation for Arabic Machine Translation: The Case of Financial Texts (2309.12863v1)
Abstract: Neural machine translation (NMT) has shown impressive performance when trained on large-scale corpora. However, generic NMT systems have demonstrated poor performance on out-of-domain translation. To mitigate this issue, several domain adaptation methods have recently been proposed, which often yield better translation quality than generic NMT systems. While there has been continuous progress in NMT for English and other European languages, domain adaptation in Arabic has received little attention in the literature. The current study therefore aims to explore the effectiveness of domain-specific adaptation for Arabic MT (AMT) in a hitherto unexplored domain: financial news articles. To this end, we carefully developed a parallel Arabic-English (AR-EN) corpus in the financial domain for benchmarking different domain adaptation methods. We then fine-tuned several pre-trained NMT models and LLMs, including ChatGPT-3.5 Turbo, on our dataset. The results showed that fine-tuning succeeds with just a few well-aligned in-domain AR-EN segments. The quality of ChatGPT's translations was superior to that of the other models according to both automatic and human evaluations. To the best of our knowledge, this is the first work on fine-tuning ChatGPT for financial-domain transfer learning. To contribute to research in domain translation, we have made our datasets and fine-tuned models available at https://huggingface.co/asas-ai/.
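The automatic evaluation mentioned in the abstract relies on standard MT metrics such as chrF (character n-gram F-score). As a minimal, simplified sketch of how such a metric works — not the official implementation, which lives in the sacreBLEU toolkit — a sentence-level chrF can be computed in pure Python as follows (default parameters n=1..6 and beta=2 are assumed from the chrF paper):

```python
from collections import Counter

def char_ngrams(text: str, n: int) -> Counter:
    # chrF removes whitespace before extracting character n-grams.
    s = "".join(text.split())
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))

def chrf(hypothesis: str, reference: str, max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified sentence-level chrF: average character n-gram precision
    and recall for n = 1..max_n, combined into an F-beta score (0-100)."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # skip orders with no n-grams on either side
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0.0:
        return 0.0
    # F-beta with beta = 2 weights recall twice as heavily as precision.
    return 100.0 * (1 + beta ** 2) * p * r / (beta ** 2 * p + r)
```

For reproducible benchmarking, published scores should be computed with sacreBLEU's chrF rather than an ad-hoc reimplementation like this one; the sketch is only meant to make the metric's mechanics concrete.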
Authors: Emad A. Alghamdi, Jezia Zakraoui, Fares A. Abanmy