Fine-tuning Large Language Models for Domain-specific Machine Translation (2402.15061v2)
Abstract: LLMs have shown great potential in domain-specific machine translation (MT). However, one major issue is that LLMs pre-trained on general-domain corpora might not generalize well to specific domains due to the lack of domain-specific knowledge. To address this issue, this paper focuses on enhancing the domain-specific MT capability of LLMs by providing high-quality training datasets and proposing a novel fine-tuning framework denoted DragFT. DragFT augments LLMs via three techniques: (i) dictionary-enhanced prompting integrates dictionary information into prompts to improve the translation of domain-specific terminology; (ii) RAG-based few-shot example selection provides high-quality examples that reflect both domain and style characteristics; (iii) fine-tuning with few-shot examples further boosts performance when in-domain examples are used. We deploy DragFT on three well-known 13B-parameter LLM backbones to validate its effectiveness. The results on three domain-specific datasets show that DragFT achieves a significant performance boost and outperforms advanced models such as GPT-3.5 and GPT-4o. The substantial improvement of DragFT over existing LLMs can be attributed to incorporating relevant knowledge while mitigating noise.
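To make the first two components concrete, the Python sketch below shows one way dictionary-enhanced prompting and RAG-style few-shot selection could be combined into a single translation prompt. This is a minimal illustration, not the authors' implementation: the function names, the prompt template, the sample data, and the use of token-level Jaccard similarity as a cheap stand-in for embedding-based retrieval are all assumptions made for readability.

```python
# Illustrative sketch of a DragFT-style prompt builder (not the paper's code).
# Assumptions: a bilingual term dictionary {src_term: tgt_term}, a pool of
# in-domain (source, target) pairs, and lexical Jaccard similarity standing in
# for the dense-retrieval step a real RAG setup would use.

def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity, used here as a cheap retrieval proxy."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def select_few_shot(source: str, pool: list[tuple[str, str]], k: int = 3):
    """Pick the k pool sentences most similar to the input (RAG-style selection)."""
    return sorted(pool, key=lambda ex: jaccard(source, ex[0]), reverse=True)[:k]

def matched_terms(source: str, dictionary: dict[str, str]) -> dict[str, str]:
    """Keep only dictionary entries whose source term occurs in the input."""
    return {s: t for s, t in dictionary.items() if s.lower() in source.lower()}

def build_prompt(source: str, pool: list[tuple[str, str]],
                 dictionary: dict[str, str],
                 src_lang: str = "English", tgt_lang: str = "Chinese") -> str:
    """Assemble a dictionary-enhanced, few-shot translation prompt."""
    lines = [f"Translate the following {src_lang} sentence into {tgt_lang}."]
    terms = matched_terms(source, dictionary)
    if terms:
        lines.append("Use these domain-specific term translations:")
        lines += [f"  {s} -> {t}" for s, t in terms.items()]
    for src, tgt in select_few_shot(source, pool):
        lines.append(f"{src_lang}: {src}\n{tgt_lang}: {tgt}")
    lines.append(f"{src_lang}: {source}\n{tgt_lang}:")
    return "\n".join(lines)

if __name__ == "__main__":
    # Toy in-domain examples and dictionary, purely for demonstration.
    pool = [("The patient received an intravenous infusion.", "患者接受了静脉输液。"),
            ("The law was enacted last year.", "该法律于去年颁布。")]
    dictionary = {"intravenous infusion": "静脉输液"}
    print(build_prompt("The nurse prepared the intravenous infusion.", pool, dictionary))
```

In the paper's setting, the retrieval step would presumably operate over an in-domain datastore with dense embeddings rather than lexical overlap, and the same few-shot prompt format would also be used as the training format during fine-tuning (the third component).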
Authors: Jiawei Zheng, Hanghai Hong, Xiaoli Wang, Jingsong Su, Yonggui Liang, Shikai Wu, Feiyan Liu