Adaptive Machine Translation with Large Language Models (2301.13294v3)

Published 30 Jan 2023 in cs.CL

Abstract: Consistency is a key requirement of high-quality translation. It is especially important to adhere to pre-approved terminology and adapt to corrected translations in domain-specific projects. Machine translation (MT) has achieved significant progress in the area of domain adaptation. However, real-time adaptation remains challenging. Large-scale LLMs have recently shown interesting capabilities of in-context learning, where they learn to replicate certain input-output text generation patterns, without further fine-tuning. By feeding an LLM at inference time with a prompt that consists of a list of translation pairs, it can then simulate the domain and style characteristics. This work aims to investigate how we can utilize in-context learning to improve real-time adaptive MT. Our extensive experiments show promising results at translation time. For example, LLMs can adapt to a set of in-domain sentence pairs and/or terminology while translating a new sentence. We observe that the translation quality with few-shot in-context learning can surpass that of strong encoder-decoder MT systems, especially for high-resource languages. Moreover, we investigate whether we can combine MT from strong encoder-decoder models with fuzzy matches, which can further improve translation quality, especially for less supported languages. We conduct our experiments across five diverse language pairs, namely English-to-Arabic (EN-AR), English-to-Chinese (EN-ZH), English-to-French (EN-FR), English-to-Kinyarwanda (EN-RW), and English-to-Spanish (EN-ES).

Adaptive Machine Translation with LLMs

The paper "Adaptive Machine Translation with LLMs" by Yasmin Moslem et al. presents a comprehensive paper on leveraging LLMs for adaptive machine translation (MT). The research focuses extensively on exploring and validating the capabilities of LLMs for translation tasks, particularly emphasizing on in-context learning where the model learns to adapt to specific domain requirements quickly, without the need for further fine-tuning.

Key Findings and Experiments

  1. In-Context Learning in LLMs:
    • The authors employ the in-context learning capabilities of LLMs, using GPT-3.5 and related models, to perform real-time adaptive MT. At inference time, the LLM is fed prompts containing fuzzy-matched translation pairs, enabling it to mimic domain and style characteristics (as sketched above).
  2. Comparison with Encoder-Decoder Models:
    • Extensive experiments show that few-shot in-context learning with LLMs can outperform conventional strong encoder-decoder MT systems, especially for high-resource languages. For some language pairs, GPT-3.5 using fuzzy matches achieved better results than conventional encoder-decoder models.
  3. Adaptive Translation Techniques:
    • The paper investigates several adaptive techniques, including combining translations from strong encoder-decoder models with fuzzy matches. Appending the new segment's MT from another system to the fuzzy-match examples in the prompt was found to further improve quality, especially for low-resource languages (see the sketch following this list).
  4. Terminology Incorporation:
    • Bilingual terminology extraction was explored to support the translation process, and evaluation indicated that the extracted terms were largely accurate. Incorporating domain-specific terminology into the prompt, whether curated manually or extracted automatically, effectively improved translation quality.
  5. Bilingual Terminology Extraction:
    • The paper also uses LLMs themselves to extract bilingual terminology; most extracted term pairs were judged accurate, and feeding these glossary entries into the prompt helped the model retain domain-specific terms in its output, supporting terminology-constrained MT.
  6. ChatGPT and BLOOM Models:
    • Comparative analysis with other models, including ChatGPT variants and BLOOM, shows that while open-source LLMs offer varied performance, GPT-3.5 generally outperformed BLOOM on translation tasks across multiple languages.
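
To make the third finding concrete, the sketch below shows one way a new segment's draft translation from a strong encoder-decoder system could be appended to fuzzy-match examples in the prompt, so the LLM can refine it toward the in-domain style; the template, sample pairs, and the hypothetical baseline output are assumptions for illustration, not the paper's exact format.

```python
def build_mt_plus_fuzzy_prompt(source, baseline_mt, fuzzy_pairs,
                               src_lang="English", tgt_lang="French"):
    """Prompt combining fuzzy-match examples with a draft MT of the new segment."""
    lines = [f"{src_lang}: {src}\n{tgt_lang}: {tgt}\n" for src, tgt in fuzzy_pairs]
    # The baseline system's output for the new segment is offered as a draft to refine.
    lines.append(f"{src_lang}: {source}\n{tgt_lang} (MT draft): {baseline_mt}\n{tgt_lang}:")
    return "\n".join(lines)

fuzzy_pairs = [
    ("Store the vaccine at 2-8°C.", "Conservez le vaccin entre 2 et 8 °C."),
]
# `baseline_mt` stands in for output from a strong encoder-decoder MT system.
print(build_mt_plus_fuzzy_prompt(
    "Store the tablets at room temperature.",
    "Conserver les comprimés à température ambiante.",
    fuzzy_pairs,
))
```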

Implications and Future Directions

The paper substantiates the potential of LLMs to transform machine translation by offering strong adaptive capabilities through in-context learning. This opens avenues for further work on dynamic selection of few-shot examples, enabling more precise lexical and syntactic adherence. Future improvements for low-resource language pairs may come from fine-tuning approaches that build on this foundation.

Furthermore, integrating well-defined terminology and higher-level linguistic constraints into LLM-driven MT systems yields more accurate translations, which is pertinent for professional language service scenarios. Automating the application of accurate terminological constraints in translation workflows could significantly boost translation consistency and quality in industry applications.
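
As an illustration of such terminology integration, the following sketch injects matching entries from a pre-approved bilingual glossary into the prompt so the LLM is nudged to use them; the glossary entries and the prompt template are assumed for the example and do not reproduce the paper's exact prompts.

```python
# Pre-approved bilingual glossary (illustrative entries).
GLOSSARY = {
    "adverse event": "événement indésirable",
    "dosage": "posologie",
}

def terminology_prompt(source, glossary, src_lang="English", tgt_lang="French"):
    """List glossary terms found in the source so the LLM is nudged to reuse them."""
    hits = {s: t for s, t in glossary.items() if s.lower() in source.lower()}
    term_block = "\n".join(f'"{s}" should be translated as "{t}"' for s, t in hits.items())
    return f"Terminology:\n{term_block}\n\n{src_lang}: {source}\n{tgt_lang}:"

print(terminology_prompt("Report any adverse event and confirm the dosage.", GLOSSARY))
```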

This research underscores the evolving capabilities of LLMs, offering a dynamic bridge between high-quality translation output and real-time adaptation requirements. Adaptive MT systems built on this approach could thus be further refined and become more viable for mainstream adoption in multilingual communication and global information dissemination.

Authors (4)
  1. Yasmin Moslem
  2. Rejwanul Haque
  3. John D. Kelleher
  4. Andy Way
Citations (59)