Tailoring Domain Adaptation for Machine Translation Quality Estimation (2304.08891v2)

Published 18 Apr 2023 in cs.CL

Abstract: While quality estimation (QE) can play an important role in the translation process, its effectiveness relies on the availability and quality of training data. For QE in particular, high-quality labeled data is often lacking due to the high cost and effort associated with labeling such data. Aside from the data scarcity challenge, QE models should also be generalizable, i.e., they should be able to handle data from different domains, both generic and specific. To alleviate these two main issues -- data scarcity and domain mismatch -- this paper combines domain adaptation and data augmentation within a robust QE system. Our method first trains a generic QE model and then fine-tunes it on a specific domain while retaining generic knowledge. Our results show significant improvements for all the language pairs investigated, better cross-lingual inference, and superior performance in zero-shot learning scenarios compared to state-of-the-art baselines.
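
The abstract outlines a two-stage recipe: first train a generic QE model, then fine-tune it on in-domain data while retaining the generic knowledge. The sketch below illustrates that recipe under assumptions not stated in the abstract: sentence-level QE framed as regression over (source, MT) pairs with an XLM-R encoder via Hugging Face Transformers. The dataset names, learning rates, and epoch counts are illustrative placeholders, not the authors' settings.

```python
# Hedged sketch of a two-stage QE pipeline (generic -> in-domain), not the authors' code.
# Assumes sentence-level QE as regression (e.g., predicting DA/HTER scores) with an
# XLM-R encoder; data variables and hyperparameters below are hypothetical placeholders.
import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "xlm-roberta-base"  # multilingual encoder (an assumption)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=1, problem_type="regression"
)

def make_loader(pairs, scores, batch_size=16):
    """pairs: list of (source, mt_translation) tuples; scores: list of quality labels."""
    enc = tokenizer([s for s, _ in pairs], [t for _, t in pairs],
                    truncation=True, padding=True, return_tensors="pt")
    labels = torch.tensor(scores, dtype=torch.float).unsqueeze(-1)
    dataset = TensorDataset(enc["input_ids"], enc["attention_mask"], labels)
    return DataLoader(dataset, batch_size=batch_size, shuffle=True)

def train(model, loader, lr, epochs):
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for input_ids, attention_mask, labels in loader:
            out = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels)
            out.loss.backward()      # MSE loss from the regression head
            optimizer.step()
            optimizer.zero_grad()

# Stage 1: train a generic QE model on large, mixed-domain (possibly augmented) data.
# generic_pairs / generic_scores stand in for that corpus.
# train(model, make_loader(generic_pairs, generic_scores), lr=2e-5, epochs=3)

# Stage 2: fine-tune on the specific domain with a smaller learning rate so that
# generic knowledge is largely retained rather than overwritten.
# train(model, make_loader(domain_pairs, domain_scores), lr=5e-6, epochs=2)
```

The smaller learning rate in the second stage is one simple way to keep the adapted model close to the generic one; the paper's actual mechanism for retaining generic knowledge may differ.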

Authors (7)
  1. Javad Pourmostafa Roshan Sharami (6 papers)
  2. Dimitar Shterionov (16 papers)
  3. Frédéric Blain (10 papers)
  4. Eva Vanmassenhove (13 papers)
  5. Mirella De Sisto (4 papers)
  6. Chris Emmery (11 papers)
  7. Pieter Spronck (4 papers)
Citations (4)
