
Findings of the First Shared Task on Machine Translation Robustness (1906.11943v2)

Published 27 Jun 2019 in cs.CL

Abstract: We share the findings of the first shared task on improving robustness of Machine Translation (MT). The task provides a testbed representing challenges facing MT models deployed in the real world, and facilitates new approaches to improve models' robustness to noisy input and domain mismatch. We focus on two language pairs (English-French and English-Japanese), and the submitted systems are evaluated on a blind test set consisting of noisy comments on Reddit and professionally sourced translations. As a new task, we received 23 submissions by 11 participating teams from universities, companies, national labs, etc. All submitted systems achieved large improvements over baselines, with the best improvement having +22.33 BLEU. We evaluated submissions by both human judgment and automatic evaluation (BLEU), which show high correlations (Pearson's r = 0.94 and 0.95). Furthermore, we conducted a qualitative analysis of the submitted systems using compare-mt, which revealed their salient differences in handling challenges in this task. Such analysis provides additional insights when there is occasional disagreement between human judgment and BLEU, e.g., systems better at producing colloquial expressions received higher scores from human judgment.
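
The agreement between human judgment and BLEU reported in the abstract is a Pearson correlation. As a minimal sketch of how such a coefficient can be computed, assuming one (BLEU, human score) pair per submitted system: the function name `pearson_r` and the score lists below are hypothetical placeholders for illustration, not the shared task's data or evaluation code.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical placeholder scores (NOT taken from the paper):
# one BLEU score and one averaged human rating per system.
bleu_scores = [20.0, 25.0, 30.0, 35.0]
human_scores = [55.0, 60.0, 70.0, 72.0]

r = pearson_r(bleu_scores, human_scores)
print(f"Pearson's r = {r:.2f}")
```

A value close to 1 means the two metrics rank and space the systems similarly. It does not rule out disagreement on individual systems, which is where the qualitative analysis mentioned in the abstract comes in.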

Authors (10)
  1. Xian Li (116 papers)
  2. Paul Michel (27 papers)
  3. Antonios Anastasopoulos (111 papers)
  4. Yonatan Belinkov (111 papers)
  5. Nadir Durrani (48 papers)
  6. Orhan Firat (80 papers)
  7. Philipp Koehn (60 papers)
  8. Graham Neubig (342 papers)
  9. Juan Pino (51 papers)
  10. Hassan Sajjad (64 papers)
Citations (60)