
Multilingual Machine Translation: Closing the Gap between Shared and Language-specific Encoder-Decoders (2004.06575v1)

Published 14 Apr 2020 in cs.CL

Abstract: State-of-the-art multilingual machine translation relies on a universal encoder-decoder, which requires retraining the entire system to add new languages. In this paper, we propose an alternative approach that is based on language-specific encoder-decoders, and can thus be more easily extended to new languages by learning their corresponding modules. So as to encourage a common interlingua representation, we simultaneously train the N initial languages. Our experiments show that the proposed approach outperforms the universal encoder-decoder by 3.28 BLEU points on average, and that new languages can be added without the need to retrain the rest of the modules. All in all, our work closes the gap between shared and language-specific encoder-decoders, advancing toward modular multilingual machine translation systems that can be flexibly extended in lifelong learning settings.
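
The abstract describes the core architectural idea: each language has its own encoder and decoder, the N initial languages are trained jointly so that encoder outputs behave as a shared intermediate representation, and a new language is added by training only its own modules. Below is a minimal PyTorch sketch of that pattern; the class, method names, layer counts, and dimensions are illustrative assumptions and are not taken from the paper or its released code.

```python
# Minimal sketch of modular, language-specific encoder-decoders.
# All names and hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn


class ModularMT(nn.Module):
    """One encoder and one decoder per language. Any encoder can be paired
    with any decoder, so the encoder output acts as a shared space."""

    def __init__(self, vocab_sizes: dict, d_model: int = 512,
                 nhead: int = 8, num_layers: int = 2):
        super().__init__()
        self.d_model = d_model
        self.embeddings = nn.ModuleDict()
        self.encoders = nn.ModuleDict()
        self.decoders = nn.ModuleDict()
        self.output_proj = nn.ModuleDict()
        for lang, vocab in vocab_sizes.items():
            self._build_modules(lang, vocab, nhead, num_layers)

    def _build_modules(self, lang, vocab, nhead, num_layers):
        # Fresh, language-specific modules for `lang`.
        self.embeddings[lang] = nn.Embedding(vocab, self.d_model)
        self.encoders[lang] = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(self.d_model, nhead, batch_first=True),
            num_layers)
        self.decoders[lang] = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(self.d_model, nhead, batch_first=True),
            num_layers)
        self.output_proj[lang] = nn.Linear(self.d_model, vocab)

    def forward(self, src_lang, tgt_lang, src_ids, tgt_ids):
        # Encode with the source-language encoder, decode with the
        # target-language decoder: modules are mixed and matched freely.
        memory = self.encoders[src_lang](self.embeddings[src_lang](src_ids))
        hidden = self.decoders[tgt_lang](self.embeddings[tgt_lang](tgt_ids),
                                         memory)
        return self.output_proj[tgt_lang](hidden)

    def add_language(self, lang, vocab, nhead=8, num_layers=2):
        # Freeze everything trained so far, then attach fresh modules for
        # the new language; only the new modules receive gradients.
        for p in self.parameters():
            p.requires_grad_(False)
        self._build_modules(lang, vocab, nhead, num_layers)


# Joint training over the initial languages would cycle over (src, tgt) pairs:
model = ModularMT({"en": 32000, "de": 32000, "fr": 32000})
logits = model("en", "de",
               src_ids=torch.randint(0, 32000, (4, 16)),
               tgt_ids=torch.randint(0, 32000, (4, 15)))
print(logits.shape)  # (4, 15, 32000)

# Later, a new language is attached without retraining the existing modules:
model.add_language("es", vocab=32000)
```

The property this sketch tries to preserve is that any encoder can feed any decoder, which is what allows the system to be extended to new languages while the previously trained modules stay fixed.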

Authors (4)
  1. Carlos Escolano (20 papers)
  2. Marta R. Costa-jussà (73 papers)
  3. José A. R. Fonollosa (23 papers)
  4. Mikel Artetxe (52 papers)
Citations (51)
