
SMaLL-100: Introducing Shallow Multilingual Machine Translation Model for Low-Resource Languages (2210.11621v1)

Published 20 Oct 2022 in cs.CL, cs.AI, and cs.LG

Abstract: In recent years, multilingual machine translation models have achieved promising performance on low-resource language pairs by sharing information between similar languages, thus enabling zero-shot translation. To overcome the "curse of multilinguality", these models often opt for scaling up the number of parameters, which makes their use in resource-constrained environments challenging. We introduce SMaLL-100, a distilled version of the M2M-100 (12B) model, a massively multilingual machine translation model covering 100 languages. We train SMaLL-100 with uniform sampling across all language pairs and therefore focus on preserving the performance of low-resource languages. We evaluate SMaLL-100 on different low-resource benchmarks: FLORES-101, Tatoeba, and TICO-19 and demonstrate that it outperforms previous massively multilingual models of comparable sizes (200-600M) while improving inference latency and memory usage. Additionally, our model achieves comparable results to M2M-100 (1.2B), while being 3.6x smaller and 4.3x faster at inference. Code and pre-trained models: https://github.com/alirezamshi/small100
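The abstract's key training choice is uniform sampling across language pairs during distillation, so low-resource pairs are drawn as often as high-resource ones. Below is a minimal, hypothetical sketch of that sampling idea in plain Python; the function name, data layout, and toy corpora are illustrative assumptions, not the authors' implementation.

```python
import random

def uniform_pair_sampler(corpora, batch_size):
    """Yield batches where each example's language pair is drawn uniformly.

    corpora: dict mapping (src_lang, tgt_lang) -> list of parallel sentence pairs.
    Unlike size-proportional sampling, every language pair has equal probability,
    which is the behaviour the paper attributes to SMaLL-100's training setup.
    """
    pairs = list(corpora.keys())
    while True:
        batch = []
        for _ in range(batch_size):
            pair = random.choice(pairs)                   # uniform over language pairs
            batch.append(random.choice(corpora[pair]))    # then a sentence from that pair
        yield batch

# Toy usage with made-up data (hypothetical, for illustration only)
corpora = {
    ("en", "sw"): [("hello", "habari")],
    ("fr", "zu"): [("bonjour", "sawubona")],
}
sampler = uniform_pair_sampler(corpora, batch_size=4)
print(next(sampler))
```

In contrast, sampling proportionally to corpus size would let high-resource pairs dominate the batches, which is what the uniform scheme is meant to counteract for low-resource performance.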

Authors (6)
  1. Alireza Mohammadshahi (13 papers)
  2. Vassilina Nikoulina (28 papers)
  3. Alexandre Berard (20 papers)
  4. Caroline Brun (7 papers)
  5. James Henderson (52 papers)
  6. Laurent Besacier (76 papers)
Citations (17)