
University of Cape Town's WMT22 System: Multilingual Machine Translation for Southern African Languages (2210.11757v1)

Published 21 Oct 2022 in cs.CL

Abstract: The paper describes the University of Cape Town's submission to the constrained track of the WMT22 Shared Task: Large-Scale Machine Translation Evaluation for African Languages. Our system is a single multilingual translation model that translates between English and 8 South / South East African languages, as well as between specific pairs of the African languages. We used several techniques suited to low-resource machine translation (MT), including overlap BPE, back-translation, synthetic training data generation, and adding more translation directions during training. Our results show the value of these techniques, especially for directions where very little or no bilingual training data is available.

Authors (3)
  1. Khalid N. Elmadani (5 papers)
  2. Jan Buys (17 papers)
  3. Francois Meyer (9 papers)
Citations (1)