
mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs (2104.08692v2)

Published 18 Apr 2021 in cs.CL

Abstract: Multilingual T5 (mT5) pretrains a sequence-to-sequence model on massive monolingual texts, which has shown promising results on many cross-lingual tasks. In this paper, we improve multilingual text-to-text transfer Transformer with translation pairs (mT6). Specifically, we explore three cross-lingual text-to-text pre-training tasks, namely, machine translation, translation pair span corruption, and translation span corruption. In addition, we propose a partially non-autoregressive objective for text-to-text pre-training. We evaluate the methods on eight multilingual benchmark datasets, including sentence classification, named entity recognition, question answering, and abstractive summarization. Experimental results show that the proposed mT6 improves cross-lingual transferability over mT5.
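The abstract only names the span-corruption objectives; as a rough illustration, the sketch below shows one way the two variants could build encoder/decoder sequences from a parallel sentence pair. It uses T5-style sentinel tokens; the span-sampling heuristic and the function names (span_corrupt, tpsc_example, tsc_example, sample_spans) are illustrative assumptions, not the authors' implementation.

```python
import random

SENTINELS = [f"<extra_id_{i}>" for i in range(100)]  # T5-style sentinel tokens


def sample_spans(tokens, ratio=0.15, avg_len=3):
    """Sample non-overlapping (start, end) spans covering roughly `ratio` of the tokens."""
    n_mask = max(1, int(len(tokens) * ratio))
    spans, covered, pos = [], 0, 0
    while covered < n_mask and pos < len(tokens):
        start = random.randint(pos, min(pos + 5, len(tokens) - 1))
        end = min(start + avg_len, len(tokens))
        spans.append((start, end))
        covered += end - start
        pos = end + 1
    return spans


def span_corrupt(tokens, spans):
    """T5-style noising: replace each span with a sentinel in the encoder input,
    and predict sentinel-prefixed spans on the decoder side."""
    src, tgt, prev, sid = [], [], 0, 0
    for start, end in spans:
        src += tokens[prev:start] + [SENTINELS[sid]]
        tgt += [SENTINELS[sid]] + tokens[start:end]
        prev, sid = end, sid + 1
    src += tokens[prev:]
    return src, tgt


def tpsc_example(en_tokens, fr_tokens):
    """Translation pair span corruption: corrupt the concatenated translation pair."""
    pair = en_tokens + fr_tokens
    return span_corrupt(pair, sample_spans(pair))


def tsc_example(en_tokens, fr_tokens):
    """Translation span corruption: corrupt spans in one language only,
    keeping the other side intact as cross-lingual context."""
    src, tgt = span_corrupt(en_tokens, sample_spans(en_tokens))
    return src + fr_tokens, tgt
```

The third task, plain machine translation, would simply feed the source-language sentence to the encoder and the translation to the decoder; the paper's partially non-autoregressive objective further changes how the decoder targets are grouped and predicted, which this sketch does not attempt to reproduce.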

Authors (7)
  1. Zewen Chi (29 papers)
  2. Li Dong (154 papers)
  3. Shuming Ma (83 papers)
  4. Shaohan Huang
  5. Xian-Ling Mao (1 paper)
  6. Heyan Huang (107 papers)
  7. Furu Wei (291 papers)
Citations (67)