A Primer on Pretrained Multilingual Language Models (2107.00676v2)

Published 1 Jul 2021 in cs.CL

Abstract: Multilingual Language Models (MLLMs) such as mBERT, XLM, XLM-R, etc. have emerged as a viable option for bringing the power of pretraining to a large number of languages. Given their success in zero-shot transfer learning, a large body of work has emerged on (i) building bigger MLLMs covering a large number of languages, (ii) creating exhaustive benchmarks covering a wider variety of tasks and languages for evaluating MLLMs, (iii) analysing the performance of MLLMs on monolingual, zero-shot cross-lingual and bilingual tasks, (iv) understanding the universal language patterns (if any) learnt by MLLMs, and (v) augmenting the (often) limited capacity of MLLMs to improve their performance on seen or even unseen languages. In this survey, we review the existing literature covering the above broad areas of research pertaining to MLLMs. Based on our survey, we recommend some promising directions of future research.
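
The zero-shot cross-lingual transfer setting the abstract refers to is worth making concrete: a pretrained multilingual model is fine-tuned on labelled data in one language and then evaluated directly on another language for which no labelled data was seen. Below is a minimal sketch using the Hugging Face transformers library and the XLM-R base checkpoint; the library choice, model name, and toy German example are illustrative assumptions, not something prescribed by the paper.

```python
# Sketch of zero-shot cross-lingual transfer with a pretrained multilingual
# model (XLM-R). The checkpoint name and example data are illustrative only.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load a multilingual encoder; xlm-roberta-base was pretrained on ~100 languages.
tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=2  # e.g. binary sentiment
)

# Step 1 (omitted): fine-tune the classification head on labelled data
# in a single source language, typically English.

# Step 2: evaluate directly on a target language with no labelled examples.
batch = tokenizer("Das Essen war ausgezeichnet.", return_tensors="pt")
with torch.no_grad():
    logits = model(**batch).logits
print(logits.softmax(-1))  # class probabilities for the German sentence
```

Note that before fine-tuning, the classification head is randomly initialised, so the printed probabilities are meaningless; the sketch only shows the inference path that makes zero-shot transfer possible once the head has been trained on the source language.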

Authors (5)
  1. Sumanth Doddapaneni (16 papers)
  2. Gowtham Ramesh (6 papers)
  3. Mitesh M. Khapra (79 papers)
  4. Anoop Kunchukuttan (45 papers)
  5. Pratyush Kumar (44 papers)
Citations (68)