Multilingual Large Language Models and Curse of Multilinguality (2406.10602v1)
Abstract: Multilingual LLMs have gained considerable popularity among NLP researchers and practitioners. These models, trained on huge datasets, show proficiency across various languages and demonstrate effectiveness in numerous downstream tasks. This paper navigates the landscape of multilingual LLMs, providing an introductory overview of their technical aspects. It explains underlying architectures, objective functions, pre-training data sources, and tokenization methods. This work explores the unique features of different model types: encoder-only (mBERT, XLM-R), decoder-only (XGLM, PaLM, BLOOM, GPT-3), and encoder-decoder models (mT5, mBART). Additionally, it addresses one of the significant limitations of multilingual LLMs - the curse of multilinguality - and discusses current attempts to overcome it.
- Daniil Gurgurov
- Tanja Bäumel
- Tatiana Anikina
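
As an illustrative aside to the abstract's mention of tokenization methods and the curse of multilinguality, the minimal sketch below shows how a shared multilingual vocabulary fragments text differently across languages. It is not taken from the paper: the choice of the `xlm-roberta-base` checkpoint, the Hugging Face `transformers` API, and the sample sentences are assumptions made here for illustration only.

```python
# Minimal sketch (assumes the Hugging Face `transformers` package is installed
# and model files can be downloaded). It compares how many subword pieces a
# shared multilingual tokenizer produces for sentences in different languages,
# a rough proxy for how vocabulary capacity is split across languages.
from transformers import AutoTokenizer

# Illustrative model choice; any multilingual checkpoint with a shared
# vocabulary (e.g., mBERT) could be substituted here.
tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")

samples = {
    "English": "Multilingual models share one vocabulary across languages.",
    "German": "Mehrsprachige Modelle teilen sich ein Vokabular über viele Sprachen.",
    "Swahili": "Miundo ya lugha nyingi hushiriki msamiati mmoja katika lugha zote.",
}

for language, sentence in samples.items():
    tokens = tokenizer.tokenize(sentence)
    # A higher subword count per sentence of similar meaning typically signals
    # weaker vocabulary coverage for that language, one symptom associated
    # with the curse of multilinguality.
    print(f"{language}: {len(tokens)} subword tokens -> {tokens[:8]} ...")
```

Running such a comparison over many languages is one simple way to see why adding more languages to a fixed-size vocabulary and parameter budget can degrade per-language quality, the trade-off the paper discusses under the curse of multilinguality.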