Multilingual Large Language Models Are Not (Yet) Code-Switchers (2305.14235v2)

Published 23 May 2023 in cs.CL and cs.AI

Abstract: Multilingual LLMs have recently shown great capabilities in a wide range of tasks, exhibiting state-of-the-art performance through zero-shot or few-shot prompting methods. While there have been extensive studies on their abilities in monolingual tasks, the investigation of their potential in the context of code-switching (CSW), the practice of alternating languages within an utterance, remains relatively uncharted. In this paper, we provide a comprehensive empirical analysis of various multilingual LLMs, benchmarking their performance across four tasks: sentiment analysis, machine translation, summarization and word-level language identification. Our results indicate that despite multilingual LLMs exhibiting promising outcomes in certain tasks using zero or few-shot prompting, they still underperform in comparison to fine-tuned models of much smaller scales. We argue that current "multilingualism" in LLMs does not inherently imply proficiency with code-switching texts, calling for future research to bridge this discrepancy.

Authors (5)

Ruochen Zhang
Samuel Cahyawijaya
Jan Christian Blaise Cruz
Genta Indra Winata
Alham Fikri Aji

Citations (44)
