
Don't Trust ChatGPT when Your Question is not in English: A Study of Multilingual Abilities and Types of LLMs (2305.16339v2)

Published 24 May 2023 in cs.CL and cs.AI

Abstract: LLMs have demonstrated exceptional natural language understanding abilities and have excelled in a variety of NLP tasks in recent years. Despite the fact that most LLMs are trained predominantly in English, multiple studies have demonstrated their comparative performance in many other languages. However, fundamental questions persist regarding how LLMs acquire their multilingual abilities and how performance varies across different languages. These inquiries are crucial for the study of LLMs, since users and researchers often come from diverse language backgrounds, potentially influencing how they use and interpret LLMs' results. In this work, we propose a systematic way of quantifying the performance disparities of LLMs under multilingual settings. We investigate the phenomenon of cross-language generalization in LLMs, wherein insufficient multilingual training data nonetheless yields advanced multilingual capabilities. To accomplish this, we employ a novel back-translation-based prompting method. The results show that GPT exhibits highly translation-like behaviour in multilingual settings.
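The abstract does not spell out the back-translation-based prompting procedure, but the general idea behind such probes is to compare a model's direct answer to a non-English question against its answer after the question is routed through English. The sketch below is an assumption about that workflow, not the paper's exact method; `query_llm` is a hypothetical stand-in for a real API call, stubbed here with canned responses so the pipeline is runnable.

```python
# Hedged sketch of a back-translation-style prompting probe.
# Assumptions: query_llm() is a hypothetical placeholder for a real
# LLM API call; the canned dictionary merely simulates responses.

def query_llm(prompt: str) -> str:
    """Stub LLM call; swap in a real API client in practice."""
    canned = {
        "Translate to English: ¿Cuál es la capital de Francia?":
            "What is the capital of France?",
        "What is the capital of France?": "Paris",
        "¿Cuál es la capital de Francia?": "París",
    }
    return canned.get(prompt, "")


def answer_via_english(question: str) -> str:
    """Translate the question into English, then answer the English version."""
    english_q = query_llm(f"Translate to English: {question}")
    return query_llm(english_q)


def compare_behaviour(question: str) -> tuple[str, str]:
    """Return (direct answer, translate-then-answer result) for comparison.

    If the two closely agree across many questions, the model's
    multilingual behaviour looks like an implicit translate-then-answer
    pipeline, which is the kind of effect the paper probes.
    """
    direct = query_llm(question)
    routed = answer_via_english(question)
    return direct, routed


direct, routed = compare_behaviour("¿Cuál es la capital de Francia?")
```

With the stubbed responses, `direct` is "París" and `routed` is "Paris"; in a real study the comparison would be aggregated over many questions and languages.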

Authors (5)
  1. Xiang Zhang (395 papers)
  2. Senyu Li (5 papers)
  3. Bradley Hauer (11 papers)
  4. Ning Shi (16 papers)
  5. Grzegorz Kondrak (14 papers)
Citations (52)