IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models (2406.03368v1)
Abstract: Despite the widespread adoption of LLMs, their remarkable capabilities remain limited to a few high-resource languages. Additionally, many low-resource languages (e.g., African languages) are often evaluated only on basic text classification tasks because appropriate or comprehensive benchmarks are lacking outside of high-resource languages. In this paper, we introduce IrokoBench, a human-translated benchmark dataset for 16 typologically diverse low-resource African languages covering three tasks: natural language inference (AfriXNLI), mathematical reasoning (AfriMGSM), and multi-choice knowledge-based QA (AfriMMLU). We use IrokoBench to evaluate zero-shot, few-shot, and translate-test settings (where test sets are translated into English) across 10 open and 4 proprietary LLMs. Our evaluation reveals a significant performance gap between high-resource languages (such as English and French) and low-resource African languages. We also observe a significant performance gap between open and proprietary models: the highest-performing open model, Aya-101, reaches only 58% of the performance of the best proprietary model, GPT-4o. Machine-translating the test set into English before evaluation helped close the gap for larger English-centric models, such as LLaMa 3 70B. These findings suggest that more effort is needed to develop and adapt LLMs for African languages.
- David Ifeoluwa Adelani (59 papers)
- Jessica Ojo (6 papers)
- Israel Abebe Azime (16 papers)
- Jian Yun Zhuang (2 papers)
- Jesujoba O. Alabi (20 papers)
- Xuanli He (43 papers)
- Millicent Ochieng (8 papers)
- Sara Hooker (71 papers)
- Andiswa Bukula (8 papers)
- En-Shiun Annie Lee (17 papers)
- Chiamaka Chukwuneke (8 papers)
- Happy Buzaaba (9 papers)
- Blessing Sibanda (8 papers)
- Godson Kalipe (4 papers)
- Jonathan Mukiibi (10 papers)
- Salomon Kabongo (10 papers)
- Foutse Yuehgoh (5 papers)
- Mmasibidi Setaka (2 papers)
- Lolwethu Ndolela (4 papers)
- Nkiruka Odu (3 papers)