
AfriSenti: A Twitter Sentiment Analysis Benchmark for African Languages (2302.08956v5)

Published 17 Feb 2023 in cs.CL

Abstract: Africa is home to over 2,000 languages from more than six language families and has the highest linguistic diversity among all continents. These include 75 languages with at least one million speakers each. Yet, there is little NLP research conducted on African languages. Crucial to enabling such research is the availability of high-quality annotated datasets. In this paper, we introduce AfriSenti, a sentiment analysis benchmark that contains more than 110,000 tweets in 14 African languages (Amharic, Algerian Arabic, Hausa, Igbo, Kinyarwanda, Moroccan Arabic, Mozambican Portuguese, Nigerian Pidgin, Oromo, Swahili, Tigrinya, Twi, Xitsonga, and Yorùbá) from four language families. The tweets were annotated by native speakers and used in the AfriSenti-SemEval shared task, which had over 200 participants (see the website at https://afrisenti-semeval.github.io). We describe the data collection methodology, annotation process, and the challenges we dealt with when curating each dataset. We further report baseline experiments conducted on the different datasets and discuss their usefulness.

Authors (26)
  1. Shamsuddeen Hassan Muhammad (42 papers)
  2. Idris Abdulmumin (39 papers)
  3. Abinew Ali Ayele (17 papers)
  4. Nedjma Ousidhoum (17 papers)
  5. David Ifeoluwa Adelani (59 papers)
  6. Seid Muhie Yimam (41 papers)
  7. Ibrahim Sa'id Ahmad (3 papers)
  8. Meriem Beloucif (11 papers)
  9. Saif M. Mohammad (70 papers)
  10. Sebastian Ruder (93 papers)
  11. Oumaima Hourrane (6 papers)
  12. Pavel Brazdil (6 papers)
  13. Felermino Dário Mário António Ali (1 paper)
  14. Davis David (7 papers)
  15. Salomey Osei (21 papers)
  16. Bello Shehu Bello (8 papers)
  17. Falalu Ibrahim (1 paper)
  18. Tajuddeen Gwadabe (8 papers)
  19. Samuel Rutunda (4 papers)
  20. Tadesse Belay (1 paper)
Citations (68)