HausaNLP at SemEval-2023 Task 12: Leveraging African Low Resource TweetData for Sentiment Analysis (2304.13634v1)
Abstract: We present the findings of SemEval-2023 Task 12, a shared task on sentiment analysis for low-resource African languages using a Twitter dataset. The task featured three subtasks: subtask A is monolingual sentiment classification with 12 tracks, one per language; subtask B is multilingual sentiment classification using the tracks in subtask A; and subtask C is zero-shot sentiment classification. We present the results and findings for subtasks A, B, and C, and release the code on GitHub. Our goal is to leverage low-resource tweet data using the pre-trained Afro-xlmr-large, AfriBERTa-Large, Bert-base-arabic-camelbert-da-sentiment (Arabic-camelbert), Multilingual-BERT (mBERT), and BERT models for sentiment analysis of 14 African languages. The datasets for these subtasks consist of gold-standard, multi-class labeled Twitter data from these languages. Our results demonstrate that the Afro-xlmr-large model outperformed the other models on most of the language datasets. Similarly, the Nigerian languages Hausa, Igbo, and Yoruba achieved better performance than the other languages, which can be attributed to the larger volume of data available for them.
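The three subtask settings described in the abstract differ mainly in how the labeled tweets are grouped. The following is a minimal sketch of that grouping, not the authors' code: the function name `build_subtask_splits`, the toy tweets, the label names, and the placeholder language code `xx` are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy examples: (language_code, tweet_text, label).
# Texts, labels, and the "xx" code are placeholders, not task data.
EXAMPLES = [
    ("ha", "tweet 1", "positive"),
    ("ha", "tweet 2", "negative"),
    ("ig", "tweet 3", "neutral"),
    ("yo", "tweet 4", "positive"),
    ("xx", "tweet 5", "negative"),  # stands in for a zero-shot language
]


def build_subtask_splits(examples, zero_shot_langs):
    """Group labeled tweets according to the three subtask settings."""
    monolingual = defaultdict(list)     # subtask A: one pool per language
    multilingual = []                   # subtask B: all subtask A tracks combined
    zero_shot_eval = defaultdict(list)  # subtask C: held out, evaluation only
    for lang, text, label in examples:
        if lang in zero_shot_langs:
            zero_shot_eval[lang].append((text, label))
        else:
            monolingual[lang].append((text, label))
            multilingual.append((lang, text, label))
    return dict(monolingual), multilingual, dict(zero_shot_eval)


mono, multi, zero = build_subtask_splits(EXAMPLES, zero_shot_langs={"xx"})
```

Under these assumptions, `mono` holds one training pool per language, `multi` is the pooled set a single multilingual model would be trained on, and `zero` contains only languages that no model sees during training.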
- Saheed Abdullahi Salahudeen
- Falalu Ibrahim Lawan
- Ahmad Mustapha Wali
- Amina Abubakar Imam
- Aliyu Rabiu Shuaibu
- Aliyu Yusuf
- Nur Bala Rabiu
- Musa Bello
- Shamsuddeen Umaru Adamu
- Saminu Mohammad Aliyu
- Murja Sani Gadanya
- Sanah Abdullahi Muaz
- Mahmoud Said Ahmad
- Abdulkadir Abdullahi
- Abdulmalik Yusuf Jamoh