
NaijaNER: Comprehensive Named Entity Recognition for 5 Nigerian Languages (2105.00810v1)

Published 30 Mar 2021 in cs.CL

Abstract: Most common applications of Named Entity Recognition (NER) target English and other high-resource languages. In this work, we present our findings on NER for 5 Nigerian languages (Nigerian English, Nigerian Pidgin English, Igbo, Yoruba and Hausa). These languages are considered low-resourced, and very little openly available Natural Language Processing work has been done in most of them. Individual NER models were trained and metrics recorded for each of the languages. We also worked on a combined model that can handle NER for any of the five languages. The combined model works well for NER on each of the languages and performs better than the individual NER models trained specifically on annotated data for each language. The aim of this work is to share our learning on how information extraction using NER can be optimized for the listed Nigerian languages for inclusion, ease of deployment in production and reusability of models. Models developed during this project are available on GitHub at https://git.io/JY0kk, along with an interactive web app at https://nigner.herokuapp.com/.

Authors (8)
  1. Wuraola Fisayo Oyewusi (3 papers)
  2. Olubayo Adekanmbi (5 papers)
  3. Ifeoma Okoh (4 papers)
  4. Vitus Onuigwe (1 paper)
  5. Mary Idera Salami (1 paper)
  6. Opeyemi Osakuade (3 papers)
  7. Sharon Ibejih (1 paper)
  8. Usman Abdullahi Musa (1 paper)
Citations (8)