Adapting MARBERT for Improved Arabic Dialect Identification: Submission to the NADI 2021 Shared Task (2103.01065v1)

Published 1 Mar 2021 in cs.CL

Abstract: In this paper, we tackle the Nuanced Arabic Dialect Identification (NADI) shared task (Abdul-Mageed et al., 2021) and demonstrate state-of-the-art results on all four of its subtasks. The tasks are to identify the geographic origin of short Dialectal Arabic (DA) and Modern Standard Arabic (MSA) utterances at both the country and province levels. Our final model is an ensemble of variants built on top of MARBERT that achieves an F1-score of 34.03% for DA on the country-level development set, an improvement of 7.63% over previous work.

Authors (4)
  1. Badr AlKhamissi (24 papers)
  2. Mohamed Gabr (5 papers)
  3. Muhammad ElNokrashy (9 papers)
  4. Khaled Essam (1 paper)
Citations (17)
