
Named Entity Recognition on Code-Switched Data: Overview of the CALCS 2018 Shared Task (1906.04138v1)

Published 10 Jun 2019 in cs.CL

Abstract: In the third shared task of the Computational Approaches to Linguistic Code-Switching (CALCS) workshop, we focus on Named Entity Recognition (NER) on code-switched social-media data. We divide the shared task into two competitions based on the English-Spanish (ENG-SPA) and Modern Standard Arabic-Egyptian (MSA-EGY) language pairs. We use Twitter data and 9 entity types to establish a new dataset for code-switched NER benchmarks. In addition to the code-switching (CS) phenomenon, the diversity of the entities and the challenges of social-media text make the task considerably hard. As a result, the best scores of the competitions are 63.76% and 71.61% for ENG-SPA and MSA-EGY, respectively. We present the scores of 9 participants and discuss the most common challenges among submissions.

Authors (6)
  1. Gustavo Aguilar
  2. Victor Soto
  3. Mona Diab
  4. Julia Hirschberg
  5. Thamar Solorio
  6. Fahad Alghamdi
Citations (69)
