Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
102 tokens/sec
GPT-4o
59 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
6 tokens/sec
GPT-4.1 Pro
50 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

LIIR at SemEval-2020 Task 12: A Cross-Lingual Augmentation Approach for Multilingual Offensive Language Identification (2005.03695v2)

Published 7 May 2020 in cs.CL

Abstract: This paper presents our system entitled `LIIR' for SemEval-2020 Task 12 on Multilingual Offensive Language Identification in Social Media (OffensEval 2). We have participated in sub-task A for English, Danish, Greek, Arabic, and Turkish languages. We adapt and fine-tune the BERT and Multilingual Bert models made available by Google AI for English and non-English languages respectively. For the English language, we use a combination of two fine-tuned BERT models. For other languages we propose a cross-lingual augmentation approach in order to enrich training data and we use Multilingual BERT to obtain sentence representations. LIIR achieved rank 14/38, 18/47, 24/86, 24/54, and 25/40 in Greek, Turkish, English, Arabic, and Danish languages, respectively.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (2)
  1. Erfan Ghadery (5 papers)
  2. Marie-Francine Moens (102 papers)
Citations (11)

Summary

We haven't generated a summary for this paper yet.