ANDES at SemEval-2020 Task 12: A jointly-trained BERT multilingual model for offensive language detection (2008.06408v1)
Published 13 Aug 2020 in cs.CL
Abstract: This paper describes our participation in SemEval-2020 Task 12: Multilingual Offensive Language Detection. We jointly trained a single model by fine-tuning Multilingual BERT to tackle the task across all the proposed languages: English, Danish, Turkish, Greek, and Arabic. Our single model achieved competitive results, with performance close to that of the top-performing systems despite sharing the same parameters across all languages. We also conducted zero-shot and few-shot experiments to analyze transfer performance among these languages. We make our code public for further research.
- Juan Manuel Pérez
- Aymé Arango
- Franco Luque
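
The joint-training and zero-shot setups mentioned in the abstract can be illustrated with a minimal data-preparation sketch. It is not the authors' code: the function names `pool_languages` and `zero_shot_split`, the toy examples, and the 0/1 labels are hypothetical placeholders, and the model fine-tuning step itself is omitted. The sketch only shows how examples from all five languages might be merged into one training set for a single shared model, and how a zero-shot split might hold out one target language.

```python
import random

# The five task languages named in the abstract.
LANGUAGES = ["English", "Danish", "Turkish", "Greek", "Arabic"]


def pool_languages(datasets, languages, seed=0):
    """Merge per-language (text, label) pairs into one shuffled training set."""
    pooled = [
        (text, label, lang)
        for lang in languages
        for text, label in datasets[lang]
    ]
    # Shuffle so that batches mix languages (an assumption of this sketch).
    random.Random(seed).shuffle(pooled)
    return pooled


def zero_shot_split(datasets, target):
    """Train on every language except `target`; evaluate on `target` only."""
    sources = [lang for lang in datasets if lang != target]
    train = pool_languages(datasets, sources)
    test = [(text, label, target) for text, label in datasets[target]]
    return train, test


# Toy placeholder data: three fake examples per language with 0/1 labels.
toy = {
    lang: [(f"{lang} example {i}", i % 2) for i in range(3)]
    for lang in LANGUAGES
}

joint_train = pool_languages(toy, LANGUAGES)        # joint training set
zs_train, zs_test = zero_shot_split(toy, "Danish")  # Danish held out
```

In this sketch a few-shot variant would simply move a small number of target-language examples from the held-out set into the training set; how many examples the paper used is not stated in the abstract.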