UHH-LT at SemEval-2020 Task 12: Fine-Tuning of Pre-Trained Transformer Networks for Offensive Language Detection (2004.11493v2)

Published 23 Apr 2020 in cs.CL

Abstract: Fine-tuning of pre-trained transformer networks such as BERT yields state-of-the-art results for text classification tasks. Typically, fine-tuning is performed on task-specific training datasets in a supervised manner. One can also fine-tune in an unsupervised manner beforehand by further pre-training on the masked language modeling (MLM) task. Here, in-domain data for unsupervised MLM that resembles the actual classification target dataset allows for domain adaptation of the model. In this paper, we compare current pre-trained transformer networks with and without MLM fine-tuning on their performance for offensive language detection. Our MLM fine-tuned RoBERTa-based classifier officially ranks 1st in the SemEval 2020 Shared Task 12 for the English language. Further experiments with the ALBERT model even surpass this result.
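
To make the unsupervised MLM step more concrete, the following is a minimal sketch of how a single training example could be corrupted for masked language modeling. It is not the authors' pipeline: the 15% selection rate and the 80/10/10 replacement split follow the commonly cited BERT recipe rather than anything stated in this abstract, and the names `mask_tokens`, `MASK_TOKEN`, and the toy vocabulary are hypothetical, chosen only for illustration.

```python
import random

MASK_TOKEN = "<mask>"  # assumed mask symbol, for illustration only


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=0):
    """Corrupt one token sequence for masked language modeling.

    Returns (corrupted, labels). labels[i] holds the original token at
    positions the model must predict, and None everywhere else.
    The 15% rate and 80/10/10 split are assumptions taken from the
    standard BERT recipe, not from the paper's abstract.
    """
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)  # this position becomes a prediction target
            roll = rng.random()
            if roll < 0.8:
                corrupted.append(MASK_TOKEN)         # replace with mask symbol
            elif roll < 0.9:
                corrupted.append(rng.choice(vocab))  # replace with random token
            else:
                corrupted.append(tok)                # keep the original token
        else:
            labels.append(None)  # not a prediction target
            corrupted.append(tok)
    return corrupted, labels


if __name__ == "__main__":
    toy_vocab = ["the", "cat", "sat"]  # hypothetical toy vocabulary
    tweet = "you are such a lovely person".split()
    print(mask_tokens(tweet, toy_vocab, mask_prob=0.5, seed=1))
```

In a setup like the one the abstract describes, examples corrupted this way would come from unlabeled in-domain text, and the adapted model would afterwards be fine-tuned on the labeled offensive language data in the usual supervised manner.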

Authors (3)

Gregor Wiedemann (16 papers)
Seid Muhie Yimam (41 papers)
Chris Biemann (78 papers)

Citations (28)
