
HateMonitors: Language Agnostic Abuse Detection in Social Media (1909.12642v1)

Published 27 Sep 2019 in cs.SI and cs.CL

Abstract: Reducing hateful and offensive content in online social media poses a dual problem for moderators. On the one hand, rigid censorship cannot be imposed on social media. On the other, the free flow of such content cannot be allowed. Hence, we require an efficient abusive language detection system to detect such harmful content in social media. In this paper, we present our machine learning model, HateMonitor, developed for Hate Speech and Offensive Content Identification in Indo-European Languages (HASOC), a shared task at FIRE 2019. We have used a Gradient Boosting model, along with BERT and LASER embeddings, to make the system language agnostic. Our model ranked first in the German sub-task A. We have also made our model public at https://github.com/punyajoy/HateMonitors-HASOC .

Authors (4)
  1. Punyajoy Saha (27 papers)
  2. Binny Mathew (24 papers)
  3. Pawan Goyal (170 papers)
  4. Animesh Mukherjee (154 papers)
Citations (28)