2000 character limit reached
Recognizing Explicit and Implicit Hate Speech Using a Weakly Supervised Two-path Bootstrapping Approach (1710.07394v2)
Published 20 Oct 2017 in cs.CL
Abstract: In the wake of a polarizing election, social media is laden with hateful content. To address various limitations of supervised hate speech classification methods including corpus bias and huge cost of annotation, we propose a weakly supervised two-path bootstrapping approach for an online hate speech detection model leveraging large-scale unlabeled data. This system significantly outperforms hate speech detection systems that are trained in a supervised manner using manually annotated data. Applying this model on a large quantity of tweets collected before, after, and on election day reveals motivations and patterns of inflammatory language.
- Lei Gao (102 papers)
- Alexis Kuppersmith (1 paper)
- Ruihong Huang (41 papers)