Detecting Offensive Language in Tweets Using Deep Learning (1801.04433v1)

Published 13 Jan 2018 in cs.CL, cs.CY, and cs.SI

Abstract: This paper addresses the important problem of discerning hateful content in social media. We propose a detection scheme that is an ensemble of Recurrent Neural Network (RNN) classifiers, and it incorporates various features associated with user-related information, such as the users' tendency towards racism or sexism. These data are fed as input to the above classifiers along with the word frequency vectors derived from the textual content. Our approach has been evaluated on a publicly available corpus of 16k tweets, and the results demonstrate its effectiveness in comparison to existing state of the art solutions. More specifically, our scheme can successfully distinguish racism and sexism messages from normal text, and achieve higher classification quality than current state-of-the-art algorithms.

Citations (214)

Summary

  • The paper introduces an ensemble of LSTM models that combine linguistic content with user behavior data for improved offensive content detection.
  • Experimental results show a high F-score of 0.9319 and significant improvement over traditional text-only methods.
  • The research provides a scalable, real-time solution for social media moderation with potential for multilingual and diverse dataset applications.

A Comprehensive Examination of Offensive Language Detection in Social Media via Deep Learning Techniques

The paper "Detecting Offensive Language in Tweets Using Deep Learning" authored by Georgios K. Pitsilis, Heri Ramampiaro, and Helge Langseth, addresses the classification of offensive content within social media, specifically focusing on tweets. Employing an ensemble of Recurrent Neural Network (RNN) classifiers, the researchers integrate user-associated behavioral data to enhance detection precision, navigating beyond conventional text-only methodologies.

Methodology

The primary focus is the use of Long Short-Term Memory (LSTM) networks, a variant of RNNs capable of exploiting past information in a sequence, to identify harmful content accurately. These networks form the core of an ensemble system designed to process linguistic content alongside user behavioral characteristics. The architecture incorporates user tendencies towards producing racist or sexist content, inferred from historical posting patterns. It diverges from the traditional reliance on pre-trained word embeddings such as GloVe or Word2Vec, opting instead for text vectorization based on word frequency, which makes the approach more language-agnostic and avoids the constraints of conventional NLP pipelines.
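
To make the architecture concrete, here is a minimal sketch of one ensemble member in a Keras-style setup. The layer sizes, input names, and the way user features are merged are illustrative assumptions, not the paper's exact specification; it shows the general pattern of an LSTM over frequency-ranked token ids combined with user-level features.

```python
# One hypothetical ensemble member: an LSTM over word-frequency-ranked
# token ids, concatenated with user-behavior features before the output
# layer. Sizes and names are illustrative, not taken from the paper.
from tensorflow import keras
from tensorflow.keras import layers

VOCAB_SIZE = 25_000   # assumed vocabulary size
SEQ_LEN = 30          # assumed max tweet length in tokens
N_USER_FEATURES = 2   # e.g. historical racism / sexism tendencies

# Text branch: token ids (assigned by corpus-frequency rank, no
# pre-trained embeddings) are embedded from scratch, then fed to an LSTM.
text_in = keras.Input(shape=(SEQ_LEN,), name="tokens")
x = layers.Embedding(VOCAB_SIZE, 128)(text_in)
x = layers.LSTM(64)(x)

# User branch: behavioral features fed in directly.
user_in = keras.Input(shape=(N_USER_FEATURES,), name="user_features")

# Merge both sources and classify into racism / sexism / neither.
merged = layers.concatenate([x, user_in])
out = layers.Dense(3, activation="softmax")(merged)

model = keras.Model(inputs=[text_in, user_in], outputs=out)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```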

The paper further highlights the combination of multiple classifier outputs. The ensemble blends individual classifications using both majority voting and confidence-based decision mechanisms, grounding the aggregated decision in a structured procedure detailed in the proposed algorithm.
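
The two aggregation rules can be sketched as follows. Each classifier is assumed to return a probability vector over the three classes; the function names and example values are illustrative, not the paper's code.

```python
# Sketch of the two aggregation mechanisms described above.
import numpy as np

CLASSES = ["racism", "sexism", "neither"]  # label set of the corpus

def majority_vote(prob_vectors):
    """Each classifier casts one vote for its argmax class."""
    votes = np.argmax(prob_vectors, axis=1)
    counts = np.bincount(votes, minlength=len(CLASSES))
    return CLASSES[int(np.argmax(counts))]

def confidence_vote(prob_vectors):
    """Sum the per-class confidences and pick the highest total."""
    totals = np.sum(prob_vectors, axis=0)
    return CLASSES[int(np.argmax(totals))]

# Example: three classifiers scoring one tweet.
probs = np.array([[0.6, 0.3, 0.1],
                  [0.2, 0.5, 0.3],
                  [0.7, 0.2, 0.1]])
print(majority_vote(probs))    # 'racism' (two of three argmax votes)
print(confidence_vote(probs))  # 'racism' (highest summed confidence)
```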

Datasets and Experimental Evaluation

The researchers validate their approach on a dataset of 16,000 tweets, using the existing annotations of racist, sexist, or neutral content provided by the prior work of Waseem and Hovy (2016). A key element of the paper is the inclusion of user behavioral features, a novel aspect within the deep-learning context for hate-speech detection; the reported correlation coefficients indicate strong user tendencies, 0.71 for racism and 0.76 for sexism.
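
One plausible way to derive such per-user tendency features from a labeled posting history is sketched below. The function and data layout are hypothetical; the paper infers tendencies from each user's prior tweets, but its exact formula is not reproduced here.

```python
# Hypothetical sketch: per-user fractions of racist and sexist tweets,
# computed from previously labeled posts.
from collections import Counter, defaultdict

def user_tendencies(labeled_tweets):
    """labeled_tweets: iterable of (user_id, label) pairs,
    with label in {'racism', 'sexism', 'neither'}.
    Returns {user_id: (racism_fraction, sexism_fraction)}."""
    counts = defaultdict(Counter)
    for user, label in labeled_tweets:
        counts[user][label] += 1
    features = {}
    for user, c in counts.items():
        total = sum(c.values())
        features[user] = (c["racism"] / total, c["sexism"] / total)
    return features

history = [("u1", "racism"), ("u1", "neither"), ("u2", "sexism")]
print(user_tendencies(history))  # {'u1': (0.5, 0.0), 'u2': (0.0, 1.0)}
```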

Numerous configurations of the proposed model were tested, covering various arrangements of classifiers and pairings of behavioral features. The experimental design uses 10-fold cross-validation, with performance measured primarily via Precision, Recall, and F-score.
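
The evaluation protocol itself is standard and can be sketched with scikit-learn. The `fit_predict` callback stands in for training and applying the full ensemble, which is an assumption for illustration; the averaging scheme shown here ("weighted") is likewise an assumption, as the paper's exact averaging is not restated in this summary.

```python
# Sketch of 10-fold cross-validation reporting mean precision,
# recall, and F-score. The classifier is abstracted behind fit_predict.
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import precision_recall_fscore_support

def evaluate(fit_predict, X, y, folds=10, seed=0):
    """fit_predict(X_train, y_train, X_test) -> predicted labels."""
    skf = StratifiedKFold(n_splits=folds, shuffle=True, random_state=seed)
    scores = []
    for train_idx, test_idx in skf.split(X, y):
        y_pred = fit_predict(X[train_idx], y[train_idx], X[test_idx])
        p, r, f, _ = precision_recall_fscore_support(
            y[test_idx], y_pred, average="weighted", zero_division=0)
        scores.append((p, r, f))
    return np.mean(scores, axis=0)  # mean precision, recall, F-score
```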

Results and Analysis

Empirical results show a marked improvement in classification performance from the ensemble configurations, reaching F-scores as high as 0.9319. The standout results come from models that incorporate user behavior as features, which demonstrably surpass baselines relying solely on linguistic content. This advancement is evident in comparative analyses against state-of-the-art approaches, including those built on neural networks and classical NLP techniques.

Despite the clear successes of the ensemble method overall, challenges remain in fine-grained class determination, such as distinguishing between the different types of offensive content. The work establishes a framework for future improvements and for exploration across different datasets and languages.

Implications and Future Directions

This paper contributes both theoretically and practically to the evolving landscape of automated hate-speech detection, emphasizing that deep learning models, particularly those incorporating nuanced user behavioral data, can enhance detection capabilities over simple text-based approaches. Practically, this research offers a scalable tool for real-time content moderation across social media platforms, addressing a critical need in digital communication landscapes.

The potential expansions of this work involve exploring alternate feature sets derived from social and behavioral analytics, as well as investigating model applicability and performance across varied and multilingual datasets. Such directions promise advancements in the accuracy and resilience of automatic content moderation systems, adapting to the dynamic challenges posed by continuously evolving social media environments. This focus on model adaptability and robustness signifies a critical trajectory for subsequent research and application in the area of AI-assisted sentiment and hate detection frameworks.