Analyzing the Targets of Hate in Online Social Media (1603.07709v1)

Published 24 Mar 2016 in cs.SI

Abstract: Social media systems allow Internet users a congenial platform to freely express their thoughts and opinions. Although this property represents incredible and unique communication opportunities, it also brings along important challenges. Online hate speech is an archetypal example of such challenges. Despite its magnitude and scale, there is a significant gap in understanding the nature of hate speech on social media. In this paper, we provide the first of a kind systematic large scale measurement study of the main targets of hate speech in online social media. To do that, we gather traces from two social media systems: Whisper and Twitter. We then develop and validate a methodology to identify hate speech on both these systems. Our results identify online hate speech forms and offer a broader understanding of the phenomenon, providing directions for prevention and detection approaches.

Authors (5)

Leandro Silva
Mainack Mondal
Denzil Correa
Ingmar Weber
Fabricio Benevenuto

Summary

An Analysis of Hate Speech Targets in Online Social Media

The paper, "Analyzing the Targets of Hate in Online Social Media," presents a systematic, large-scale investigation of hate speech and its targets on two social media platforms, Whisper and Twitter. Its two main contributions are a measurement of the primary targets of online hate speech and a methodology for detecting such content on both platforms. The work matters because social media has transformed how people communicate while also giving expressions of hate a wide reach.

Methodological Approach

The authors collected data from Whisper and Twitter over one year, from June 2014 to June 2015. The raw corpus consists of 48.97 million whispers and 1.6 billion tweets. After restricting to English-language posts with location data where applicable, the final datasets contain 27.55 million whispers and 512 million tweets.
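As a quick check on how much data the filtering step retains, the short Python sketch below computes the retained fraction per platform from the counts quoted above. The dictionary and function names are illustrative and do not come from the paper.

```python
# Post counts quoted in the summary above (raw vs. after filtering).
RAW_POSTS = {"whisper": 48.97e6, "twitter": 1.6e9}
FILTERED_POSTS = {"whisper": 27.55e6, "twitter": 512e6}


def retention(platform):
    """Return the fraction of raw posts that survive the filtering step."""
    return FILTERED_POSTS[platform] / RAW_POSTS[platform]


for name in RAW_POSTS:
    print(f"{name}: {retention(name):.1%} of raw posts retained")
```

Under these figures, a little over half of the whispers and roughly a third of the tweets remain after filtering.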

To identify hate speech, the paper does not simply match posts against a predefined list of hate words. Instead, it introduces a sentence-structure template that captures the intensity, intent, and target of a hate message. The template is deliberately narrow, favoring precision over recall:

I <intensity> <user intent> <hate target>

This pattern was supplemented with Hatebase, a large repository of hate speech terms, to refine the detection of hate expressions and targets. Together, these methods identified 20,305 tweets and 7,604 whispers as containing hate speech. The authors validated the detection method with a precision evaluation, checking whether the matched posts agreed with human judgments of hate speech.
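The template can be approximated with a regular expression. The following Python sketch is a minimal illustration, not the authors' implementation: the word lists are hypothetical, the intensity slot is treated as optional, and the target slot is restricted to the form "<one word> people" as simplifying assumptions. A Hatebase lookup is not included.

```python
import re

# Hypothetical word lists for illustration only; the paper's own lists
# are not reproduced here.
INTENSITY = ["really", "truly", "absolutely", "just"]
INTENT = ["hate", "despise", "detest", "can't stand"]


def _alternation(words):
    """Build a regex alternation from a list of literal phrases."""
    return "|".join(re.escape(w) for w in words)


# I <intensity> <user intent> <hate target>
# Assumptions: intensity is optional; the target is "<one word> people".
PATTERN = re.compile(
    r"\bI\s+"
    r"(?:(?P<intensity>" + _alternation(INTENSITY) + r")\s+)?"
    r"(?P<intent>" + _alternation(INTENT) + r")\s+"
    r"(?P<target>\w+\s+people)\b",
    re.IGNORECASE,
)


def extract_target(post):
    """Return the lowercased target phrase if the post fits the template, else None."""
    match = PATTERN.search(post)
    if match is None:
        return None
    return match.group("target").lower()


print(extract_target("I really hate rude people"))
print(extract_target("I love nice people"))
```

A template this strict misses many hateful posts phrased differently, which is consistent with a design that trades recall for precision.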

Results and Observations

The paper presents several significant findings about hate speech on social media:

Prevalent Hate Categories: The paper identifies nine primary categories of hate speech targets, including race, behavior, and physical appearance, which together account for the majority of online hate on both platforms. Racial epithets were the predominant form of hate terms on Twitter.
Platform-Specific Dynamics: The paper notes differences between the platforms. Behavior-related hate is more pronounced on Whisper, likely because Twitter's non-anonymous nature discourages some of the more explicit expressions of hate.
Societal Implications: The findings show that online hate speech often extends beyond established hate crime categories to non-traditional targets defined by physical attributes or behaviors.
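To make the category analysis concrete, the sketch below tallies extracted target phrases by category and reports each category's share. The lookup table is a small hypothetical example, and the "uncategorized" fallback is an assumption of this sketch; neither reproduces the paper's actual labeling.

```python
from collections import Counter

# Hypothetical target-to-category lookup, for illustration only.
CATEGORY_OF = {
    "fat people": "physical",
    "short people": "physical",
    "rude people": "behavior",
    "selfish people": "behavior",
}


def category_shares(targets):
    """Map each category to its share of the given target phrases."""
    counts = Counter(
        CATEGORY_OF.get(target.lower(), "uncategorized") for target in targets
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: n / total for category, n in counts.items()}


sample = ["fat people", "rude people", "Rude people", "xyz people"]
print(category_shares(sample))
```

Comparing such shares across two datasets is one simple way to surface platform differences like the ones described above.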

Implications and Future Directions

This research advances our understanding of the landscape of online hate. Practically, the methodology could inform the design of monitoring tools and automated detection systems that identify and mitigate hate speech across social platforms. Theoretically, the work challenges prevailing assumptions about who is targeted by hate speech, and it calls for reconsidering both the definition of hate speech and the scope of its impact.

Future research could build on this foundation by using machine learning models to refine detection. Cross-cultural studies of how hate speech patterns vary across linguistic and cultural contexts could provide deeper insight. Evolving social norms and legal frameworks may also require detection technologies to be recalibrated over time as the boundaries of acceptable speech shift.

Overall, this paper is an important step toward understanding and counteracting hate in digital discourse, with implications for policy-making, community standards, and social media governance.
