
Characterizing (Un)moderated Textual Data in Social Systems (2101.00963v1)

Published 4 Jan 2021 in cs.SI

Abstract: Despite the valuable social interactions that online media promote, these systems also provide space for speech that is potentially detrimental to different groups of people. The content moderation imposed by many social media platforms has motivated the emergence of Gab, a new social system for free speech that lacks content moderation. This article characterizes and compares moderated textual data from Twitter with a set of unmoderated data from Gab. In particular, we analyze the distinguishing characteristics of moderated and unmoderated content in terms of linguistic features, and we evaluate hate speech and its different forms in both environments. Our work shows that unmoderated content presents different psycholinguistic features, more negative sentiment, and higher toxicity. Our findings support the view that unmoderated environments may have proportionally more online hate speech. We hope our analysis and findings contribute to the debate about hate speech and benefit systems that aim to deploy hate speech detection approaches.

Authors (5)
  1. Fabricio Benevenuto (14 papers)
  2. Lucas Henrique Costa de Lima (1 paper)
  3. Julio Reis (2 papers)
  4. Philipe Melo (8 papers)
  5. Fabricio Murai (29 papers)