
How toxic is antisemitism? Potentials and limitations of automated toxicity scoring for antisemitic online content (2310.04465v1)

Published 5 Oct 2023 in cs.CL, cs.AI, and cs.CY

Abstract: The Perspective API, a popular text toxicity assessment service by Google and Jigsaw, has found wide adoption in several application areas, notably content moderation, monitoring, and social media research. We examine its potentials and limitations for the detection of antisemitic online content that, by definition, falls under the toxicity umbrella term. Using a manually annotated German-language dataset comprising around 3,600 posts from Telegram and Twitter, we explore how toxic antisemitic texts are rated and how the toxicity scores vary across subforms of antisemitism and with the stance expressed in the texts. We show that, on a basic level, Perspective API recognizes antisemitic content as toxic, but that it has critical weaknesses with respect to non-explicit forms of antisemitism and texts taking a critical stance towards it. Furthermore, using simple text manipulations, we demonstrate that the use of widespread antisemitic codes can substantially reduce API scores, making it rather easy to bypass content moderation based on the service's results.
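
The scoring workflow summarized in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the helper names (`build_request`, `extract_toxicity`, `mean_score_by_group`) and the record fields `stance` and `toxicity` are hypothetical, and the request and response shapes are assumed to match the publicly documented Perspective API `comments:analyze` format. No network call is made; a real request would also need an API key.

```python
from statistics import mean

# Assumed endpoint of the Perspective API (requires an API key to call).
ANALYZE_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"


def build_request(text, language="de"):
    """Build the JSON body asking for a TOXICITY score for one post."""
    return {
        "comment": {"text": text},
        "languages": [language],
        "requestedAttributes": {"TOXICITY": {}},
    }


def extract_toxicity(response):
    """Read the summary TOXICITY score (0 to 1) from a parsed response."""
    return response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]


def mean_score_by_group(records, key):
    """Average the 'toxicity' field of annotated records per value of `key`."""
    groups = {}
    for record in records:
        groups.setdefault(record[key], []).append(record["toxicity"])
    return {group: mean(scores) for group, scores in groups.items()}
```

With scored posts stored as dictionaries, `mean_score_by_group(records, "stance")` gives one average per stance label, which is the kind of comparison the abstract describes for subforms and stance. Scoring a post before and after a text manipulation and comparing the two values follows the same pattern.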

Authors (2)
  1. Helena Mihaljević
  2. Elisabeth Steffen
Citations (1)