Recent Advances in Hate Speech Moderation: Multimodality and the Role of Large Models

Published 30 Jan 2024 in cs.CL | (2401.16727v4)

Abstract: In the evolving landscape of online communication, moderating hate speech (HS) presents an intricate challenge, compounded by the multimodal nature of digital content. This comprehensive survey delves into the recent strides in HS moderation, spotlighting the burgeoning role of large language models (LLMs) and large multimodal models (LMMs). Our exploration begins with a thorough analysis of current literature, revealing the nuanced interplay between textual, visual, and auditory elements in propagating HS. We uncover a notable trend towards integrating these modalities, primarily due to the complexity and subtlety with which HS is disseminated. A significant emphasis is placed on the advances facilitated by LLMs and LMMs, which have begun to redefine the boundaries of detection and moderation capabilities. We identify existing gaps in research, particularly in the context of underrepresented languages and cultures, and the need for solutions to handle low-resource settings. The survey concludes with a forward-looking perspective, outlining potential avenues for future research, including the exploration of novel AI methodologies, the ethical governance of AI in moderation, and the development of more nuanced, context-aware systems. This comprehensive overview aims to catalyze further research and foster a collaborative effort towards more sophisticated, responsible, and human-centric approaches to HS moderation in the digital era. WARNING: This paper contains offensive examples.
