NLP for Counterspeech against Hate: A Survey and How-To Guide (2403.20103v1)

Published 29 Mar 2024 in cs.CL

Abstract: In recent years, counterspeech has emerged as one of the most promising strategies to fight online hate. These non-escalatory responses tackle online abuse while preserving the freedom of speech of the users, and can have a tangible impact in reducing online and offline violence. Recently, there has been growing interest from the NLP community in addressing the challenges of analysing, collecting, classifying, and automatically generating counterspeech, to reduce the huge burden of manually producing it. In particular, researchers have taken different directions in addressing these challenges, thus providing a variety of related tasks and resources. In this paper, we provide a guide for doing research on counterspeech, by describing - with detailed examples - the steps to undertake, and providing best practices that can be learnt from the NLP studies on this topic. Finally, we discuss open challenges and future directions of counterspeech research in NLP.

Authors (4)
  1. Helena Bonaldi
  2. Yi-Ling Chung
  3. Gavin Abercrombie
  4. Marco Guerini
