
ReZG: Retrieval-Augmented Zero-Shot Counter Narrative Generation for Hate Speech (2310.05650v2)

Published 9 Oct 2023 in cs.CL

Abstract: The proliferation of hate speech (HS) on social media poses a serious threat to societal security. Automatic counter narrative (CN) generation, as an active strategy for HS intervention, has garnered increasing attention in recent years. Existing methods for automatically generating CNs mainly rely on re-training or fine-tuning pre-trained language models (PLMs) on human-curated CN corpora. Unfortunately, the annotation speed of CN corpora cannot keep up with the growth of HS targets, and generating specific and effective CNs for unseen targets remains a significant challenge. To tackle this issue, we propose Retrieval-Augmented Zero-shot Generation (ReZG) to generate CNs with high specificity for unseen targets. Specifically, we propose a multi-dimensional hierarchical retrieval method that integrates stance, semantics, and fitness, extending the retrieval metric from a single dimension to multiple dimensions suited to knowledge that refutes HS. We then implement an energy-based constrained decoding mechanism that enables PLMs to use differentiable knowledge-preservation, countering, and fluency constraint functions, rather than in-target CNs, as control signals for generation, thereby achieving zero-shot CN generation. With these techniques, ReZG can flexibly integrate external knowledge and improve the specificity of CNs. Experimental results show that ReZG exhibits stronger generalization capability and outperforms strong baselines, with significant improvements of 2.0%+ in relevance and 4.5%+ in countering success rate.
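The multi-dimensional hierarchical retrieval described in the abstract can be pictured as a staged filter-and-rerank pipeline over candidate knowledge snippets. The sketch below is a minimal illustration under assumed interfaces: the scorer functions (`stance_score`, `semantic_score`, `fitness_score`) and all thresholds are hypothetical placeholders, whereas the paper uses learned stance, semantic-similarity, and fitness models.

```python
# Illustrative sketch of stance -> semantics -> fitness hierarchical retrieval.
# The scorer callables are placeholders, not the paper's actual models.

def hierarchical_retrieve(hate_speech, candidates,
                          stance_score, semantic_score, fitness_score,
                          stance_threshold=0.5, top_k=5):
    """Return up to top_k knowledge snippets suited to countering hate_speech."""
    # Stage 1 (stance): keep only candidates whose stance opposes the HS.
    opposing = [c for c in candidates
                if stance_score(hate_speech, c) >= stance_threshold]
    # Stage 2 (semantics): rank survivors by semantic relevance to the HS.
    opposing.sort(key=lambda c: semantic_score(hate_speech, c), reverse=True)
    # Stage 3 (fitness): re-rank a shortlist by fitness for CN generation.
    shortlist = opposing[:top_k * 2]
    shortlist.sort(key=lambda c: fitness_score(hate_speech, c), reverse=True)
    return shortlist[:top_k]
```

The retrieved snippets would then feed the energy-based constrained decoding stage, where differentiable knowledge-preservation, countering, and fluency constraints steer the PLM instead of in-target CN training examples.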
