
Focal Inferential Infusion Coupled with Tractable Density Discrimination for Implicit Hate Detection (2309.11896v2)

Published 21 Sep 2023 in cs.CL and cs.CY

Abstract: Although pretrained language models (PLMs) have achieved state-of-the-art results on many NLP tasks, they lack an understanding of subtle expressions of implicit hate speech. Various attempts have been made to enhance the detection of implicit hate by augmenting external context or enforcing label separation via distance-based metrics. Combining these two approaches, we introduce FiADD, a novel Focused Inferential Adaptive Density Discrimination framework. FiADD enhances the PLM finetuning pipeline by bringing the surface form/meaning of implicit hate speech closer to its implied form while increasing the inter-cluster distance among the various labels. We test FiADD on three implicit hate datasets and observe significant improvement in the two-way and three-way hate classification tasks. We further test the generalizability of FiADD on three other tasks in which surface and implied forms differ (detecting sarcasm, irony, and stance) and observe similar performance improvements. Finally, we analyze the generated latent space to understand its evolution under FiADD, which corroborates the advantage of employing FiADD for implicit hate speech detection.
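The two ideas the abstract combines, pulling a post's surface embedding toward the embedding of its implied meaning and pushing label clusters apart, can be illustrated with a toy objective. This is a minimal sketch under stated assumptions, not the FiADD loss: the function name `toy_objective`, the hinge on centroid distances, and the `margin` and `alpha` parameters are all hypothetical choices made for illustration, and the embeddings are plain Python lists rather than PLM outputs.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def toy_objective(surface, implied, labels, margin=1.0, alpha=0.5):
    """Illustrative objective (hypothetical, not the paper's formulation).

    surface: one embedding per post.
    implied: embedding of the implied meaning, or None when unavailable.
    labels:  class label per post.
    """
    # Pull term: mean distance between a surface embedding and its
    # implied-form embedding, over posts that have an implied form.
    pairs = [(s, i) for s, i in zip(surface, implied) if i is not None]
    pull = sum(euclidean(s, i) for s, i in pairs) / len(pairs) if pairs else 0.0

    # Push term: hinge penalty when two label centroids are closer
    # than `margin`, averaged over all centroid pairs.
    groups = {}
    for s, y in zip(surface, labels):
        groups.setdefault(y, []).append(s)
    cents = [centroid(v) for v in groups.values()]
    penalties = [
        max(0.0, margin - euclidean(cents[i], cents[j]))
        for i in range(len(cents))
        for j in range(i + 1, len(cents))
    ]
    push = sum(penalties) / len(penalties) if penalties else 0.0

    return alpha * pull + (1 - alpha) * push


surface = [[0.0, 0.0], [1.0, 0.0]]
labels = ["non_hate", "implicit_hate"]
aligned = toy_objective(surface, [None, [1.0, 0.0]], labels, margin=2.0)
misaligned = toy_objective(surface, [None, [4.0, 4.0]], labels, margin=2.0)
print(aligned, misaligned)
```

In this toy setup the objective is lower when the implicit post's surface embedding already coincides with its implied form, and it would fall further if the two label centroids moved at least `margin` apart.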

