Examining Temporal Bias in Abusive Language Detection (2309.14146v1)

Published 25 Sep 2023 in cs.CL

Abstract: The use of abusive language online has become an increasingly pervasive problem that harms both individuals and society, with effects ranging from psychological damage to escalation into real-life violence and even death. Machine learning models have been developed to detect abusive language automatically, but they can suffer from temporal bias: the phenomenon in which topics, language use, or social norms change over time. This study investigates the nature and impact of temporal bias in abusive language detection across several languages and explores mitigation methods. We evaluate model performance on abusive datasets drawn from different time periods. Our results show that temporal bias poses a significant challenge for abusive language detection: models trained on historical data exhibit a marked drop in performance over time. We also present an extensive diachronic linguistic analysis of these datasets, aiming to explore the reasons for language evolution and performance decline. The study sheds light on the pervasive issue of temporal bias in abusive language detection across languages, offering insights into language evolution and temporal bias mitigation.
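The evaluation described in the abstract — training on data from one time period and testing on a later one — can be sketched with a chronological split. This is a minimal, hypothetical illustration using a TF-IDF and logistic-regression stand-in rather than the paper's actual models or datasets; the texts, labels, and years below are invented for demonstration.

```python
# Sketch of a chronological-split evaluation for temporal bias.
# All data is synthetic; a real study would use labeled abusive-language
# corpora from distinct time periods.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

# (text, label, year) triples; label 1 = abusive, 0 = not abusive.
data = [
    ("you are awful and stupid", 1, 2015),
    ("have a nice day friend", 0, 2015),
    ("this is a terrible person", 1, 2015),
    ("lovely weather today", 0, 2015),
    ("absolute clown take, go away loser", 1, 2021),
    ("great thread, thanks for sharing", 0, 2021),
    ("what a pathetic account, blocked", 1, 2021),
    ("congrats on the launch", 0, 2021),
]

def evaluate_period(train_year, test_year):
    """Train on one period's data and report macro-F1 on another period."""
    train = [(t, y) for t, y, yr in data if yr == train_year]
    test = [(t, y) for t, y, yr in data if yr == test_year]
    vec = TfidfVectorizer()
    X_train = vec.fit_transform([t for t, _ in train])
    X_test = vec.transform([t for t, _ in test])
    clf = LogisticRegression().fit(X_train, [y for _, y in train])
    preds = clf.predict(X_test)
    return f1_score([y for _, y in test], preds, average="macro")

# Comparing in-period performance to cross-period performance exposes
# the temporal performance drop the paper studies.
in_period = evaluate_period(2015, 2015)
cross_period = evaluate_period(2015, 2021)
print(f"in-period F1: {in_period:.2f}, cross-period F1: {cross_period:.2f}")
```

With real corpora, a drop from the in-period score to the cross-period score quantifies the temporal bias; the paper's linguistic analysis then probes why the vocabulary and norms of abuse shifted between periods.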

Authors (4)
  1. Mali Jin (10 papers)
  2. Yida Mu (14 papers)
  3. Diana Maynard (12 papers)
  4. Kalina Bontcheva (64 papers)
Citations (4)
