NaijaHate: Evaluating Hate Speech Detection on Nigerian Twitter Using Representative Data (2403.19260v3)
Abstract: To address the global issue of online hate, hate speech detection (HSD) systems are typically developed on datasets from the United States, thereby failing to generalize to English dialects from the Majority World. Furthermore, HSD models are often evaluated on non-representative samples, raising concerns about overestimating model performance in real-world settings. In this work, we introduce NaijaHate, the first dataset annotated for HSD which contains a representative sample of Nigerian tweets. We demonstrate that HSD evaluated on biased datasets traditionally used in the literature consistently overestimates real-world performance by at least two-fold. We then propose NaijaXLM-T, a pretrained model tailored to the Nigerian Twitter context, and establish the key role played by domain-adaptive pretraining and finetuning in maximizing HSD performance. Finally, owing to the modest performance of HSD systems in real-world conditions, we find that content moderators would need to review about ten thousand Nigerian tweets flagged as hateful daily to moderate 60% of all hateful content, highlighting the challenges of moderating hate speech at scale as social media usage continues to grow globally. Taken together, these results pave the way towards robust HSD systems and better protection of social media users from hateful content in low-resource settings.
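The two quantitative claims in the abstract (inflated performance on biased samples, and the daily review workload) both follow from how precision depends on the share of hateful content. The sketch below illustrates that arithmetic only. The function names and every numeric input (true and false positive rates, prevalences, daily volume of hateful tweets) are hypothetical and chosen for illustration; they are not figures from the paper, and this is not the authors' evaluation code.

```python
def precision_at_prevalence(tpr: float, fpr: float, prevalence: float) -> float:
    """Precision of a classifier with fixed true/false positive rates
    when a share `prevalence` of all tweets is hateful."""
    flagged_hateful = tpr * prevalence          # hateful tweets correctly flagged
    flagged_benign = fpr * (1.0 - prevalence)   # benign tweets wrongly flagged
    total_flagged = flagged_hateful + flagged_benign
    if total_flagged == 0:
        raise ValueError("classifier flags nothing; precision is undefined")
    return flagged_hateful / total_flagged


def flagged_to_review(n_hateful: float, recall: float, precision: float) -> float:
    """Flagged tweets moderators must review to catch a share `recall`
    of `n_hateful` hateful tweets at the given precision."""
    if not 0 < precision <= 1:
        raise ValueError("precision must be in (0, 1]")
    return recall * n_hateful / precision


# Hypothetical classifier and prevalences (assumptions, not values from the paper).
biased = precision_at_prevalence(tpr=0.6, fpr=0.01, prevalence=0.3)
representative = precision_at_prevalence(tpr=0.6, fpr=0.01, prevalence=0.001)
print(f"precision on a hate-enriched sample:  {biased:.3f}")
print(f"precision on a representative sample: {representative:.3f}")

# Hypothetical workload: 500 hateful tweets per day, 60% recall, 5% precision.
print(f"flagged tweets to review per day: {flagged_to_review(500, 0.6, 0.05):.0f}")
```

With the same classifier, precision drops sharply as hateful content becomes rarer, which is why an evaluation set enriched with hateful examples can overstate real-world performance, and why low precision translates into a large review queue for a fixed recall target.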
Authors:
- Manuel Tonneau
- Pedro Vitor Quinta de Castro
- Karim Lasri
- Ibrahim Farouq
- Lakshminarayanan Subramanian
- Victor Orozco-Olvera
- Samuel P. Fraiberger