BAN-PL: a Novel Polish Dataset of Banned Harmful and Offensive Content from Wykop.pl web service
Abstract: With the Internet flooded with hate, automated moderation of online content has become one of the main tasks for NLP experts. However, advancements in this field require improved access to publicly available, accurate, and non-synthetic datasets of social media content. For the Polish language, such resources are very limited. In this paper, we address this gap by presenting a new open dataset of offensive social media content for the Polish language. The dataset comprises content from Wykop.pl, a popular online service often referred to as the "Polish Reddit", that was reported by users and banned in the internal moderation process. It contains a total of 691,662 posts and comments, evenly divided into two categories: "harmful" and "neutral" ("non-harmful"). An anonymized subset of the BAN-PL dataset consisting of 24,000 samples (12,000 for each class), along with preprocessing scripts, has been made publicly available. Furthermore, the paper offers insights into real-life content moderation processes and analyzes the linguistic features and content characteristics of the dataset. A comprehensive anonymization procedure is also described and applied. Finally, the prevalent biases encountered in similar datasets, including post-moderation and pre-selection biases, are discussed.