A Dataset for the Detection of Dehumanizing Language (2402.08764v1)
Published 13 Feb 2024 in cs.CL
Abstract: Dehumanization is a mental process that enables the exclusion and ill treatment of a group of people. In this paper, we present two data sets of dehumanizing text: a large, automatically collected corpus and a smaller, manually annotated data set. Both data sets include a combination of political discourse and dialogue from movie subtitles. Our methods yield a broad and varied collection of dehumanization data, enabling further exploratory analysis and automatic classification of dehumanization patterns. Both data sets will be publicly released.