A Dataset for the Detection of Dehumanizing Language (2402.08764v1)

Published 13 Feb 2024 in cs.CL

Abstract: Dehumanization is a mental process that enables the exclusion and ill treatment of a group of people. In this paper, we present two data sets of dehumanizing text: a large, automatically collected corpus and a smaller, manually annotated data set. Both data sets combine political discourse with dialogue from movie subtitles. Our methods give us a broad and varied collection of dehumanization data to work with, enabling further exploratory analysis and automatic classification of dehumanization patterns. Both data sets will be publicly released.
