D3CODE: Disentangling Disagreements in Data across Cultures on Offensiveness Detection and Evaluation (2404.10857v1)
Abstract: While human annotations play a crucial role in language technologies, annotator subjectivity has long been overlooked in data collection. Recent studies that have critically examined this issue are often situated in the Western context, and solely document differences across age, gender, or racial groups. As a result, NLP research on subjectivity has overlooked the fact that individuals within demographic groups may hold diverse values, which can influence their perceptions beyond their group norms. To effectively incorporate these considerations into NLP pipelines, we need datasets with extensive parallel annotations from various social and cultural groups. In this paper we introduce the D3CODE dataset: a large-scale cross-cultural dataset of parallel annotations for offensive language in over 4.5K sentences annotated by a pool of over 4K annotators, balanced across gender and age, from across 21 countries, representing eight geo-cultural regions. The dataset contains annotators' moral values captured along six moral foundations: care, equality, proportionality, authority, loyalty, and purity. Our analyses reveal substantial regional variations in annotators' perceptions that are shaped by individual moral values, offering crucial insights for building pluralistic, culturally sensitive NLP models.
Authors: Aida Mostafazadeh Davani, Mark Díaz, Dylan Baker, Vinodkumar Prabhakaran