
DICES Dataset: Diversity in Conversational AI Evaluation for Safety (2306.11247v1)

Published 20 Jun 2023 in cs.HC

Abstract: Machine learning approaches often require training and evaluation datasets with a clear separation between positive and negative examples. This risks simplifying and even obscuring the inherent subjectivity present in many tasks. Preserving such variance in content and diversity in datasets is often expensive and laborious. This is especially troubling when building safety datasets for conversational AI systems, as safety is both socially and culturally situated. To demonstrate this crucial aspect of conversational AI safety, and to facilitate in-depth model performance analyses, we introduce the DICES (Diversity In Conversational AI Evaluation for Safety) dataset, which contains fine-grained demographic information about raters, provides high replication of ratings per item to ensure statistical power for analyses, and encodes rater votes as distributions across different demographics to allow for in-depth explorations of different aggregation strategies. In short, the DICES dataset enables the observation and measurement of variance, ambiguity, and diversity in the context of conversational AI safety. We also illustrate how the dataset offers a basis for establishing metrics to show how raters' ratings can intersect with demographic categories such as racial/ethnic groups, age groups, and genders. The goal of DICES is to serve as a shared resource and benchmark that respects diverse perspectives during safety evaluation of conversational AI systems.
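The abstract contrasts collapsing rater votes into a single label with keeping the full vote distribution disaggregated by rater demographics. A minimal sketch of that difference is below; the column names (item_id, rater_gender, safety_rating) are assumptions for illustration only and may not match the released DICES schema.

```python
# Sketch: two aggregation strategies over per-rater safety labels.
# Column names are hypothetical; the released DICES files may use different names.
import pandas as pd

# Each row: one rater's judgment of one conversation item.
ratings = pd.DataFrame({
    "item_id":       [1, 1, 1, 2, 2, 2],
    "rater_gender":  ["Woman", "Man", "Woman", "Man", "Woman", "Man"],
    "safety_rating": ["Unsafe", "Safe", "Unsafe", "Safe", "Safe", "Unsafe"],
})

# Strategy A: simple majority vote per item (collapses disagreement into one label).
majority = (
    ratings.groupby("item_id")["safety_rating"]
    .agg(lambda s: s.mode().iloc[0])
)

# Strategy B: keep the vote distribution, disaggregated by a demographic group,
# so per-group disagreement stays visible instead of being averaged away.
by_group = (
    ratings.groupby(["item_id", "rater_gender"])["safety_rating"]
    .value_counts(normalize=True)
    .rename("vote_share")
    .reset_index()
)

print(majority)
print(by_group)
```

With high replication of ratings per item, Strategy B supports the kind of demographic-intersection metrics the abstract describes, whereas Strategy A discards that variance.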

Authors (8)
  1. Lora Aroyo (35 papers)
  2. Alex S. Taylor (2 papers)
  3. Christopher M. Homan (22 papers)
  4. Alicia Parrish (31 papers)
  5. Vinodkumar Prabhakaran (48 papers)
  6. Ding Wang (71 papers)
  7. Mark Diaz (10 papers)
  8. Greg Serapio-Garcia (2 papers)
Citations (25)