
Discipline and Label: A WEIRD Genealogy and Social Theory of Data Annotation (2402.06811v1)

Published 9 Feb 2024 in cs.AI

Abstract: Data annotation remains the sine qua non of machine learning and AI. Recent empirical work on data annotation has begun to highlight the importance of rater diversity for fairness and model performance, and new lines of research have begun to examine the working conditions of data annotation workers, the impacts and role of annotator subjectivity on labels, and the potential psychological harms from aspects of annotation work. This paper outlines a critical genealogy of data annotation, starting with its psychological and perceptual aspects. We draw on similarities with critiques of the rise of computerized lab-based psychological experiments in the 1970s, which question whether these experiments permit the generalization of results beyond the laboratory settings within which those results are typically obtained. Do data annotations permit the generalization of results beyond the settings, or locations, in which they were obtained? Psychology is overly reliant on participants from Western, Educated, Industrialized, Rich, and Democratic (WEIRD) societies. Many data annotation platform workers, however, are not from WEIRD countries; most are based in Global South countries. Social categorizations and classifications from WEIRD countries are imposed on non-WEIRD annotators through instructions and tasks, and through them, on data, which is then used to train or evaluate AI models in WEIRD countries. We synthesize evidence from several recent lines of research and argue that data annotation is a form of automated social categorization that risks entrenching outdated and static social categories that are in reality dynamic and changing. We propose a framework for understanding the interplay of the global social conditions of data annotation with the subjective phenomenological experience of data annotation work.

