Machines Do See Color: A Guideline to Classify Different Forms of Racist Discourse in Large Corpora (2401.09333v2)
Abstract: Current methods to identify and classify racist language in text rely on small-n qualitative approaches or large-n approaches focusing exclusively on overt forms of racist discourse. This article provides a step-by-step generalizable guideline to identify and classify different forms of racist discourse in large corpora. In our approach, we start by conceptualizing racism and its different manifestations. We then contextualize these racist manifestations to the time and place of interest, which allows researchers to identify their discursive form. Finally, we apply XLM-RoBERTa (XLM-R), a cross-lingual model for supervised text classification with a cutting-edge contextual understanding of text. We show that XLM-R and XLM-R-Racismo, our pretrained model, outperform other state-of-the-art approaches in classifying racism in large corpora. We illustrate our approach using a corpus of tweets relating to the Ecuadorian indígena community between 2018 and 2021.