A Comprehensive View of the Biases of Toxicity and Sentiment Analysis Methods Towards Utterances with African American English Expressions (2401.12720v1)
Abstract: Language is a dynamic aspect of our culture that changes when expressed in different technologies/communities. Online social networks have enabled the diffusion and evolution of different dialects, including African American English (AAE). However, this increased usage is not without barriers. One particular barrier is how sentiment analysis methods (Vader, TextBlob, and Flair) and toxicity detection methods (Google's Perspective and the open-source Detoxify) present biases towards utterances with AAE expressions. To understand this bias, consider Google's Perspective: an utterance such as "All n*ggers deserve to die respectfully. The police murder us." receives a higher toxicity score than "African-Americans deserve to die respectfully. The police murder us.". This score difference likely arises because the tool cannot understand the re-appropriation of the term "n*gger". One explanation for this bias is that AI models are trained on limited datasets, in which such a term is more likely to appear in a toxic utterance. While this may be plausible, the tool nevertheless makes mistakes. Here, we study bias on two Web-based (YouTube and Twitter) datasets and two spoken English datasets. Our analysis shows that most models present biases towards AAE in most settings. We isolate the impact of AAE expression usage via linguistic control features from the Linguistic Inquiry and Word Count (LIWC) software, grammatical control features extracted via Part-of-Speech (PoS) tagging from NLP models, and the semantics of utterances by comparing sentence embeddings from recent LLMs. We present consistent results on how heavy usage of AAE expressions may cause the speaker to be considered substantially more toxic, even when speaking about nearly the same subject. Our study complements similar analyses that focus on small datasets and/or a single method.
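The paired-utterance probe in the abstract can be sketched in a few lines: score two near-identical utterances that differ only in an AAE expression, then compare the scores. The `toy_toxicity_score` function below is a hypothetical lexicon-based stand-in for a real API such as Perspective or Detoxify (which require a key or a model download); only the comparison methodology reflects the paper.

```python
# Sketch of the paired-utterance bias probe. The scorer is a hypothetical
# stand-in: it counts lexicon hits, unlike the real ML-based toxicity APIs.

def toy_toxicity_score(text: str) -> float:
    """Hypothetical lexicon scorer standing in for a toxicity API."""
    flagged = {"die", "murder", "n*ggers"}
    tokens = text.lower().replace(".", "").split()
    return sum(t in flagged for t in tokens) / len(tokens)

def bias_gap(aae_utterance: str, sae_utterance: str) -> float:
    """Toxicity difference between paraphrases of nearly the same content."""
    return toy_toxicity_score(aae_utterance) - toy_toxicity_score(sae_utterance)

pair = (
    "All n*ggers deserve to die respectfully. The police murder us.",
    "African-Americans deserve to die respectfully. The police murder us.",
)
gap = bias_gap(*pair)
print(f"toxicity gap (AAE - SAE): {gap:+.3f}")  # positive: AAE scored more toxic
```

With a real scorer, a consistently positive gap over many such pairs (after controlling for LIWC and PoS features, as the paper does) is the signal of dialect bias.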
- Guilherme H. Resende
- Luiz F. Nery
- Fabrício Benevenuto
- Savvas Zannettou
- Flavio Figueiredo