TIDE: Textual Identity Detection for Evaluating and Augmenting Classification and Language Models (2309.04027v2)

Published 7 Sep 2023 in cs.CL and cs.LG

Abstract: Machine learning models can perpetuate unintended biases from unfair and imbalanced datasets. Evaluating and debiasing these datasets and models is especially hard in text datasets where sensitive attributes such as race, gender, and sexual orientation may not be available. When these models are deployed into society, they can lead to unfair outcomes for historically underrepresented groups. In this paper, we present a dataset coupled with an approach to improve text fairness in classifiers and LLMs. We create a new, more comprehensive identity lexicon, TIDAL, which includes 15,123 identity terms and associated sense context across three demographic categories. We leverage TIDAL to develop an identity annotation and augmentation tool that can be used to improve the availability of identity context and the effectiveness of ML fairness techniques. We evaluate our approaches using human contributors, and additionally run experiments focused on dataset and model debiasing. Results show our assistive annotation technique improves the reliability and velocity of human-in-the-loop processes. Our dataset and methods uncover more disparities during evaluation, and also produce more fair models during remediation. These approaches provide a practical path forward for scaling classifier and generative model fairness in real-world settings.
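The abstract describes a lexicon-driven workflow: detect identity terms in text, attach demographic context, and use those annotations to evaluate and remediate classifiers, for example through identity-term augmentation. The sketch below is a rough illustration only, assuming a toy term-to-category lexicon; the field names, categories, and matching and swapping logic are placeholders, not the TIDAL schema or the authors' released tooling (TIDAL also stores sense context to disambiguate terms, which is omitted here).

```python
# Minimal, illustrative sketch of lexicon-based identity annotation and
# counterfactual augmentation. The lexicon contents and data model below are
# assumptions for illustration, not the paper's TIDAL schema or tooling.
import re
from dataclasses import dataclass

@dataclass
class IdentityMatch:
    term: str    # surface form matched in the text
    group: str   # demographic category the term belongs to
    start: int   # character offset where the match begins
    end: int     # character offset where the match ends

# Toy lexicon: term -> demographic category (placeholder entries).
LEXICON = {
    "woman": "gender",
    "man": "gender",
    "gay": "sexual_orientation",
    "lesbian": "sexual_orientation",
}

def annotate(text: str) -> list[IdentityMatch]:
    """Return identity-term spans found via simple word-boundary matching."""
    matches = []
    for term, group in LEXICON.items():
        for m in re.finditer(rf"\b{re.escape(term)}\b", text, flags=re.IGNORECASE):
            matches.append(IdentityMatch(m.group(0), group, m.start(), m.end()))
    return sorted(matches, key=lambda x: x.start)

def counterfactuals(text: str) -> list[str]:
    """Produce augmented copies of `text`, swapping each matched identity term
    for another term from the same demographic category (one swap per copy)."""
    augmented = []
    for match in annotate(text):
        for alt, group in LEXICON.items():
            if group == match.group and alt.lower() != match.term.lower():
                augmented.append(text[:match.start] + alt + text[match.end:])
    return augmented

if __name__ == "__main__":
    sample = "The woman applied for the loan."
    print(annotate(sample))        # detected identity spans with categories
    print(counterfactuals(sample)) # same-category swapped variants
```

In this toy setup, the annotated spans would feed group-wise evaluation (e.g., comparing classifier error rates across categories), while the swapped variants would feed counterfactual-style data augmentation; the paper's actual annotation and augmentation pipeline is human-in-the-loop and considerably richer.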

Authors (2)
  1. Emmanuel Klu (3 papers)
  2. Sameer Sethi (3 papers)
