
Fairness and Unfairness in Binary and Multiclass Classification: Quantifying, Calculating, and Bounding (2206.03234v2)

Published 7 Jun 2022 in cs.LG, cs.CY, and stat.ML

Abstract: We propose a new interpretable measure of unfairness that allows a quantitative analysis of classifier fairness, beyond a dichotomous fair/unfair distinction. We show how this measure can be calculated when the classifier's conditional confusion matrices are known. We further propose methods for auditing classifiers for their fairness when the confusion matrices cannot be obtained or even estimated. Our approach lower-bounds the unfairness of a classifier based only on aggregate statistics, which may be provided by the owner of the classifier or collected from freely available data. We use the equalized odds criterion, which we generalize to the multiclass case. We report experiments on data sets representing diverse applications, which demonstrate the effectiveness and the wide range of possible uses of the proposed methodology. An implementation of the procedures proposed in this paper, as well as the code for running the experiments, is provided at https://github.com/sivansabato/unfairness.
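
To make the quantities in the abstract concrete, here is a minimal Python sketch (not the paper's own measure or implementation, which are in the full text and the linked repository). It builds the per-group conditional confusion matrices the paper works with and reports a simple worst-case equalized-odds gap between groups; the function names and the specific gap statistic are illustrative assumptions.

```python
import numpy as np

def group_confusion_matrices(y_true, y_pred, groups, n_classes):
    """Conditional confusion matrix per protected group.

    Entry [i, j] of a group's matrix estimates
    P(prediction = j | true label = i, group), which generalizes the
    binary TPR/FPR rates used by equalized odds to the multiclass case.
    """
    matrices = {}
    for g in np.unique(groups):
        mask = groups == g
        cm = np.zeros((n_classes, n_classes))
        for t, p in zip(y_true[mask], y_pred[mask]):
            cm[t, p] += 1
        row_sums = cm.sum(axis=1, keepdims=True)
        # Row-normalize; rows with no examples stay all-zero.
        matrices[g] = np.divide(cm, row_sums,
                                out=np.zeros_like(cm),
                                where=row_sums > 0)
    return matrices

def equalized_odds_gap(matrices):
    """Largest discrepancy, over all (true, predicted) label pairs,
    between any two groups' conditional prediction rates.

    Zero iff equalized odds holds exactly on this sample; larger values
    indicate a larger violation. This is one crude scalar summary, not
    the interpretable unfairness measure defined in the paper.
    """
    stacked = np.stack(list(matrices.values()))  # (group, true, pred)
    return float((stacked.max(axis=0) - stacked.min(axis=0)).max())

# Tiny usage example with two groups and binary labels.
y_true = np.array([0, 1, 1, 0, 1, 0, 1, 1])
y_pred = np.array([0, 1, 0, 0, 1, 1, 1, 0])
groups = np.array([0, 0, 0, 0, 1, 1, 1, 1])
mats = group_confusion_matrices(y_true, y_pred, groups, n_classes=2)
print(equalized_odds_gap(mats))
```

The paper goes further than a gap statistic of this kind: it turns per-group discrepancies into an interpretable unfairness measure and shows how to lower-bound that measure from aggregate statistics alone, without access to the confusion matrices themselves.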

