Balancing Fairness and Accuracy in Data-Restricted Binary Classification (2403.07724v1)
Abstract: Applications that deal with sensitive information may have restrictions placed on the data available to a machine learning (ML) classifier. For example, in some applications, a classifier may not have direct access to sensitive attributes, affecting its ability to produce accurate and fair decisions. This paper proposes a framework that models the trade-off between accuracy and fairness under four practical scenarios that dictate the type of data available for analysis. Prior works examine this trade-off by analyzing the outputs of a scoring function that has been trained to implicitly learn the underlying distribution of the feature vector, class label, and sensitive attribute of a dataset. In contrast, our framework directly analyzes the behavior of the optimal Bayesian classifier on this underlying distribution by constructing a discrete approximation of it from the dataset itself. This approach enables us to formulate multiple convex optimization problems, which allow us to answer the question: How is the accuracy of a Bayesian classifier affected in different data-restricting scenarios when it is constrained to be fair? Analysis is performed on a set of fairness definitions that includes both group and individual fairness. Experiments on three datasets demonstrate the utility of the proposed framework as a tool for quantifying the trade-offs among different fairness notions and their distributional dependencies.
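To make the abstract's core idea concrete: once the joint distribution of (feature cell, sensitive attribute, label) is approximated by a discrete table, choosing the accuracy-optimal randomized classifier subject to a group-fairness constraint becomes a linear (hence convex) program. The sketch below is illustrative only and is not the paper's exact formulation: it assumes a hypothetical distribution `P`, uses demographic parity with tolerance `eps` as the fairness notion, and solves the program with `scipy.optimize.linprog`.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
# Hypothetical discrete joint distribution P[x, a, y] over
# 4 feature cells, 2 groups, 2 labels (entries sum to 1).
P = rng.random((4, 2, 2))
P /= P.sum()

n_cells, n_groups = P.shape[0], P.shape[1]
n_vars = n_cells * n_groups  # one acceptance probability t(x, a) per cell/group

# Accuracy = sum_{x,a} [ t(x,a) P(x,a,y=1) + (1 - t(x,a)) P(x,a,y=0) ].
# linprog minimizes, so negate the coefficient of t; the constant
# sum of P(x,a,y=0) is added back at the end.
c = -(P[:, :, 1] - P[:, :, 0]).ravel()

# Demographic parity: | P(Yhat=1 | a=0) - P(Yhat=1 | a=1) | <= eps,
# encoded as two linear inequalities on t.
eps = 0.05
Pa = P.sum(axis=2)            # P(x, a)
cond = Pa / Pa.sum(axis=0)    # P(x | a)
row_mat = np.zeros((n_cells, n_groups))
row_mat[:, 0] = cond[:, 0]
row_mat[:, 1] = -cond[:, 1]
row = row_mat.ravel()
A_ub = np.vstack([row, -row])
b_ub = np.array([eps, eps])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0.0, 1.0)] * n_vars)
t = res.x.reshape(n_cells, n_groups)          # fair randomized classifier
fair_acc = P[:, :, 0].sum() - res.fun          # accuracy under the constraint
```

Comparing `fair_acc` against the unconstrained Bayes accuracy, `P[:, :, 0].sum() + np.clip(P[:, :, 1] - P[:, :, 0], 0, None).sum()`, quantifies the cost of fairness on this distribution; sweeping `eps` traces out the trade-off curve.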
- Zachary McBride Lazri
- Danial Dervovic
- Antigoni Polychroniadou
- Ivan Brugere
- Dana Dachman-Soled
- Min Wu