Repairing Regressors for Fair Binary Classification at Any Decision Threshold (2203.07490v4)
Abstract: We study the problem of post-processing a supervised machine-learned regressor to maximize fairness in binary classification at all decision thresholds. By decreasing the statistical distance between each group's score distributions, we show that we can increase fair performance across all thresholds at once, and that we can do so without a large decrease in accuracy. To this end, we introduce a formal measure of Distributional Parity, which captures the degree of similarity in the distributions of classifications for different protected groups. Our main contribution is a novel post-processing algorithm based on optimal transport, which provably maximizes Distributional Parity, thereby attaining common notions of group fairness like Equalized Odds or Equal Opportunity at all thresholds. We demonstrate on two fairness benchmarks that our technique works well empirically, while also outperforming and generalizing similar techniques from related work.
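To make the idea of shrinking the distance between group score distributions concrete, the following is a minimal sketch and not the authors' algorithm. It assumes one-dimensional scores, where the Wasserstein-2 barycenter has a quantile function equal to the weighted average of the group quantile functions, and it maps every group fully onto that barycenter. The names `repair_scores`, `max_rate_gap`, and `n_quantiles` are hypothetical, the data are synthetic, and the gap measured here is only a difference in positive-prediction rates, not the paper's Distributional Parity measure.

```python
import numpy as np


def repair_scores(scores, groups, n_quantiles=101):
    """Map each group's scores onto a shared 1D Wasserstein-2 barycenter.

    Illustrative assumption: a full repair on a fixed quantile grid.
    """
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    qs = np.linspace(0.0, 1.0, n_quantiles)

    labels, counts = np.unique(groups, return_counts=True)
    weights = counts / counts.sum()

    # Empirical quantile function of each group on a common grid.
    group_q = np.stack([np.quantile(scores[groups == g], qs) for g in labels])

    # In 1D, the barycenter's quantile function is the weighted
    # average of the group quantile functions.
    bary_q = weights @ group_q

    repaired = np.empty_like(scores)
    for g, gq in zip(labels, group_q):
        mask = groups == g
        # Approximate within-group CDF value of each score ...
        ranks = np.interp(scores[mask], gq, qs)
        # ... then read off the barycenter quantile at that level.
        repaired[mask] = np.interp(ranks, qs, bary_q)
    return repaired


def max_rate_gap(scores, groups, thresholds):
    """Largest gap in positive-prediction rate between groups 0 and 1."""
    return max(
        abs((scores[groups == 0] > t).mean() - (scores[groups == 1] > t).mean())
        for t in thresholds
    )


# Synthetic demo: two groups with very different score distributions.
rng = np.random.default_rng(0)
scores = np.concatenate([rng.beta(2, 5, size=2000), rng.beta(5, 2, size=2000)])
groups = np.array([0] * 2000 + [1] * 2000)
thresholds = np.linspace(0.05, 0.95, 19)

repaired = repair_scores(scores, groups)
gap_before = max_rate_gap(scores, groups, thresholds)
gap_after = max_rate_gap(repaired, groups, thresholds)
```

Because both interpolation steps are monotone, the ordering of scores within each group is preserved, while the two groups end up with nearly the same score distribution, so the rate gap shrinks at every threshold in the grid at once rather than at a single tuned cutoff.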
Authors: Kweku Kwegyir-Aggrey, A. Feder Cooper, Jessica Dai, John Dickerson, Keegan Hines, Suresh Venkatasubramanian