Automated discovery of trade-off between utility, privacy and fairness in machine learning models (2311.15691v1)
Abstract: Machine learning models are deployed as central components of decision-making and policy operations, with direct impact on individuals' lives. To act ethically and comply with government regulations, these models need to make fair decisions and protect users' privacy. However, such requirements can come with a decrease in performance compared to potentially biased, privacy-leaking counterparts. A trade-off between fairness, privacy and performance therefore emerges, and practitioners need a way to quantify this trade-off to inform deployment decisions. In this work we interpret the trade-off as a multi-objective optimization problem and propose PFairDP, a pipeline that uses Bayesian optimization to discover Pareto-optimal points between the fairness, privacy and utility of ML models. We show how PFairDP can be used to replicate known results that were previously achieved through a manual constraint-setting process, and further demonstrate its effectiveness with experiments on multiple models and datasets.
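To make the notion of Pareto-optimality between the three objectives concrete, here is a minimal sketch of a dominance filter over hypothetical (utility, privacy, fairness) evaluations of the kind such a multi-objective search would collect. All names and numbers are illustrative assumptions, not the paper's actual implementation: utility is an accuracy to maximize, privacy is a differential-privacy budget epsilon to minimize, and fairness is a group-disparity gap to minimize.

```python
def dominates(a, b):
    """Return True if point a Pareto-dominates point b.

    Each point is (accuracy, epsilon, fairness_gap):
    accuracy is maximized; epsilon and fairness_gap are minimized.
    a dominates b if it is no worse on every objective and strictly
    better on at least one.
    """
    acc_a, eps_a, gap_a = a
    acc_b, eps_b, gap_b = b
    no_worse = acc_a >= acc_b and eps_a <= eps_b and gap_a <= gap_b
    strictly_better = acc_a > acc_b or eps_a < eps_b or gap_a < gap_b
    return no_worse and strictly_better

def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

# Illustrative evaluations: (accuracy, privacy budget eps, fairness gap).
candidates = [
    (0.90, 8.0, 0.12),  # accurate but weak privacy, large gap
    (0.85, 1.0, 0.05),  # balanced
    (0.84, 1.0, 0.06),  # dominated by the point above
    (0.70, 0.5, 0.02),  # strong privacy and fairness, low accuracy
]
front = pareto_front(candidates)
```

A Bayesian optimizer over model, fairness-intervention and privacy hyperparameters would repeatedly propose candidates, evaluate the three objectives, and keep such a non-dominated set as its running Pareto front.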