
Multiaccuracy: Black-Box Post-Processing for Fairness in Classification (1805.12317v2)

Published 31 May 2018 in cs.LG and stat.ML

Abstract: Prediction systems are successfully deployed in applications ranging from disease diagnosis, to predicting credit worthiness, to image recognition. Even when the overall accuracy is high, these systems may exhibit systematic biases that harm specific subpopulations; such biases may arise inadvertently due to underrepresentation in the data used to train a machine-learning model, or as the result of intentional malicious discrimination. We develop a rigorous framework of multiaccuracy auditing and post-processing to ensure accurate predictions across identifiable subgroups. Our algorithm, MULTIACCURACY-BOOST, works in any setting where we have black-box access to a predictor and a relatively small set of labeled data for auditing; importantly, this black-box framework allows for improved fairness and accountability of predictions, even when the predictor is minimally transparent. We prove that MULTIACCURACY-BOOST converges efficiently and show that if the initial model is accurate on an identifiable subgroup, then the post-processed model will be also. We experimentally demonstrate the effectiveness of the approach to improve the accuracy among minority subgroups in diverse applications (image classification, finance, population health). Interestingly, MULTIACCURACY-BOOST can improve subpopulation accuracy (e.g. for "black women") even when the sensitive features (e.g. "race", "gender") are not given to the algorithm explicitly.

Citations (314)

Summary

  • The paper presents Multiaccuracy Boost, a novel algorithm that iteratively refines predictions to achieve fairness across varied subpopulations.
  • It employs a small validation set and an auditing process, adjusting model outputs without direct access to sensitive features.
  • Empirical tests in domains like image, financial, and healthcare tasks show improved subgroup accuracy while preserving overall model performance.

Overview of Multiaccuracy: Black-Box Post-Processing for Fairness in Classification

The paper introduces a methodological framework for post-processing the outputs of prediction systems to ensure fairness across subpopulations. The approach, termed Multiaccuracy Boost, is a black-box post-processing mechanism that addresses systematic biases in machine-learning models, whether those biases arise inadvertently from underrepresentation in the training data or from deliberate discrimination.

Key Contributions

  • Multiaccuracy Framework: The paper develops a novel paradigm called "multiaccuracy," which demands unbiased predictions across statistically identifiable subpopulations. It delineates a setting where, given black-box access to a classifier and a small labeled validation set, an algorithm can certify that predictions are unbiased. The central tool is a learning algorithm called an auditor, which detects residual bias and is used to adjust predictions iteratively.
  • Algorithmic Development: A key contribution is the Multiaccuracy Boost algorithm, which iteratively refines predictions until they satisfy the multiaccuracy criterion. A notable property of the algorithm is that it requires only black-box access to the underlying model, making it applicable to a wide range of pre-existing systems.
  • Empirical Evaluation: The practicality of Multiaccuracy Boost is validated empirically on tasks in image classification, finance, and population health. The results demonstrate improved accuracy among minority subgroups, showing that the algorithm can boost subgroup accuracy even when sensitive features such as race or gender are not supplied to it explicitly.
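The audit-and-update loop described above can be sketched in a few lines of NumPy. This is a simplified illustration, not the paper's exact algorithm: the function names and toy data are ours, the auditor is a plain least-squares linear regression on the residuals, and the correction step is additive (the paper uses a multiplicative-weights-style update and admits more general auditor classes).

```python
import numpy as np

rng = np.random.default_rng(0)

def audit(x_val, residual):
    """Least-squares auditor: fit a linear function of the features to the
    residuals. Any learner that can detect correlation between the features
    and the residuals can play this role."""
    X = np.column_stack([x_val, np.ones(len(x_val))])  # add an intercept
    coef, *_ = np.linalg.lstsq(X, residual, rcond=None)
    return X @ coef  # auditor's estimate h(x) of the residual

def multiaccuracy_boost(predict, x_val, y_val, alpha=1e-3, eta=0.5, max_iter=50):
    """Simplified post-processing loop in the spirit of Multiaccuracy Boost.

    `predict` is called only once, as a black box, to obtain initial scores;
    all later updates adjust a stored copy of those scores, so the underlying
    model is never retrained."""
    f = np.clip(predict(x_val), 1e-6, 1 - 1e-6)  # current scores on audit set
    for _ in range(max_iter):
        residual = f - y_val
        h = audit(x_val, residual)
        # Stop once the auditor can no longer find a correlated direction.
        if np.mean(h * residual) <= alpha:
            break
        # Additive correction step (a simplification of the paper's update).
        f = np.clip(f - eta * h, 1e-6, 1 - 1e-6)
    return f

# Toy demo: a biased base predictor that under-scores one subgroup.
x = rng.normal(size=(2000, 3))
group = (x[:, 0] > 0).astype(float)               # subgroup defined by feature 0
p_true = 1 / (1 + np.exp(-(x[:, 1] + 0.8 * group)))
y = rng.binomial(1, p_true).astype(float)

def biased(xs):
    # Base model that ignores the subgroup-relevant signal in feature 0.
    return 1 / (1 + np.exp(-xs[:, 1]))

f_post = multiaccuracy_boost(biased, x, y)

for g in (0.0, 1.0):
    m = group == g
    print(f"group {g:.0f}: base err {np.mean(np.abs(biased(x)[m] - y[m])):.3f}, "
          f"post err {np.mean(np.abs(f_post[m] - y[m])):.3f}")
```

Note that the subgroup indicator is never passed to the post-processor; the auditor finds the residual bias through its correlation with the raw features, which mirrors the paper's observation that sensitive attributes need not be given explicitly.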

Theoretical Underpinnings

The authors offer rigorous proofs of the convergence of Multiaccuracy Boost, establishing its computational efficiency and showing that accuracy on identifiable subgroups is maintained or improved. Importantly, the algorithm satisfies a "do-no-harm" principle, guaranteeing that accuracy does not degrade on subgroups where the baseline model already performs well.
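Informally, the condition the algorithm enforces can be stated as follows (a paraphrase of the paper's definition, in our notation): a predictor f is α-multiaccurate with respect to a class C of auditor functions if no auditor in C can detect residual bias larger than α,

```latex
\left| \mathop{\mathbb{E}}_{x \sim \mathcal{D}}
  \big[\, c(x)\,\big(f(x) - y(x)\big) \,\big] \right| \le \alpha
\qquad \text{for all } c \in \mathcal{C}.
```

Each boosting iteration finds an auditor violating this bound and corrects the predictions in that direction, which is why convergence leaves no detectable subgroup bias within the class C.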

Implications and Future Directions

  • Fairness in Machine Learning: The utility of multiaccuracy lies in its ability to improve subgroup accuracy without needing access to sensitive subgroup identifiers directly, thereby aligning with principles of fairness and accountability.
  • Adoption in Real-World Applications: This methodology could transform industry practices by providing a robust framework to audit and refine black-box algorithms, especially in domains where models are often opaque, such as healthcare and finance.
  • Further Research: While Multiaccuracy Boost provides substantive advances, future research may extend the approach to stronger fairness notions such as multicalibration, and explore its behavior across more diverse datasets and more complex models.

In summary, the paper articulates a significant step toward equitable machine-learning practices through a well-substantiated and versatile post-processing algorithm. By enhancing transparency and performance across subgroup predictions, it holds the potential to effect widespread, impactful changes in the deployment of predictive systems.