Explaining Knock-on Effects of Bias Mitigation (2312.00765v1)
Abstract: In machine learning systems, bias mitigation approaches aim to make outcomes fairer across privileged and unprivileged groups. Bias mitigation methods work in different ways and have known "waterfall" effects, e.g., mitigating bias at one stage can cause bias to manifest elsewhere. In this paper, we aim to characterise the cohorts that are impacted when mitigation interventions are applied. To do so, we treat intervention effects as a classification task and learn an explainable meta-classifier to identify cohorts whose outcomes are altered. We examine a range of bias mitigation strategies that operate at various stages of the model life cycle. We empirically demonstrate that our meta-classifier is able to uncover impacted cohorts. Further, we show that all tested mitigation strategies negatively impact a non-trivial fraction of cases, i.e., people who receive unfavourable outcomes solely on account of mitigation efforts, despite improvements in aggregate fairness metrics. We use these results as a basis to argue for more careful audits of static mitigation interventions that go beyond aggregate metrics.
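To make the meta-classification idea concrete, here is a minimal sketch using scikit-learn. It is not the paper's implementation: the toy data, the stand-in "mitigation" (a group-dependent decision threshold), and all variable names are illustrative assumptions. The core pattern is the paper's stated approach, though: instances whose predicted outcome flips under the intervention become labels for an interpretable meta-classifier, whose learned rules then describe the impacted cohorts.

```python
# Hedged sketch: find cohorts whose outcomes change after a mitigation
# intervention, via an interpretable meta-classifier (decision tree).
# The "mitigation" below is a crude illustrative stand-in, not the
# paper's method or any specific AIF360 algorithm.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy data; feature 0 plays the role of a protected attribute.
X, y = make_classification(n_samples=2000, n_features=6, random_state=0)
protected = (X[:, 0] > 0).astype(int)

# 1. Baseline model, before any mitigation.
base = LogisticRegression().fit(X, y)
pred_before = base.predict(X)

# 2. Stand-in mitigation: a lower decision threshold for the
#    unprivileged group (a simple post-processing intervention).
scores = base.predict_proba(X)[:, 1]
threshold = np.where(protected == 0, 0.4, 0.5)
pred_after = (scores >= threshold).astype(int)

# 3. Label each instance by whether the intervention altered its
#    outcome, then fit an explainable meta-classifier on the features.
flipped = (pred_before != pred_after).astype(int)
meta = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, flipped)

# 4. The tree's rules characterise the impacted cohorts.
print(export_text(meta, feature_names=[f"x{i}" for i in range(X.shape[1])]))
```

A shallow decision tree is used here because its paths read directly as cohort definitions (conjunctions of feature conditions); any explainable classifier would serve the same role.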