Correcting Underrepresentation and Intersectional Bias for Classification (2306.11112v4)
Abstract: We consider the problem of learning from data corrupted by underrepresentation bias, where positive examples are filtered from the data at different, unknown rates for a fixed number of sensitive groups. We show that with a small amount of unbiased data, we can efficiently estimate the group-wise drop-out rates, even in settings where intersectional group membership makes learning each intersectional rate computationally infeasible. Using these estimates, we construct a reweighting scheme that allows us to approximate the loss of any hypothesis on the true distribution, even if we only observe the empirical error on a biased sample. From this, we present an algorithm encapsulating this learning and reweighting process along with a thorough empirical investigation. Finally, we define a bespoke notion of PAC learnability for the underrepresentation and intersectional bias setting and show that our algorithm permits efficient learning for model classes of finite VC dimension.
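The estimate-then-reweight idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm. It assumes disjoint groups (the intersectional case is not handled), binary labels, and that only positive examples are filtered, each kept independently with a per-group retention rate `beta[g]` (one minus the drop-out rate). Under those assumptions the odds of a positive label in the biased sample are `beta[g]` times the odds in the unbiased sample, so the ratio of the two odds estimates `beta[g]`. The function names `positive_odds`, `estimate_retention`, and `reweighted_error` are hypothetical.

```python
from collections import Counter


def positive_odds(groups, labels):
    """Per-group odds of a positive label: (#positives) / (#negatives)."""
    pos, neg = Counter(), Counter()
    for g, y in zip(groups, labels):
        if y == 1:
            pos[g] += 1
        else:
            neg[g] += 1
    # Groups with no negatives are skipped because their odds are undefined.
    return {g: pos[g] / neg[g] for g in neg}


def estimate_retention(biased, unbiased):
    """Estimate beta[g] as (biased odds) / (unbiased odds) for each group.

    `biased` and `unbiased` are (groups, labels) pairs. This assumes only
    positives were filtered, so negatives are unaffected by the bias.
    """
    odds_b = positive_odds(*biased)
    odds_u = positive_odds(*unbiased)
    return {
        g: odds_b[g] / odds_u[g]
        for g in odds_b
        if g in odds_u and odds_u[g] > 0
    }


def reweighted_error(groups, labels, preds, beta):
    """Weighted 0-1 error on the biased sample.

    Each positive in group g gets weight 1 / beta[g] to stand in for the
    positives that were dropped; negatives keep weight 1.
    """
    num = den = 0.0
    for g, y, p in zip(groups, labels, preds):
        w = 1.0 / beta[g] if y == 1 else 1.0
        num += w * (p != y)
        den += w
    return num / den
```

With a small unbiased sample and a larger biased one, `estimate_retention` would be called once and the resulting dictionary passed to `reweighted_error` for every hypothesis under evaluation. The estimate is only as good as the per-group counts in the unbiased sample, so groups with very few examples will give noisy rates.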