Differential Parity: Relative Fairness Between Two Sets of Decisions
Abstract: With AI systems widely applied to assist humans in decision-making processes such as talent hiring, school admission, and loan approval, there is an increasing need to ensure that the decisions made are fair. One major challenge in analyzing fairness in decisions is that the standards are highly subjective and contextual: there is no consensus on what absolute fairness means in every scenario, and different fairness standards often conflict with each other. To bypass this issue, this work aims to test relative fairness in decisions. That is, instead of defining what "absolutely" fair decisions are, we propose to test the relative fairness of one decision set against another with differential parity: the difference between two sets of decisions should be independent of a certain sensitive attribute. This proposed differential parity fairness notion has the following benefits: (1) it avoids the ambiguous and contradictory definition of "absolutely" fair decisions; (2) it reveals the relative preference and bias between two decision sets; (3) differential parity can serve as a new group fairness notion when a reference set of decisions (ground truths) is provided. One limitation of differential parity is that it requires the two sets of decisions under comparison to be made on the same data subjects. To overcome this limitation, we propose to utilize a machine learning model to bridge the gap between two decision sets made on different data and estimate the differential parity.
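The core idea — checking whether the per-subject difference between two decision sets depends on a sensitive attribute — can be sketched in a few lines. The sketch below is an illustration, not the paper's exact procedure: it computes decision differences for subjects shared by both sets and applies a Welch-style t-statistic (unequal variances) across the two groups of a binary sensitive attribute; the function name and the choice of test statistic are assumptions for this example.

```python
from statistics import mean, variance

def differential_parity_gap(decisions_a, decisions_b, sensitive):
    """Welch-style t-statistic on per-subject decision differences,
    grouped by a binary sensitive attribute (0 or 1).

    A statistic near zero suggests the difference between the two
    decision sets is independent of the attribute (relative fairness);
    a large magnitude suggests one set favors one group over the other.
    Both decision lists must cover the same subjects, in the same order.
    """
    diffs = [a - b for a, b in zip(decisions_a, decisions_b)]
    g0 = [d for d, s in zip(diffs, sensitive) if s == 0]
    g1 = [d for d, s in zip(diffs, sensitive) if s == 1]
    # Welch standard error: no equal-variance assumption between groups.
    se = (variance(g0) / len(g0) + variance(g1) / len(g1)) ** 0.5
    return (mean(g0) - mean(g1)) / se

# Example: decision set A upgrades one group-0 subject and downgrades
# one group-1 subject relative to set B, yielding a nonzero statistic.
t = differential_parity_gap([1, 1, 0, 0, 1, 0],
                            [1, 0, 0, 1, 1, 0],
                            [0, 0, 0, 1, 1, 1])
```

In practice one would compare the statistic against a significance threshold, and, as the abstract notes, when the two decision sets cover different subjects, a model trained on one set could be used to generate comparable decisions on the other's subjects before applying this test.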