Dissecting Causal Biases
Abstract: Accurately measuring discrimination in machine learning-based automated decision systems is required to address the vital issue of fairness between subpopulations and/or individuals. Any bias in measuring discrimination can lead to either amplification or underestimation of its true value. This paper focuses on a class of bias originating in the way training data is generated and/or collected. We call this class causal biases and use tools from the field of causality to formally define and analyze them. Four sources of bias are considered: confounding, selection, measurement, and interaction. The main contribution of this paper is a closed-form expression, in terms of the model parameters, for each source of bias. This makes it possible to analyze the behavior of each source of bias, in particular the cases in which it is absent and those in which it is maximized. We hope that these characterizations help the community better understand the sources of bias in machine learning applications.
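To make the idea of a closed-form bias expression concrete, the following is a minimal sketch for confounding only. It is not taken from the paper: it assumes a hypothetical linear model with a confounder Z, a sensitive attribute A = a·Z + noise, and an outcome Y = b·A + c·Z + noise, and it uses the standard omitted-variable result for the regression slope of Y on A. The function names and the parameters `a`, `b`, `c` are illustrative assumptions.

```python
import numpy as np


def confounding_bias_closed_form(a, c, var_noise_a=1.0, var_z=1.0):
    """Closed-form confounding bias in the assumed linear model.

    Regressing Y on A alone yields the slope b + c * Cov(A, Z) / Var(A),
    so the bias relative to the causal coefficient b is the second term.
    """
    cov_az = a * var_z
    var_a = a ** 2 * var_z + var_noise_a
    return c * cov_az / var_a


def simulated_confounding_bias(a, b, c, n=200_000, seed=0):
    """Estimate the same bias from data simulated under the assumed model."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                   # confounder
    attr = a * z + rng.normal(size=n)        # sensitive attribute A
    y = b * attr + c * z + rng.normal(size=n)  # outcome Y
    # Naive slope of Y on A, ignoring the confounder Z.
    naive_slope = np.cov(attr, y)[0, 1] / np.var(attr, ddof=1)
    return naive_slope - b


if __name__ == "__main__":
    a, b, c = 0.8, 0.5, 0.6
    print("closed form:", confounding_bias_closed_form(a, c))
    print("simulated:  ", simulated_confounding_bias(a, b, c))
```

Under these assumptions the bias is zero whenever `a = 0` or `c = 0`, that is, whenever Z no longer influences both A and Y. This is the kind of "when is the bias absent" question a closed-form expression makes easy to answer.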