Modeling Latent Selection with Structural Causal Models (2401.06925v2)
Abstract: Selection bias is ubiquitous in real-world data and can lead to misleading results if not dealt with properly. We introduce a conditioning operation on Structural Causal Models (SCMs) to model latent selection from a causal perspective. We show that the conditioning operation transforms an SCM with an explicit latent selection mechanism into an SCM without such a mechanism, which partially encodes the causal semantics of the selected subpopulation according to the original SCM. Furthermore, we show that this conditioning operation preserves the simplicity, acyclicity, and linearity of SCMs, and commutes with marginalization. Thanks to these properties, the conditioning operation, combined with marginalization and intervention, offers a valuable tool for causal reasoning in causal models where latent details have been abstracted away. We demonstrate by example how classical results of causal inference can be generalized to include selection bias and how the conditioning operation helps with modeling real-world problems.
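To make the notion of selection bias concrete, the following is a minimal sketch of a toy model. It is not the paper's conditioning operation. It assumes two independent fair binary variables X and Y and a hypothetical selection mechanism S = X or Y (both the model and the function name `selected_conditional` are illustrative assumptions, not taken from the paper). Exact enumeration shows that X and Y, independent in the full population, become dependent in the selected subpopulation S = 1.

```python
from fractions import Fraction
from itertools import product


def selected_conditional(x_value):
    """Return P(Y=1 | X=x_value, S=1) in the toy model described above."""
    num = Fraction(0)
    den = Fraction(0)
    for x, y in product((0, 1), repeat=2):
        p = Fraction(1, 4)  # X and Y are independent fair coins (assumption)
        s = x or y          # hypothetical selection mechanism: selected if X or Y is 1
        if s == 1 and x == x_value:
            den += p
            if y == 1:
                num += p
    return num / den


# In the full population P(Y=1 | X=x) = 1/2 for both values of x.
# Among selected units the two conditionals differ, so X and Y are dependent.
print(selected_conditional(1), selected_conditional(0))
```

In this toy model the selected subpopulation behaves as if X and Y were associated, even though neither causes the other. The abstract's conditioning operation is aimed at representing such a subpopulation by an SCM without the explicit selection mechanism.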