Multi-CATE: Multi-Accurate Conditional Average Treatment Effect Estimation Robust to Unknown Covariate Shifts (2405.18206v2)
Abstract: Estimating heterogeneous treatment effects is important for tailoring treatments to the individuals who are most likely to benefit. However, conditional average treatment effect (CATE) predictors are often trained on one population but deployed on different, possibly unknown populations. We use methodology for learning multi-accurate predictors to post-process CATE T-learners (differenced regressions) so that they become robust to unknown covariate shifts at the time of deployment. The method works in general for pseudo-outcome regression, such as the DR-learner. We show how this approach can combine (large) confounded observational and (smaller) randomized datasets by learning a confounded predictor from the observational dataset and auditing it for multi-accuracy on the randomized controlled trial. We show improvements in bias and mean squared error in simulations with increasingly large covariate shift, and in a semi-synthetic case study of a large observational study run in parallel with a smaller randomized controlled experiment. Overall, we establish a connection between methods developed for multi-distribution learning and appealing desiderata (e.g., external validity) in causal inference and machine learning.
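To make the pipeline in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It fits a T-learner on simulated confounded observational data, then post-processes it on a small simulated randomized trial so that its residuals are nearly uncorrelated with a small class of audit functions. Everything specific here is an assumption made for illustration: the data-generating process, the inverse-probability-weighted pseudo-outcome (a DR-learner pseudo-outcome could be substituted), the four auditors, the tolerance `ALPHA`, the step size `eta`, and all function names.

```python
import random

random.seed(0)
ALPHA = 0.05   # multi-accuracy tolerance (hypothetical choice)
E_RCT = 0.5    # known randomization probability in the simulated trial

def true_cate(x):
    return 1.0 + x  # ground truth, known only because the data are simulated

def simulate(n, randomized):
    """Draw (x, a, y); a hidden confounder u drives treatment unless randomized."""
    rows = []
    for _ in range(n):
        x, u = random.uniform(-1, 1), random.gauss(0, 1)
        if randomized:
            a = 1 if random.random() < E_RCT else 0
        else:
            a = 1 if u + random.gauss(0, 1) > 0 else 0
        rows.append((x, a, x + a * true_cate(x) + u + random.gauss(0, 0.5)))
    return rows

def fit_line(points):
    """One-dimensional ordinary least squares; returns the fitted function."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    b1 = sum((x - mx) * (y - my) for x, y in points) / sxx
    return lambda x: my + b1 * (x - mx)

def t_learner(rows):
    """T-learner: difference of separate outcome regressions per treatment arm."""
    mu1 = fit_line([(x, y) for x, a, y in rows if a == 1])
    mu0 = fit_line([(x, y) for x, a, y in rows if a == 0])
    return lambda x: mu1(x) - mu0(x)

# Hypothetical audit class: constant, linear, and two subgroup indicators.
AUDITORS = [lambda x: 1.0, lambda x: x,
            lambda x: float(x > 0), lambda x: float(x <= 0)]

def ipw_pseudo(rct):
    """Pseudo-outcome whose conditional mean is the CATE when treatment is randomized."""
    return [(a / E_RCT - (1 - a) / (1 - E_RCT)) * y for _, a, y in rct]

def audit(preds, pseudo, xs):
    """Average of h(x) * (pseudo-outcome - prediction) for each auditor h."""
    n = len(xs)
    return [sum(h(x) * (p - t) for x, p, t in zip(xs, pseudo, preds)) / n
            for h in AUDITORS]

def multiaccurate(tau0, rct, eta=0.5, max_iter=2000):
    """Greedy boosting: correct along the worst auditor until all pass ALPHA."""
    xs, pseudo = [x for x, _, _ in rct], ipw_pseudo(rct)
    preds = [tau0(x) for x in xs]
    coefs = [0.0] * len(AUDITORS)
    for _ in range(max_iter):
        scores = audit(preds, pseudo, xs)
        j = max(range(len(AUDITORS)), key=lambda k: abs(scores[k]))
        if abs(scores[j]) <= ALPHA:
            break
        coefs[j] += eta * scores[j]
        preds = [t + eta * scores[j] * AUDITORS[j](x) for t, x in zip(preds, xs)]
    return lambda x: tau0(x) + sum(c * h(x) for c, h in zip(coefs, AUDITORS))

obs = simulate(5000, randomized=False)   # large, confounded
rct = simulate(1000, randomized=True)    # small, randomized
tau_obs = t_learner(obs)
tau_ma = multiaccurate(tau_obs, rct)

xs_rct, pseudo_rct = [x for x, _, _ in rct], ipw_pseudo(rct)
before = max(abs(s) for s in audit([tau_obs(x) for x in xs_rct], pseudo_rct, xs_rct))
after = max(abs(s) for s in audit([tau_ma(x) for x in xs_rct], pseudo_rct, xs_rct))
print(f"worst audit violation: before={before:.3f}, after={after:.3f}")
```

In this sketch the confounded T-learner overstates the effect because the hidden variable `u` raises both treatment uptake and the outcome. The audit step removes the part of that bias that the chosen auditors can detect on the randomized data. A richer audit class would protect against a wider family of covariate shifts, at the cost of noisier corrections on a small trial.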