Efficiency-improved doubly robust estimation with non-confounding predictive covariates
Abstract: In observational studies, covariates with substantial missing data are often omitted despite their strong predictive power. These excluded covariates are generally believed not to affect both treatment and outcome simultaneously, which means that they are not genuine confounders and do not affect identification of the average treatment effect (ATE). In this paper, we introduce an alternative doubly robust (DR) estimator that fully leverages non-confounding predictive covariates to enhance efficiency, while also allowing missing values in such covariates. Beyond the double robustness property, our proposed estimator is designed to be more efficient than the standard DR estimator. Specifically, when the propensity score model is correctly specified, it achieves the smallest asymptotic variance among the class of DR estimators, and it brings additional efficiency gains by further integrating predictive covariates. Simulation studies show that the proposed estimator outperforms currently popular methods. An illustrative example assesses the effectiveness of right heart catheterization (RHC) for critically ill patients.
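For orientation, the standard DR estimator that the abstract takes as its baseline is commonly written in augmented inverse probability weighting (AIPW) form. The sketch below illustrates that textbook form only; it is not the estimator proposed in the paper, and it does not handle predictive covariates or missing values. The function name `aipw_ate` and its arguments are hypothetical, and the fitted propensity scores and outcome-model predictions are assumed to be supplied by the caller.

```python
from typing import Sequence


def aipw_ate(
    y: Sequence[float],
    t: Sequence[int],
    e_hat: Sequence[float],
    m1_hat: Sequence[float],
    m0_hat: Sequence[float],
) -> float:
    """Standard AIPW (doubly robust) estimate of the ATE.

    y      : observed outcomes
    t      : binary treatment indicators (1 = treated, 0 = control)
    e_hat  : fitted propensity scores P(T = 1 | X), strictly inside (0, 1)
    m1_hat : fitted outcome-model predictions E[Y | T = 1, X]
    m0_hat : fitted outcome-model predictions E[Y | T = 0, X]
    """
    n = len(y)
    if not (len(t) == len(e_hat) == len(m1_hat) == len(m0_hat) == n) or n == 0:
        raise ValueError("all inputs must be non-empty and of equal length")

    total = 0.0
    for yi, ti, ei, m1, m0 in zip(y, t, e_hat, m1_hat, m0_hat):
        if not 0.0 < ei < 1.0:
            raise ValueError("propensity scores must lie strictly in (0, 1)")
        # Outcome-model contrast plus inverse-probability-weighted residuals.
        total += (m1 - m0) + ti * (yi - m1) / ei - (1 - ti) * (yi - m0) / (1.0 - ei)
    return total / n


# Toy data (made up for illustration only).
y = [3.0, 1.0, 4.0, 2.0]
t = [1, 0, 1, 0]
e_hat = [0.5, 0.5, 0.5, 0.5]

# Outcome models that fit the observed outcomes exactly: residual terms vanish.
ate_outcome_model = aipw_ate(y, t, e_hat, [3.0, 2.0, 4.0, 3.0], [2.0, 1.0, 3.0, 2.0])

# Outcome models set to zero: the estimate reduces to pure inverse probability weighting.
ate_ipw_only = aipw_ate(y, t, e_hat, [0.0] * 4, [0.0] * 4)
```

The two calls show the two components of the estimator in isolation: with a perfectly fitting outcome model the weighted residuals drop out, and with a null outcome model the estimator collapses to inverse probability weighting. Double robustness refers to consistency when at least one of the two nuisance models is correctly specified.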