Triple/Debiased Lasso for Statistical Inference of Conditional Average Treatment Effects (2403.03240v1)
Abstract: This study investigates the estimation of and statistical inference for Conditional Average Treatment Effects (CATEs), which have garnered attention as a metric representing individualized causal effects. In our data-generating process, we assume linear models for the outcomes associated with binary treatments and define the CATE as the difference between the expected outcomes of these linear models. The linear models may be high-dimensional, and our interest lies in consistent estimation and statistical inference for the CATE. A typical approach in high-dimensional linear regression is to assume sparsity. In this study, however, we do not assume sparsity of each linear model directly; instead, we assume sparsity only in the difference between the linear models. We first approximate this difference with a doubly robust estimator and then regress the approximated difference on the covariates with Lasso regularization. Although this regression estimator is consistent for the CATE, we further reduce its bias using techniques from double/debiased machine learning (DML) and the debiased Lasso, yielding $\sqrt{n}$-consistency and confidence intervals. Because it applies both the DML and debiased Lasso techniques, we refer to the debiased estimator as the triple/debiased Lasso (TDL). We confirm the soundness of the proposed method through simulation studies.
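To make the three-step procedure described in the abstract concrete, below is a minimal numerical sketch, not the authors' implementation. It assumes synthetic data, scikit-learn's Lasso and logistic regression as the nuisance estimators, and a ridge-regularized inverse covariance standing in for the nodewise-Lasso precision matrix that a full debiased-Lasso correction would use; all names, penalties, and tuning choices are illustrative assumptions.

```python
# Sketch of the triple/debiased Lasso (TDL) idea: (1) DML-style cross-fitted
# doubly robust pseudo-outcome, (2) Lasso regression of the pseudo-outcome on
# covariates, (3) a debiased-Lasso correction.  All modeling choices here are
# simplifications made for illustration only.
import numpy as np
from sklearn.linear_model import Lasso, LogisticRegression
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
n, p = 1000, 50
X = rng.normal(size=(n, p))

# Dense outcome coefficients whose *difference* is sparse, matching the
# assumption that only theta1 - theta0 (the CATE coefficients) is sparse.
theta0 = rng.normal(size=p)
theta1 = theta0.copy()
theta1[:3] += np.array([2.0, -1.5, 1.0])

propensity = 1.0 / (1.0 + np.exp(-X[:, 0]))        # confounded treatment
T = rng.binomial(1, propensity)
Y = T * (X @ theta1) + (1 - T) * (X @ theta0) + rng.normal(size=n)

# Step 1 (cross-fitting): estimate nuisances on one fold, form the doubly
# robust (AIPW) pseudo-outcome on the held-out fold.
psi = np.zeros(n)
for train, test in KFold(n_splits=2, shuffle=True, random_state=0).split(X):
    mu1 = Lasso(alpha=0.1).fit(X[train][T[train] == 1], Y[train][T[train] == 1])
    mu0 = Lasso(alpha=0.1).fit(X[train][T[train] == 0], Y[train][T[train] == 0])
    ps = LogisticRegression(max_iter=1000).fit(X[train], T[train])
    e_hat = np.clip(ps.predict_proba(X[test])[:, 1], 0.05, 0.95)
    m1, m0 = mu1.predict(X[test]), mu0.predict(X[test])
    psi[test] = (m1 - m0
                 + T[test] * (Y[test] - m1) / e_hat
                 - (1 - T[test]) * (Y[test] - m0) / (1 - e_hat))

# Step 2: Lasso regression of the pseudo-outcome on the covariates,
# exploiting sparsity of beta = theta1 - theta0.
beta_lasso = Lasso(alpha=0.1, fit_intercept=False).fit(X, psi).coef_

# Step 3: debiased-Lasso correction.  A nodewise Lasso would normally supply
# the approximate inverse covariance; a ridge-regularized inverse is
# substituted here purely to keep the sketch short.
Sigma_hat = X.T @ X / n
Theta_hat = np.linalg.inv(Sigma_hat + 0.01 * np.eye(p))
beta_tdl = beta_lasso + Theta_hat @ X.T @ (psi - X @ beta_lasso) / n

print("true CATE coefficients:", np.round(theta1[:5] - theta0[:5], 2))
print("TDL estimates         :", np.round(beta_tdl[:5], 2))
```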