Two-Step Targeted Minimum-Loss Based Estimation for Non-Negative Two-Part Outcomes (2401.04263v2)
Abstract: Non-negative two-part outcomes are outcomes whose distribution has a point mass at zero and is otherwise positive. Examples, such as healthcare expenditure and hospital length of stay, are common in healthcare utilization research. Despite the practical relevance of such outcomes, very few methods leverage knowledge of their semicontinuity to improve the estimation of causal effects. In this paper, we develop a nonparametric two-step targeted minimum-loss based estimator (denoted as hTMLE) for non-negative two-part outcomes. We present methods for a general class of interventions referred to as modified treatment policies, which can accommodate continuous, categorical, and binary exposures. The two-step TMLE uses a targeted estimate of the intensity component of the outcome to produce a targeted estimate of the binary component of the outcome, which may improve finite-sample efficiency. We demonstrate the efficiency gains achieved by the two-step TMLE with simulated examples and then apply it to a cohort of Medicaid beneficiaries to estimate the effect of chronic pain and physical disability on days' supply of opioids.
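To make the two components concrete, the following is a minimal sketch of the standard two-part decomposition of a mean, E[Y] = P(Y > 0) · E[Y | Y > 0]. It is not the hTMLE estimator described in the abstract: there is no targeting step, no treatment, and no modified treatment policy. The simulated data-generating process and all variable names (`w`, `p_positive`, `intensity`) are assumptions chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical covariate and semicontinuous outcome (illustration only).
w = rng.normal(size=n)

# Binary component: indicator that the outcome is positive.
p_positive = 1.0 / (1.0 + np.exp(-(0.3 + 0.8 * w)))
is_positive = rng.random(n) < p_positive

# Intensity component: strictly positive values, used only when Y > 0.
intensity = rng.gamma(shape=2.0, scale=np.exp(0.5 * w))

# Observed outcome: point mass at zero, positive otherwise.
y = np.where(is_positive, intensity, 0.0)

# Plug-in estimates of the two parts.
prob_hat = np.mean(y > 0)            # estimate of P(Y > 0)
intensity_hat = y[y > 0].mean()      # estimate of E[Y | Y > 0]

# Their product recovers the overall sample mean of Y.
two_part_mean = prob_hat * intensity_hat
print(two_part_mean, y.mean())
```

In this unadjusted setting the product equals the sample mean exactly, up to floating-point error. The decomposition becomes useful once each part is modeled as a function of covariates and exposure, which is the setting the abstract addresses.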