
Two-Step Targeted Minimum-Loss Based Estimation for Non-Negative Two-Part Outcomes (2401.04263v2)

Published 8 Jan 2024 in stat.ME

Abstract: Non-negative two-part outcomes are outcomes whose distribution has a point mass at zero and is otherwise positive. Examples such as healthcare expenditure and hospital length of stay are common in healthcare utilization research. Despite the practical relevance of these outcomes, very few methods leverage knowledge of their semicontinuity to improve the estimation of causal effects. In this paper, we develop a nonparametric two-step targeted minimum-loss based estimator (denoted hTMLE) for non-negative two-part outcomes. We present methods for a general class of interventions referred to as modified treatment policies, which can accommodate continuous, categorical, and binary exposures. The two-step TMLE uses a targeted estimate of the intensity component of the outcome to produce a targeted estimate of the binary component, which may improve finite-sample efficiency. We demonstrate the efficiency gains achieved by the two-step TMLE with simulated examples and then apply it to a cohort of Medicaid beneficiaries to estimate the effect of chronic pain and physical disability on days' supply of opioids.
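
To make the terms "binary component" and "intensity component" concrete, the following minimal sketch simulates a non-negative two-part outcome and checks the standard decomposition of its mean, E[Y] = P(Y > 0) · E[Y | Y > 0]. It is not the paper's hTMLE and involves no targeting step or covariates; the function names, the zero probability of 0.6, and the log-normal intensity distribution are all hypothetical choices made only for illustration.

```python
import random


def simulate_two_part(n, p_zero=0.6, seed=0):
    """Draw n hypothetical two-part outcomes.

    Each draw is exactly zero with probability p_zero (an assumed value),
    otherwise a strictly positive log-normal value (an assumed distribution).
    """
    rng = random.Random(seed)
    return [
        0.0 if rng.random() < p_zero else rng.lognormvariate(0.0, 1.0)
        for _ in range(n)
    ]


def two_part_mean(y):
    """Return (P(Y > 0), E[Y | Y > 0], their product) from a sample."""
    positives = [v for v in y if v > 0]
    p_pos = len(positives) / len(y)  # binary component
    mean_pos = sum(positives) / len(positives) if positives else 0.0  # intensity component
    return p_pos, mean_pos, p_pos * mean_pos


y = simulate_two_part(10_000)
p_pos, mean_pos, product = two_part_mean(y)
overall = sum(y) / len(y)

print(f"P(Y > 0)        = {p_pos:.4f}")
print(f"E[Y | Y > 0]    = {mean_pos:.4f}")
print(f"product         = {product:.4f}")
print(f"sample mean of Y = {overall:.4f}")
```

In the sample, the product of the two components reproduces the overall mean up to floating-point rounding, which is why an estimator can model the two parts separately and then combine them.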
