Extension of Bagging MARS with Group LASSO for Heterogeneous Treatment Effect Estimation (2402.14282v1)
Abstract: In recent years, large-scale clinical data such as patient surveys and medical records have played an increasing role in medical data science. These data, collectively referred to as "real-world data" (RWD), are expected to be widely used in large-scale observational studies of specific diseases, in personalized or precision medicine, and in identifying responders to drugs or treatments. Applying RWD to estimate heterogeneous treatment effects (HTE) has already become a trending topic. HTE estimation has the potential to considerably impact the development of precision medicine by helping doctors make better-informed, more precise treatment decisions and provide more personalized medical care. The statistical models used to estimate HTE are called treatment effect models. Powers et al. proposed several treatment effect models for observational studies and pointed out that bagging causal MARS (BCM) performs outstandingly compared with the other models. Although BCM performs well, it still has room for improvement. In this paper, we propose a new treatment effect model, the shrinkage causal bagging MARS method, which improves their shared-basis conditional mean regression framework in two steps: first, we estimate the basis functions using a transformed outcome; then, we apply the group LASSO method to optimize the model and estimate its parameters. In addition, we pursue better model interpretability to improve ethical acceptance. We designed simulations to verify the performance of the proposed method, which is superior in mean squared error and bias in most simulation settings. We also applied it to the real ACTG 175 data set to verify its usability, and our results are supported by previous studies.
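The abstract does not spell out the transformed outcome, so the sketch below assumes the commonly used inverse-propensity form Y* = Y (W - e(X)) / (e(X) (1 - e(X))), where W is the binary treatment indicator and e(X) the propensity score. Under unconfoundedness and a correct propensity score, the conditional mean of Y* equals the conditional treatment effect, which is what makes it a usable regression target. The function name `transformed_outcome`, the helper `simulate`, and the simulated data are hypothetical and are not taken from the paper.

```python
import numpy as np


def transformed_outcome(y, w, e):
    """Return Y* = Y * (W - e) / (e * (1 - e)).

    Assumed form of the transformed outcome (not confirmed by the abstract).
    y: outcomes, w: 0/1 treatment indicators, e: propensity scores in (0, 1).
    """
    y, w, e = (np.asarray(a, dtype=float) for a in (y, w, e))
    if np.any((e <= 0) | (e >= 1)):
        raise ValueError("propensity scores must lie strictly between 0 and 1")
    return y * (w - e) / (e * (1.0 - e))


def simulate(n=200_000, seed=0):
    """Toy observational data with a known heterogeneous effect (illustrative only)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=n)
    e = 1.0 / (1.0 + np.exp(-x))       # true propensity score, depends on x
    w = rng.binomial(1, e)             # treatment assignment
    tau = 1.0 + 2.0 * (x > 0)          # true effect: 1 if x <= 0, 3 if x > 0
    y = x + tau * w + rng.normal(size=n)
    return x, w, y, e, tau


x, w, y, e, tau = simulate()
y_star = transformed_outcome(y, w, e)

# Averages of Y* approximate the average effect overall and within subgroups.
ate_hat = y_star.mean()
effect_pos = y_star[x > 0].mean()
effect_neg = y_star[x <= 0].mean()
print(f"overall: {ate_hat:.2f}, x > 0: {effect_pos:.2f}, x <= 0: {effect_neg:.2f}")
```

In this toy setup the subgroup averages of Y* should land near the true effects of 3 and 1. The individual values of Y* are very noisy, so in practice they are used as a regression target rather than read directly, and the true propensity score would have to be replaced by an estimate.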
- Some methods for heterogeneous treatment effect estimation in high dimensions. Statistics in Medicine, 37, 2018.
- Ming Yuan and Yi Lin. Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society. Series B: Statistical Methodology, 68, 2006.
- Beth Schurman. The framework for FDA’s real-world evidence program. Applied Clinical Trials, 28(4), 2019.
- Small but mighty: The use of real-world evidence to inform precision medicine. Clinical Pharmacology and Therapeutics, 106, 2019.
- Evidence-based medicine, heterogeneity of treatment effects, and the trouble with averages. The Milbank Quarterly, 82(4):661–687, 2004.
- Translating evidence into practice: eligibility criteria fail to eliminate clinically significant differences between real-world and study populations. npj Digital Medicine, 3, 2020.
- Heterogeneity of treatment effects: implications for guidelines, payment, and quality assessment. The American Journal of Medicine, 120(4):S3–S9, 2007.
- Real-world evidence — what is it and what can it tell us? New England Journal of Medicine, 375, 2016.
- Personalized evidence based medicine: predictive approaches to heterogeneous treatment effects. BMJ, 363, 2018.
- From real-world patient data to individualized treatment effects using machine learning: current and future methods to address underlying challenges. Clinical Pharmacology & Therapeutics, 109(1):87–100, 2021.
- A framework for the analysis of heterogeneity of treatment effect in patient-centered outcomes research. Journal of Clinical Epidemiology, 66(8):818–825, 2013.
- Jerzy Neyman. Sur les applications de la théorie des probabilités aux expériences agricoles: Essai des principes. Excerpts reprinted (1990) in English. Statistical Science, 5(4):463–472, 1923.
- Donald B Rubin. Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66(5):688, 1974.
- The central role of the propensity score in observational studies for causal effects. Biometrika, 70, 1983.
- Judea Pearl. Causality. Cambridge University Press, 2009.
- Metalearners for estimating heterogeneous treatment effects using machine learning. Proceedings of the National Academy of Sciences of the United States of America, 116, 2019.
- Recursive partitioning for heterogeneous causal effects. Proceedings of the National Academy of Sciences of the United States of America, 113, 2016.
- A simple method for estimating interactions between a treatment and a large number of covariates. Journal of the American Statistical Association, 109(508):1517–1532, 2014.
- Machine learning methods for estimating heterogeneous causal effects. stat, 1050(5):1–26, 2015.
- Estimation and inference of heterogeneous treatment effects using random forests. Journal of the American Statistical Association, 113, 2018.
- Jennifer L Hill. Bayesian nonparametric modeling for causal inference. Journal of Computational and Graphical Statistics, 20(1):217–240, 2011.
- Comparing methods for estimation of heterogeneous treatment effects using observational data from health care databases. Statistics in Medicine, 37(23):3309–3324, 2018.
- Anning Hu. Heterogeneous treatment effects analysis for social scientists: A review. Social Science Research, 109:102810, 2023.
- Causal inference and uplift modelling: A review of the literature. In International Conference on Predictive Applications and APIs, pages 1–13. PMLR, 2017.
- Paul R Rosenbaum. Model-based direct adjustment. Journal of the American Statistical Association, 82(398):387–394, 1987.
- An introduction to inverse probability of treatment weighting in observational research, 2022.
- Jerome H. Friedman. Multivariate adaptive regression splines. The Annals of Statistics, 19, 1991.
- Classification and regression trees. CRC Press, 1984.
- An ensemble learning algorithm based on lasso selection. In 2010 IEEE International Conference on Intelligent Computing and Intelligent Systems, volume 1, pages 617–620. IEEE, 2010.
- Ariel Linden. Improving causal inference with a doubly robust estimator that combines propensity score stratification and weighting. Journal of Evaluation in Clinical Practice, 23, 2017.
- The pros and cons of propensity scores, 2012.
- Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608, 2017.
- Interpretable machine learning–a brief history, state-of-the-art and challenges. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pages 417–431. Springer, 2020.
- Ensembles for feature selection: A review and future trends. Information Fusion, 52:1–12, 2019.
- Robi Polikar. Ensemble learning. Ensemble machine learning: Methods and applications, pages 1–34, 2012.
- Propensity score and proximity matching using random forest. Contemporary Clinical Trials, 47, 2016.
- A trial comparing nucleoside monotherapy with combination therapy in HIV-infected adults with CD4 cell counts from 200 to 500 per cubic millimeter. New England Journal of Medicine, 335, 1996.
- Leo Breiman. Random forests. Machine Learning, 45, 2001.
- Jerome H. Friedman. Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29, 2001.
- Nonparametric machine learning for precision medicine with longitudinal clinical trials and Bayesian additive regression trees with mixed models. Statistics in Medicine, 40, 2021.
- Covariate adjustment for two-sample treatment comparisons in randomized clinical trials: A principled yet flexible approach. Statistics in Medicine, 27, 2008.