Dynamic Treatment Regimes with Replicated Observations Available for Error-prone Covariates: a Q-learning Approach (2404.04696v1)
Abstract: Dynamic treatment regimes (DTRs) have received increasing interest in recent years. A DTR is a sequence of treatment decision rules tailored to patient-level information. The main goal of a DTR study is to identify an optimal DTR, that is, a sequence of treatment decision rules that yields the best expected clinical outcome. Q-learning is one of the most popular regression-based methods for estimating the optimal DTR. However, it has rarely been studied in an error-prone setting, where patient information is contaminated with measurement error. In this paper, we study the effect of covariate measurement error on Q-learning and propose a method to correct for it. Simulation studies are conducted to assess the performance of the proposed method. We illustrate its use in an application to the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) data.
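To make the setting concrete, the following is a minimal sketch of how covariate measurement error can distort a Q-learning decision rule and how replicated measurements can be used to correct it. It is not the paper's method: it uses a single decision stage, simulated data, and standard regression calibration as a stand-in correction. The function names `fit_q` and `calibrate`, the linear Q-function, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np

def fit_q(x, a, y):
    """Least-squares fit of the assumed model Q(x, a) = b0 + b1*x + a*(psi0 + psi1*x)."""
    design = np.column_stack([np.ones_like(x), x, a, a * x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef  # (b0, b1, psi0, psi1)

def calibrate(w):
    """Regression calibration: estimate E[X | replicate mean] from replicates w (n x k)."""
    n, k = w.shape
    w_bar = w.mean(axis=1)
    mu = w_bar.mean()
    sigma_u2 = w.var(axis=1, ddof=1).mean()      # measurement-error variance (within subject)
    sigma_x2 = w_bar.var(ddof=1) - sigma_u2 / k  # variance of the true covariate
    lam = sigma_x2 / (sigma_x2 + sigma_u2 / k)   # reliability ratio of the replicate mean
    return mu + lam * (w_bar - mu)

# Simulated data (illustrative values only, not taken from the paper).
rng = np.random.default_rng(0)
n = 20000
x = rng.normal(size=n)                           # true covariate, unobserved in practice
a = rng.choice([-1.0, 1.0], size=n)              # randomized treatment
y = 1.0 + x + a * (-0.5 + 1.0 * x) + rng.normal(size=n)
w = x[:, None] + rng.normal(size=(n, 2))         # two error-prone replicates of x

naive = fit_q(w[:, 0], a, y)                     # treats one replicate as if it were x
corrected = fit_q(calibrate(w), a, y)            # uses the calibrated covariate
print("naive (psi0, psi1):    ", naive[2:])
print("corrected (psi0, psi1):", corrected[2:])
```

In this sketch the estimated rule is to assign a = 1 when psi0 + psi1*x > 0. The naive fit attenuates the interaction coefficient psi1 toward zero, which shifts the estimated treatment threshold, while the calibrated fit should recover values close to the simulated truth (-0.5, 1.0). A multi-stage analysis would repeat the regression backward from the final stage, which this example does not attempt.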