Q-learning in Dynamic Treatment Regimes with Misclassified Binary Outcome (2404.04697v1)
Abstract: Precision medicine relies on dynamic treatment regimes (DTRs), which are sequences of treatment decision rules that take patient-level information as input. The primary goal of a DTR study is to identify an optimal DTR, that is, a sequence of treatment decision rules that leads to the best expected clinical outcome. Statistical methods have been developed in recent years to estimate an optimal DTR, including Q-learning, a regression-based method. Although Q-learning has been studied extensively, little attention has been given to noisy data, such as misclassified outcomes. In this paper, we investigate the effect of outcome misclassification on Q-learning and propose a correction method to accommodate it. Simulation studies demonstrate the satisfactory performance of the proposed method. We illustrate the method with two examples: the National Health and Nutrition Examination Survey Data I Epidemiologic Follow-up Study and a smoking cessation program.
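To make the misclassification effect concrete, here is a minimal sketch in Python. It is not the correction method proposed in the paper, which the abstract does not spell out. It only illustrates the standard identity linking the probability of the true binary outcome Y to the probability of its error-prone version Y*, assuming non-differential misclassification with known sensitivity and specificity. The function names and all numeric values are hypothetical.

```python
def observed_prob(p_true, sensitivity, specificity):
    """P(Y* = 1) implied by the true P(Y = 1) under misclassification."""
    return sensitivity * p_true + (1.0 - specificity) * (1.0 - p_true)


def corrected_prob(p_obs, sensitivity, specificity):
    """Invert observed_prob; requires sensitivity + specificity > 1."""
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("need sensitivity + specificity > 1")
    p = (p_obs - (1.0 - specificity)) / denom
    return min(1.0, max(0.0, p))  # clip to a valid probability


# Hypothetical success probabilities under two treatment options.
p_treat, p_control = 0.70, 0.50
sens, spec = 0.85, 0.90  # assumed known misclassification rates

obs_treat = observed_prob(p_treat, sens, spec)
obs_control = observed_prob(p_control, sens, spec)

# Treatment contrast computed from the error-prone outcome.
naive_effect = obs_treat - obs_control

# Treatment contrast after mapping back to the true-outcome scale.
corrected_effect = (corrected_prob(obs_treat, sens, spec)
                    - corrected_prob(obs_control, sens, spec))

print(f"naive effect:     {naive_effect:.3f}")
print(f"corrected effect: {corrected_effect:.3f}")
```

In this toy setting the naive contrast is shrunk by the factor sensitivity + specificity - 1 relative to the true contrast. A regression-based procedure such as Q-learning fits models to the observed outcome, so distortions of this kind can carry over into the estimated decision rules.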