AI in Pharma for Personalized Sequential Decision-Making: Methods, Applications and Opportunities (2311.18725v1)

Published 30 Nov 2023 in stat.ME, cs.LG, stat.AP, and stat.ML

Abstract: In the pharmaceutical industry, the use of AI has seen consistent growth over the past decade. This rise is attributed to major advancements in statistical machine learning methodologies, computational capabilities, and the increased availability of large datasets. AI techniques are applied throughout different stages of drug development, ranging from drug discovery to post-marketing benefit-risk assessment. Kolluri et al. provided a review of several case studies that span these stages, featuring key applications such as protein structure prediction, success probability estimation, subgroup identification, and AI-assisted clinical trial monitoring. From a regulatory standpoint, there was a notable uptick in submissions incorporating AI components in 2021. The most prevalent therapeutic areas leveraging AI were oncology (27%), psychiatry (15%), gastroenterology (12%), and neurology (11%). The paradigm of personalized or precision medicine has gained significant traction in recent research, partly due to advancements in AI techniques (Hamburg and Collins, 2010). This shift has had a transformative impact on the pharmaceutical industry. Departing from the traditional "one-size-fits-all" model, personalized medicine incorporates various individual factors, such as environmental conditions, lifestyle choices, and health histories, to formulate customized treatment plans. By utilizing sophisticated machine learning algorithms, clinicians and researchers are better equipped to make informed decisions in areas such as disease prevention, diagnosis, and treatment selection, thereby optimizing health outcomes for each individual.
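
As a concrete illustration of the sequential decision-making methods surveyed here, the sketch below implements two-stage Q-learning for a dynamic treatment regime, in the spirit of the Q-learning references listed below (e.g., Murphy, 2005). This is a minimal sketch, not the paper's own implementation: the simulated data-generating process, the variable names, and the linear Q-functions are all illustrative assumptions.

# Minimal two-stage Q-learning sketch for a dynamic treatment regime.
# The data-generating process below is an illustrative assumption.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 5000

# Stage 1: baseline covariate X1, randomized treatment A1 in {0, 1}.
X1 = rng.normal(size=n)
A1 = rng.integers(0, 2, size=n)

# Stage 2: intermediate covariate X2 depends on the stage-1 history.
X2 = 0.5 * X1 + 0.3 * A1 + rng.normal(size=n)
A2 = rng.integers(0, 2, size=n)

# Final outcome: treatment helps only when the current covariate is high.
Y = X1 + X2 + A1 * X1 + A2 * X2 + rng.normal(size=n)

def features(x, a):
    # Linear Q-function: main effects plus a covariate-by-treatment interaction.
    return np.column_stack([x, a, x * a])

# Backward induction, stage 2: regress Y on the stage-2 covariate and treatment.
q2 = LinearRegression().fit(features(X2, A2), Y)

# Pseudo-outcome: predicted outcome under the best stage-2 action.
v2 = np.maximum(q2.predict(features(X2, np.zeros(n))),
                q2.predict(features(X2, np.ones(n))))

# Stage 1: regress the pseudo-outcome on the stage-1 covariate and treatment.
q1 = LinearRegression().fit(features(X1, A1), v2)

def decision(q, x):
    # Estimated optimal rule: treat (a = 1) when Q(x, 1) > Q(x, 0).
    return (q.predict(features(x, np.ones_like(x)))
            > q.predict(features(x, np.zeros_like(x)))).astype(int)

x_grid = np.array([-1.0, 0.0, 1.0])
print("stage-1 rule at X1 =", x_grid, "->", decision(q1, x_grid))
print("stage-2 rule at X2 =", x_grid, "->", decision(q2, x_grid))

The key design choice is backward induction: the stage-2 fit defines a pseudo-outcome, the predicted outcome under the best stage-2 action, which then serves as the regression target for the stage-1 Q-function, so each stage's rule accounts for optimal behavior downstream.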

References (75)
  1. “Machine learning and artificial intelligence in pharmaceutical research and development: a review” In The AAPS Journal 24 Springer, 2022, pp. 1–10
  2. “Landscape analysis of the application of artificial intelligence and machine learning in regulatory submissions for drug development from 2016 to 2021” In Clinical pharmacology and therapeutics 113.4, 2023, pp. 771–774
  3. Margaret A Hamburg and Francis S Collins “The path to personalized medicine” In New England Journal of Medicine 363.4 Mass Medical Soc, 2010, pp. 301–304
  4. Bibhas Chakraborty and Erica E Moodie “Statistical methods for dynamic treatment regimes” Springer, 2013
  5. Michael R Kosorok and Eric B Laber “Precision medicine” In Annual review of statistics and its application 6 Annual Reviews, 2019, pp. 263–286
  6. “Dynamic treatment regimes: Technical challenges and applications” In Electronic journal of statistics 8.1 NIH Public Access, 2014, pp. 1225
  7. “Q-learning: Theory and applications” In Annual Review of Statistics and Its Application 7 Annual Reviews, 2020, pp. 279–301
  8. SA Murphy “A Generalization Error for Q-Learning.” In Journal of Machine Learning Research: JMLR 6, 2005, pp. 1073–1097
  9. James M Robins “Optimal structural nested models for optimal sequential decisions” In Proceedings of the Second Seattle Symposium in Biostatistics, 2004, pp. 189–326 Springer
  10. Yufan Zhao, Michael R Kosorok and Donglin Zeng “Reinforcement learning design for cancer clinical trials” In Statistics in medicine 28.26 Wiley Online Library, 2009, pp. 3294–3315
  11. “Q-learning: A data analysis method for constructing adaptive interventions.” In Psychological methods 17.4 American Psychological Association, 2012, pp. 478
  12. Michael R Kosorok and Erica EM Moodie “Adaptive treatment strategies in practice: planning trials and analyzing data for personalized medicine” SIAM, 2015
  13. Susan A Murphy “Optimal dynamic treatment regimes” In Journal of the Royal Statistical Society Series B: Statistical Methodology 65.2 Oxford University Press, 2003, pp. 331–355
  14. “Q- and A-learning methods for estimating optimal dynamic treatment regimes” In Statistical science: a review journal of the Institute of Mathematical Statistics 29.4 NIH Public Access, 2014, pp. 640
  15. “Robust Q-learning” In Journal of the American Statistical Association 116.533 Taylor & Francis, 2021, pp. 368–381
  16. “Penalized q-learning for dynamic treatment regimens” In Statistica Sinica 25.3 NIH Public Access, 2015, pp. 901
  17. Bibhas Chakraborty, Eric B Laber and Ying-Qi Zhao “Inference about the expected performance of a data-driven dynamic treatment regime” In Clinical Trials 11.4 SAGE Publications, 2014, pp. 408–417
  18. Thomas A Murray, Ying Yuan and Peter F Thall “A Bayesian machine learning approach for optimizing dynamic treatment regimes” In Journal of the American Statistical Association 113.523 Taylor & Francis, 2018, pp. 1255–1267
  19. “Multi-stage optimal dynamic treatment regimes for survival outcomes with dependent censoring” In Biometrika 110.2 Oxford University Press, 2023, pp. 395–410
  20. Wensheng Zhu, Donglin Zeng and Rui Song “Proper inference for value function in high-dimensional Q-learning for dynamic treatment regimes” In Journal of the American Statistical Association 114.527 Taylor & Francis, 2019, pp. 1404–1417
  21. “Reinforcement learning strategies for clinical trials in nonsmall cell lung cancer” In Biometrics 67.4 Wiley Online Library, 2011, pp. 1422–1433
  22. “Reinforcement learning with action-derived rewards for chemotherapy and clinical trial dosing regimen selection” In Machine Learning for Healthcare Conference, 2018, pp. 161–226 PMLR
  23. “Machine learning for clinical trials in the era of COVID-19” In Statistics in biopharmaceutical research 12.4 Taylor & Francis, 2020, pp. 506–517
  24. “Reinforcement learning for intelligent healthcare applications: A survey” In Artificial Intelligence in Medicine 109 Elsevier, 2020, pp. 101964
  25. William H Press “Bandit solutions provide unified ethical models for randomized clinical trials and comparative effectiveness research” In Proceedings of the National Academy of Sciences 106.52 National Acad Sciences, 2009, pp. 22387–22392
  26. Sofía S Villar, Jack Bowden and James Wason “Multi-armed bandit models for the optimal design of clinical trials: benefits and challenges” In Statistical science: a review journal of the Institute of Mathematical Statistics 30.2 Europe PMC Funders, 2015, pp. 199
  27. Jean-Yves Audibert, Rémi Munos and Csaba Szepesvári “Exploration–exploitation tradeoff using variance estimates in multi-armed bandits” In Theoretical Computer Science 410.19 Elsevier, 2009, pp. 1876–1902
  28. “Bandit algorithms” Cambridge University Press, 2020
  29. “Analysis of Thompson sampling for the multi-armed bandit problem” In Conference on learning theory, 2012, pp. 39.1–39.26 JMLR Workshop and Conference Proceedings
  30. “On upper-confidence bound policies for switching bandit problems” In International Conference on Algorithmic Learning Theory, 2011, pp. 174–188 Springer
  31. John White “Bandit algorithms for website optimization” O’Reilly Media, Inc., 2013
  32. “Comparing Epsilon greedy and Thompson sampling model for multi-armed bandit algorithm on marketing dataset” In Journal of Applied Data Sciences 2.2, 2021
  33. Shie Mannor and John N Tsitsiklis “The sample complexity of exploration in the multi-armed bandit problem” In Journal of Machine Learning Research 5.Jun, 2004, pp. 623–648
  34. Sandeep Pandey, Deepayan Chakrabarti and Deepak Agarwal “Multi-armed bandit problems with dependent arms” In Proceedings of the 24th international conference on Machine learning, 2007, pp. 721–728
  35. “Thompson sampling for contextual bandit problems with auxiliary safety constraints” In arXiv preprint arXiv:1911.00638, 2019
  36. “Learning for dose allocation in adaptive clinical trials with safety constraints” In International Conference on Machine Learning, 2020, pp. 8730–8740 PMLR
  37. Saba Q Yahyaa and Bernard Manderick “Thompson Sampling for Multi-Objective Multi-Armed Bandits Problem.” In ESANN, 2015
  38. Ambuj Tewari and Susan A Murphy “From ads to interventions: Contextual bandits in mobile health” In Mobile health: sensors, analytic methods, and applications Springer, 2017, pp. 495–517
  39. “A contextual-bandit-based approach for informed decision-making in clinical trials” In Life 12.8 MDPI, 2022, pp. 1277
  40. John O’Quigley, Margaret Pepe and Lloyd Fisher “Continual reassessment method: a practical design for phase 1 clinical trials in cancer” In Biometrics JSTOR, 1990, pp. 33–48
  41. Beat Neuenschwander, Michael Branson and Thomas Gsponer “Critical aspects of the Bayesian approach to phase I cancer trials” In Statistics in medicine 27.13 Wiley Online Library, 2008, pp. 2420–2439
  42. Hongtao Zhang, Alan Y Chiang and Jixian Wang “Improving the performance of Bayesian logistic regression model with overdose control in oncology dose-finding studies” In Statistics in Medicine 41.27 Wiley Online Library, 2022, pp. 5463–5483
  43. “A Bayesian industry approach to phase I combination trials in oncology” In Statistical methods in drug combination studies Chapman & Hall/CRC Press: Boca Raton, FL, 2015, pp. 95–135
  44. Maryam Aziz, Emilie Kaufmann and Marie-Karelle Riviere “On multi-armed bandit designs for dose-finding clinical trials” In The Journal of Machine Learning Research 22.1 JMLR.org, 2021, pp. 686–723
  45. Lan Jin, Guodong Pang and Demissie Alemayehu “Multiarmed Bandit Designs for Phase I Dose-Finding Clinical Trials With Multiple Toxicity Types” In Statistics in Biopharmaceutical Research 15.1 Taylor & Francis, 2023, pp. 164–177
  46. “Mobile-health: A review of current state in 2015” In Journal of biomedical informatics 56 Elsevier, 2015, pp. 265–272
  47. James M Rehg, Susan A Murphy and Santosh Kumar “Mobile health” Cham: Springer International Publishing, 2017
  48. Richard S Sutton and Andrew G Barto “Reinforcement learning: An introduction” MIT Press, 2018
  49. Martin L Puterman “Markov decision processes: discrete stochastic dynamic programming” John Wiley & Sons, 2014
  50. Scott Fujimoto, David Meger and Doina Precup “Off-policy deep reinforcement learning without exploration” In International conference on machine learning, 2019, pp. 2052–2062 PMLR
  51. Ashkan Ertefaie and Robert L Strawderman “Constructing dynamic treatment regimes over indefinite time horizons” In Biometrika 105.4 Oxford University Press, 2018, pp. 963–977
  52. “Estimating dynamic treatment regimes in mobile health using v-learning” In Journal of the American Statistical Association Taylor & Francis, 2019
  53. Christoph Dann, Gerhard Neumann and Jan Peters “Policy evaluation with temporal differences: A survey and comparison” In Journal of Machine Learning Research 15 Massachusetts Institute of Technology Press (MIT Press)/Microtome Publishing, 2014, pp. 809–883
  54. Wenzhuo Zhou, Ruoqing Zhu and Annie Qu “Estimating optimal infinite horizon dynamic treatment regimes via pt-learning” In Journal of the American Statistical Association Taylor & Francis, 2022, pp. 1–14
  55. Yuhan Li, Wenzhuo Zhou and Ruoqing Zhu “Quasi-optimal Reinforcement Learning with Continuous Actions” In The Eleventh International Conference on Learning Representations, 2023
  56. “Mobile health technology in the prevention and management of type 2 diabetes” In Indian journal of endocrinology and metabolism 21.2 Wolters Kluwer–Medknow Publications, 2017, pp. 334
  57. “The OhioT1DM dataset for blood glucose level prediction: Update 2020” In CEUR workshop proceedings 2675 NIH Public Access, 2020, pp. 71
  58. “Improving the estimation of mealtime insulin dose in adults with type 1 diabetes: the Normal Insulin Demand for Dose Adjustment (NIDDA) study” In Diabetes Care 34.10 Am Diabetes Assoc, 2011, pp. 2146–2151
  59. David Rodbard “Interpretation of continuous glucose monitoring data: glycemic variability and quality of glycemic control” In Diabetes technology & therapeutics 11.S1 Mary Ann Liebert, 2009, pp. S–55
  60. Wang Miao, Xu Shi and Eric Tchetgen Tchetgen “A confounding bridge approach for double negative control inference on causal effects” In arXiv preprint arXiv:1808.04945, 2018
  61. “Semiparametric proximal causal inference” In Journal of the American Statistical Association Taylor & Francis, 2023, pp. 1–12
  62. “Confounding-robust policy improvement” In Advances in neural information processing systems 31, 2018
  63. Jiayi Wang, Zhengling Qi and Chengchun Shi “Blessing from experts: Super reinforcement learning in confounded environments” In arXiv preprint arXiv:2209.15448, 2022
  64. “A minimax learning approach to off-policy evaluation in confounded partially observable markov decision processes” In International Conference on Machine Learning, 2022, pp. 20057–20094 PMLR
  65. “Bellman-consistent pessimism for offline reinforcement learning” In Advances in neural information processing systems 34, 2021, pp. 6683–6694
  66. Ying Jin, Zhuoran Yang and Zhaoran Wang “Is pessimism provably efficient for offline rl?” In International Conference on Machine Learning, 2021, pp. 5084–5096 PMLR
  67. “Pessimistic model-based offline reinforcement learning under partial coverage” In arXiv preprint arXiv:2107.06226, 2021
  68. Kamyar Ghasemipour, Shixiang Shane Gu and Ofir Nachum “Why so pessimistic? estimating uncertainties for offline rl through ensembles, and why their independence matters” In Advances in Neural Information Processing Systems 35, 2022, pp. 18267–18281
  69. Masatoshi Uehara, Chengchun Shi and Nathan Kallus “A review of off-policy evaluation in reinforcement learning” In arXiv preprint arXiv:2212.06355, 2022
  70. Philip Thomas, Georgios Theocharous and Mohammad Ghavamzadeh “High-confidence off-policy evaluation” In Proceedings of the AAAI Conference on Artificial Intelligence 29.1, 2015
  71. “Non-asymptotic confidence intervals of off-policy evaluation: Primal and dual bounds” In arXiv preprint arXiv:2103.05741, 2021
  72. “Statistical inference of the value function for reinforcement learning in infinite-horizon settings” In Journal of the Royal Statistical Society Series B: Statistical Methodology 84.3 Oxford University Press, 2022, pp. 765–793
  73. “Does the markov decision process fit the data: Testing for the markov property in sequential decision making” In International Conference on Machine Learning, 2020, pp. 8807–8817 PMLR
  74. “Constructing dynamic treatment regimes with shared parameters for censored data” In Statistics in medicine 39.9 Wiley Online Library, 2020, pp. 1250–1263
  75. “Multicategory angle-based learning for estimating optimal dynamic treatment regimes with censored data” In Journal of the American Statistical Association 117.539 Taylor & Francis, 2022, pp. 1438–1451
Authors (5)
  1. Yuhan Li (49 papers)
  2. Hongtao Zhang (17 papers)
  3. Keaven Anderson (3 papers)
  4. Songzi Li (18 papers)
  5. Ruoqing Zhu (23 papers)