
Deep Attention Q-Network for Personalized Treatment Recommendation (2307.01519v1)

Published 4 Jul 2023 in cs.LG and cs.AI

Abstract: Tailoring treatment to individual patients is crucial yet challenging for achieving optimal healthcare outcomes. Recent advances in reinforcement learning offer promising personalized treatment recommendations; however, these methods rely solely on a patient's current observations (vital signs, demographics) as the state, which may not accurately represent the patient's true health status. This limitation hampers policy learning and evaluation, ultimately limiting treatment effectiveness. In this study, we propose the Deep Attention Q-Network for personalized treatment recommendation, which uses the Transformer architecture within a deep reinforcement learning framework to efficiently incorporate all past patient observations. We evaluated the model on real-world sepsis and acute hypotension cohorts, demonstrating its superiority over state-of-the-art models. The source code for our model is available at https://github.com/stevenmsm/RL-ICU-DAQN.
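The core idea — self-attention over the full history of patient observations to produce a state summary, followed by a Q-value head with one output per treatment action — can be sketched minimally as below. This is an illustrative, untrained toy with hypothetical dimensions (`obs_dim`, `d_model`, `n_actions`, `T`) and random weights; the actual model in the linked repository is a full Transformer trained with deep Q-learning on ICU cohorts.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)

# Hypothetical sizes: observation features, embedding width,
# number of treatment actions, and history length.
obs_dim, d_model, n_actions, T = 6, 16, 4, 10

# Randomly initialized parameters (Q-learning training loop not shown).
W_in  = rng.normal(size=(obs_dim, d_model)) * 0.1
W_q   = rng.normal(size=(d_model, d_model)) * 0.1
W_k   = rng.normal(size=(d_model, d_model)) * 0.1
W_v   = rng.normal(size=(d_model, d_model)) * 0.1
W_out = rng.normal(size=(d_model, n_actions)) * 0.1

def q_values(history):
    """history: (T, obs_dim) array of all past patient observations."""
    h = history @ W_in                           # embed each observation
    q, k, v = h @ W_q, h @ W_k, h @ W_v          # self-attention projections
    attn = softmax(q @ k.T / np.sqrt(d_model))   # attend over the full history
    ctx = (attn @ v)[-1]                         # state summary at latest step
    return ctx @ W_out                           # one Q-value per action

history = rng.normal(size=(T, obs_dim))
q = q_values(history)
action = int(np.argmax(q))  # greedy treatment recommendation
```

Because attention spans the entire trajectory rather than only the most recent observation, the state summary can reflect trends (e.g. deteriorating vitals) that a single-timestep state would miss.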

