
Reinforced Sequential Decision-Making for Sepsis Treatment: The POSNEGDM Framework with Mortality Classifier and Transformer (2403.07309v1)

Published 12 Mar 2024 in cs.LG, cs.AI, and cs.CY

Abstract: Sepsis, a life-threatening condition triggered by the body's exaggerated response to infection, demands urgent intervention to prevent severe complications. Existing machine learning methods for managing sepsis struggle in offline scenarios, exhibiting suboptimal performance with survival rates below 50%. This paper introduces POSNEGDM ("Reinforcement Learning with Positive and Negative Demonstrations for Sequential Decision-Making"), a framework that uses an innovative transformer-based model and a feedback reinforcer to replicate expert actions while considering individual patient characteristics. A mortality classifier with 96.7% accuracy guides treatment decisions towards positive outcomes. The POSNEGDM framework significantly improves patient survival, saving 97.39% of patients and outperforming established machine learning algorithms (Decision Transformer and Behavioral Cloning), which achieve survival rates of 33.4% and 43.5%, respectively. Additionally, ablation studies underscore the critical role of the transformer-based decision maker and the integration of a mortality classifier in enhancing overall survival rates. In summary, our proposed approach presents a promising avenue for enhancing sepsis treatment outcomes, contributing to improved patient care and reduced healthcare costs.

