Methodology for Interpretable Reinforcement Learning for Optimizing Mechanical Ventilation (2404.03105v2)
Abstract: Mechanical ventilation is a critical life support intervention that delivers controlled air and oxygen to a patient's lungs, assisting or replacing spontaneous breathing. While several data-driven approaches have been proposed to optimize ventilator control strategies, they often lack interpretability and alignment with domain knowledge, hindering clinical adoption. This paper presents a methodology for interpretable reinforcement learning (RL) aimed at improving mechanical ventilation control as part of connected health systems. Using a causal, nonparametric model-based off-policy evaluation, we assess RL policies for their ability to enhance patient-specific outcomes, specifically increasing blood oxygen levels (SpO2) while avoiding aggressive ventilator settings that may cause ventilator-induced lung injuries and other complications. Through numerical experiments on real-world ICU data from the MIMIC-III database, we demonstrate that our interpretable decision tree policy achieves performance comparable to state-of-the-art deep RL methods while outperforming standard behavior cloning approaches. The results highlight the potential of interpretable, data-driven decision support systems to improve safety and efficiency in personalized ventilation strategies, paving the way for seamless integration into connected healthcare environments.
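To make "nonparametric model-based off-policy evaluation" concrete, the following is a minimal sketch rather than the authors' implementation. The abstract does not name the estimator, so the sketch assumes a Gaussian-kernel Nadaraya-Watson regression as the one-step model. The function names, the toy transitions, the bandwidth, and the two candidate policies are all hypothetical. It also omits the causal adjustment the abstract mentions.

```python
import math

def nw_predict(X, y, query, bandwidth=1.0):
    """Nadaraya-Watson estimate of E[y | x = query] with a Gaussian kernel."""
    weights = []
    for x in X:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, query))
        weights.append(math.exp(-sq_dist / (2.0 * bandwidth ** 2)))
    total = sum(weights)
    if total == 0.0:
        # Query is far from every sample: fall back to the unweighted mean.
        return sum(y) / len(y)
    return sum(w * t for w, t in zip(weights, y)) / total

def evaluate_policy(policy, X, y, start_states, horizon=5, bandwidth=1.0):
    """Average final SpO2 after rolling a policy through the kernel model."""
    finals = []
    for spo2 in start_states:
        for _ in range(horizon):
            action = policy(spo2)  # assumed action: FiO2 in percent
            spo2 = nw_predict(X, y, (spo2, action), bandwidth)
        finals.append(spo2)
    return sum(finals) / len(finals)

# Hypothetical toy transitions: (SpO2 %, FiO2 %) -> next SpO2 %.
X = [(90.0, 30.0), (90.0, 60.0), (95.0, 30.0), (95.0, 60.0)]
y = [90.0, 94.0, 94.0, 97.0]

# Two hypothetical candidate policies: a fixed setting and a one-split rule,
# the latter standing in for a (very small) decision tree.
fixed_low = lambda spo2: 30.0
threshold_rule = lambda spo2: 60.0 if spo2 < 93.0 else 30.0

score_fixed = evaluate_policy(fixed_low, X, y, [90.0, 95.0], bandwidth=5.0)
score_rule = evaluate_policy(threshold_rule, X, y, [90.0, 95.0], bandwidth=5.0)
```

Comparing `score_fixed` with `score_rule` shows the general pattern: candidate policies are ranked by simulated outcomes under a model fitted to logged data, without deploying them on patients. A realistic evaluation would also penalize aggressive settings and account for confounding.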