XAI for In-hospital Mortality Prediction via Multimodal ICU Data (2312.17624v1)
Abstract: Predicting in-hospital mortality for intensive care unit (ICU) patients is critical to clinical outcomes. AI models have shown strong predictive accuracy but often lack explainability. To address this issue, this paper proposes an eXplainable Multimodal Mortality Predictor (X-MMP), an efficient, explainable AI solution for predicting in-hospital mortality from multimodal ICU data. We employ multimodal learning in our framework, which can receive heterogeneous inputs from clinical data and make decisions accordingly. Furthermore, we introduce an explainable method, namely Layer-Wise Propagation to Transformer, as a proper extension of layer-wise relevance propagation (LRP) to Transformers, producing explanations over multimodal inputs and revealing the salient features that drive predictions. Moreover, the contribution of each modality to the clinical outcome can be visualized, helping clinicians understand the reasoning behind the model's decisions. We construct a multimodal dataset based on MIMIC-III and the MIMIC-III Waveform Database Matched Subset. Comprehensive experiments on benchmark datasets demonstrate that the proposed framework achieves reasonable interpretations with competitive prediction accuracy. In particular, the framework can be easily transferred to other clinical tasks, facilitating the discovery of crucial factors in healthcare research.
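The core idea behind LRP, which the abstract's Layer-Wise Propagation to Transformer builds on, is to redistribute a model's output score backward layer by layer so that each input feature receives a relevance share proportional to its contribution, while the total relevance is (approximately) conserved. The paper's full Transformer propagation rules are not reproduced here; the following is only a minimal sketch of the standard epsilon-rule for a single linear layer (the `lrp_linear` helper and the toy shapes are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def lrp_linear(x, W, b, R_out, eps=1e-6):
    """Epsilon-rule LRP for a linear layer y = x @ W + b.

    Redistributes the output relevance R_out to the inputs in
    proportion to each input's contribution z_ij = x_i * W_ij,
    with a small eps stabilizer to avoid division by zero.
    """
    z = x @ W + b                       # forward pre-activations
    s = R_out / (z + eps * np.sign(z))  # stabilized relevance ratio per output
    return x * (s @ W.T)                # relevance assigned to each input

# Toy example: 3 input features -> 2 output logits
rng = np.random.default_rng(0)
x = rng.normal(size=3)
W = rng.normal(size=(3, 2))
b = np.zeros(2)                 # zero bias keeps relevance conservation exact
R_out = np.array([1.0, 0.0])    # all relevance placed on the target class
R_in = lrp_linear(x, W, b, R_out)
```

With zero bias the input relevances sum back to the output relevance, which is the conservation property that makes LRP-style attributions comparable across features and, in the multimodal setting, across modalities.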