DrFuse: Learning Disentangled Representation for Clinical Multi-Modal Fusion with Missing Modality and Modal Inconsistency (2403.06197v1)
Abstract: The combination of electronic health records (EHR) and medical images is crucial for clinicians in making diagnoses and forecasting prognosis. Strategically fusing these two data modalities has great potential to improve the accuracy of machine learning models in clinical prediction tasks. However, the asynchronous and complementary nature of EHR and medical images presents unique challenges. Missing modalities due to clinical and administrative factors are inevitable in practice, and the significance of each data modality varies depending on the patient and the prediction target, resulting in inconsistent predictions and suboptimal model performance. To address these challenges, we propose DrFuse to achieve effective clinical multi-modal fusion. It tackles the missing modality issue by disentangling the features shared across modalities and those unique within each modality. Furthermore, we address the modal inconsistency issue via a disease-wise attention layer that produces the patient- and disease-wise weighting for each modality to make the final prediction. We validate the proposed method using real-world large-scale datasets, MIMIC-IV and MIMIC-CXR. Experimental results show that the proposed method significantly outperforms the state-of-the-art models. Our implementation is publicly available at https://github.com/dorothy-yao/drfuse.
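The disease-wise weighting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (that is at the repository linked above), and everything named here is a hypothetical assumption: the function `disease_wise_fusion`, the representation names `shared`, `ehr_distinct`, and `cxr_distinct`, the per-disease query vectors, and the scaled dot-product scoring. The sketch only shows the general idea that each disease gets its own attention weights over whichever representations are available, so a missing chest X-ray simply drops out of the weighting.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def disease_wise_fusion(features, disease_queries):
    """Fuse the available representations separately for each disease.

    features: dict mapping a representation name to a vector, or to None
        when that representation is missing (assumed layout, not from the paper).
    disease_queries: dict mapping a disease name to a query vector
        (hypothetical stand-ins for learned parameters).
    Returns a dict mapping each disease to (weights, fused_vector).
    """
    # Keep only the representations that are actually present.
    available = {n: v for n, v in features.items() if v is not None}
    if not available:
        raise ValueError("at least one representation must be available")

    names = list(available)
    dim = len(available[names[0]])
    scale = math.sqrt(dim)

    results = {}
    for disease, query in disease_queries.items():
        # One attention score per available representation, per disease.
        scores = [dot(query, available[n]) / scale for n in names]
        weights = softmax(scores)
        # Weighted sum of the available representations.
        fused = [
            sum(w * available[n][i] for w, n in zip(weights, names))
            for i in range(dim)
        ]
        results[disease] = (dict(zip(names, weights)), fused)
    return results


# Toy example: the chest X-ray is missing for this patient.
patient = {
    "shared": [0.2, 0.1, 0.4, 0.3],
    "ehr_distinct": [0.9, 0.0, 0.1, 0.2],
    "cxr_distinct": None,
}
queries = {
    "pneumonia": [0.1, 0.5, 0.9, 0.2],
    "diabetes": [1.0, 0.0, 0.1, 0.3],
}
for disease, (weights, fused) in disease_wise_fusion(patient, queries).items():
    print(disease, weights)
```

Because the weights are computed per disease, the same patient can lean on different representations for different prediction targets, which is the behavior the abstract attributes to the disease-wise attention layer.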
Authors:
- Kejing Yin
- William K. Cheung
- Jia Liu
- Jing Qin
- Wenfang Yao