Next Visit Diagnosis Prediction via Medical Code-Centric Multimodal Contrastive EHR Modelling with Hierarchical Regularisation (2401.11648v5)
Abstract: Predicting the next-visit diagnosis from Electronic Health Records (EHR) is an essential task in healthcare, critical for devising proactive plans for both healthcare providers and patients. Nonetheless, many preceding studies have not sufficiently addressed the heterogeneous and hierarchical characteristics inherent in EHR data, inevitably leading to sub-optimal performance. To this end, we propose NECHO, a novel medical code-centric multimodal contrastive EHR learning framework with hierarchical regularisation. First, we integrate multifaceted information encompassing medical codes, demographics, and clinical notes using a tailored network design and a pair of bimodal contrastive losses, all of which pivot around the medical code representation. We also regularise the modality-specific encoders using parent-level information from the medical ontology so that they learn the hierarchical structure of EHR data. A series of experiments on MIMIC-III data demonstrates the effectiveness of our approach.
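As a rough illustration of what "a pair of bimodal contrastive losses" pivoting around the medical code representation could look like, the following sketch pairs code embeddings with demographic embeddings and with note embeddings, and deliberately omits a demographics-to-notes term. It is not the paper's implementation: the InfoNCE form, the symmetric averaging, the temperature value, and every function name here are assumptions made for illustration only.

```python
import math


def _normalise(v):
    # L2-normalise a vector; fall back to 1.0 to avoid dividing by zero.
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def info_nce(anchors, others, temperature=0.1):
    """One-directional InfoNCE: row i of `anchors` should match row i of `others`."""
    a = [_normalise(v) for v in anchors]
    b = [_normalise(v) for v in others]
    loss = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of patient i's anchor with every patient's other-modality vector.
        logits = [sum(x * y for x, y in zip(ai, bj)) / temperature for bj in b]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_denominator - logits[i]
    return loss / len(a)


def bimodal_contrastive(code, other, temperature=0.1):
    """Symmetric loss between code embeddings and one other modality (assumed form)."""
    return 0.5 * (info_nce(code, other, temperature) + info_nce(other, code, temperature))


def code_centric_loss(code, demo, note, temperature=0.1):
    """Sum of the two code-anchored pairs: (code, demographics) and (code, notes)."""
    return (bimodal_contrastive(code, demo, temperature)
            + bimodal_contrastive(code, note, temperature))


# Toy batch of two patients with 2-dimensional embeddings per modality.
code = [[1.0, 0.0], [0.0, 1.0]]
aligned = code_centric_loss(code, code, code)
mismatched = code_centric_loss(code, code[::-1], code[::-1])
print(aligned < mismatched)
```

In this toy batch the loss is smaller when each patient's demographic and note embeddings point in the same direction as that patient's code embedding than when they are swapped between patients, which is the behaviour a code-anchored contrastive objective is meant to encourage.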
Author: Heejoon Koo