MiME: Multilevel Medical Embedding of Electronic Health Records for Predictive Healthcare (1810.09593v1)

Published 22 Oct 2018 in cs.LG, cs.CL, and stat.ML

Abstract: Deep learning models exhibit state-of-the-art performance for many predictive healthcare tasks using electronic health records (EHR) data, but these models typically require training data volume that exceeds the capacity of most healthcare systems. External resources such as medical ontologies are used to bridge the data volume constraint, but this approach is often not directly applicable or useful because of inconsistencies with terminology. To solve the data insufficiency challenge, we leverage the inherent multilevel structure of EHR data and, in particular, the encoded relationships among medical codes. We propose Multilevel Medical Embedding (MiME) which learns the multilevel embedding of EHR data while jointly performing auxiliary prediction tasks that rely on this inherent EHR structure without the need for external labels. We conducted two prediction tasks, heart failure prediction and sequential disease prediction, where MiME outperformed baseline methods in diverse evaluation settings. In particular, MiME consistently outperformed all baselines when predicting heart failure on datasets of different volumes, especially demonstrating the greatest performance improvement (15% relative gain in PR-AUC over the best baseline) on the smallest dataset, demonstrating its ability to effectively model the multilevel structure of EHR data.

Summary

The paper presents MiME, a novel model that leverages EHRs' hierarchical structure to improve predictive accuracy and address data sparsity.
The model captures interactions between diagnosis and treatment codes, achieving a 15% relative PR-AUC gain over the best baseline in heart failure prediction on the smallest dataset.
MiME’s effective operation in low-data scenarios paves the way for enhanced clinical decision-making and future integration of additional EHR layers.

MiME: Multilevel Medical Embedding of Electronic Health Records for Predictive Healthcare

The paper introduces Multilevel Medical Embedding (MiME), a model that exploits the multilevel structure of Electronic Health Records (EHRs) to improve predictive healthcare tasks. Deep learning models typically need more training data than most healthcare systems can supply. MiME addresses this constraint by using the relationships already encoded in EHR data, specifically those among medical codes, rather than relying on external resources such as medical ontologies.

Overview and Methodology

The MiME framework benefits from the hierarchical nature of EHRs, which encompasses multiple layers such as patient visits, diagnoses during these visits, and subsequent treatment orders. The model explicitly captures the interaction between diagnosis codes and treatment codes within patient visits, which distinguishes MiME from previous approaches that typically flatten these structures into non-hierarchical data formats.
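To make the contrast between hierarchical and flattened representations concrete, the following is a minimal sketch of a nested visit record. The class names (`Diagnosis`, `Visit`), the `flatten` helper, and the code strings are all hypothetical and chosen for illustration; they are not taken from the paper or any dataset.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Diagnosis:
    """One diagnosis code plus the treatment codes ordered for it (hypothetical schema)."""
    code: str
    treatments: List[str] = field(default_factory=list)


@dataclass
class Visit:
    """One patient visit: a list of diagnoses, each carrying its own treatments."""
    diagnoses: List[Diagnosis]


def flatten(visit: Visit) -> Set[str]:
    """Collapse a visit into an unordered set of codes, discarding which
    treatment belonged to which diagnosis."""
    codes: Set[str] = set()
    for dx in visit.diagnoses:
        codes.add(dx.code)
        codes.update(dx.treatments)
    return codes


# Two visits with the same codes but different diagnosis-treatment pairings.
# The code strings are made up for this example.
visit_a = Visit([
    Diagnosis("DX_cough", ["RX_benzonatate"]),
    Diagnosis("DX_fever", ["RX_acetaminophen"]),
])
visit_b = Visit([
    Diagnosis("DX_cough", ["RX_acetaminophen"]),
    Diagnosis("DX_fever", ["RX_benzonatate"]),
])

# The flattened views are identical, while the structured views differ.
print(flatten(visit_a) == flatten(visit_b))
print(visit_a == visit_b)
```

The flattened sets compare equal even though the two visits pair treatments with different diagnoses, which is the information a flat representation loses.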

The core methodology learns multilevel embeddings of EHR data while jointly training on auxiliary prediction tasks. These auxiliary tasks rely only on the structure already present in the EHR, so they require no external labels. The additional training signal helps the model cope with limited data and supports the main tasks, such as heart failure prediction.
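As a rough illustration of how an embedding can be built level by level and paired with a label-free auxiliary objective, here is a simplified NumPy sketch. It is not the paper's formulation: the element-wise interaction, the summation at each level, the two linear prediction heads, the vocabulary, and every name below are assumptions made for this example, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy vocabularies; indices stand in for medical codes.
DX_VOCAB = ["DX_cough", "DX_fever"]
RX_VOCAB = ["RX_benzonatate", "RX_acetaminophen"]
DIM = 8

# Randomly initialised parameters (a real model would learn these).
dx_emb = rng.normal(size=(len(DX_VOCAB), DIM))
rx_emb = rng.normal(size=(len(RX_VOCAB), DIM))
W_dx_head = rng.normal(size=(DIM, len(DX_VOCAB)))
W_rx_head = rng.normal(size=(DIM, len(RX_VOCAB)))


def embed_diagnosis(dx, treatments):
    """Diagnosis-level embedding: the diagnosis vector plus one element-wise
    interaction term per treatment ordered for that diagnosis."""
    d = dx_emb[dx]
    out = d.copy()
    for t in treatments:
        out += d * rx_emb[t]
    return out


def embed_visit(visit):
    """Visit-level embedding: sum of the diagnosis-level embeddings."""
    return np.sum([embed_diagnosis(dx, rx) for dx, rx in visit], axis=0)


def auxiliary_loss(dx, treatments):
    """Auxiliary objective: recover the diagnosis code (softmax cross-entropy)
    and its treatment codes (per-code binary cross-entropy) from the
    diagnosis-level embedding. The targets come from the record itself."""
    o = embed_diagnosis(dx, treatments)
    logits = o @ W_dx_head
    logits = logits - logits.max()
    p_dx = np.exp(logits) / np.exp(logits).sum()
    dx_loss = -np.log(max(p_dx[dx], 1e-12))

    p_rx = np.clip(1.0 / (1.0 + np.exp(-(o @ W_rx_head))), 1e-12, 1 - 1e-12)
    target = np.zeros(len(RX_VOCAB))
    for t in treatments:
        target[t] = 1.0
    rx_loss = -np.sum(target * np.log(p_rx) + (1 - target) * np.log(1 - p_rx))
    return dx_loss + rx_loss


# A visit is a list of (diagnosis index, [treatment indices]) pairs.
visit_a = [(0, [0]), (1, [1])]
visit_b = [(0, [1]), (1, [0])]  # same codes, different pairing

v_a, v_b = embed_visit(visit_a), embed_visit(visit_b)
print(v_a.shape, np.allclose(v_a, v_b))
print(sum(auxiliary_loss(dx, rx) for dx, rx in visit_a))
```

Because the interaction terms depend on which treatment is attached to which diagnosis, the two visits receive different embeddings even though they contain the same codes. In training, an auxiliary loss of this kind would be added to the main task loss.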

Empirical Evaluation

The authors evaluate MiME on two prediction tasks: heart failure prediction and sequential disease prediction. MiME outperforms the baseline methods across diverse evaluation settings, with the largest gains in low-data settings.

A particularly notable result comes from heart failure prediction on datasets of different sizes. MiME consistently outperforms all baselines, and on the smallest dataset it achieves a 15% relative gain in PR-AUC over the best-performing baseline. The authors attribute this to MiME's ability to model the multilevel structure of EHR data, which matters most when data is scarce.
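For readers unfamiliar with the metric, the sketch below shows how a relative gain differs from an absolute one, and how PR-AUC can be estimated with average precision. The scores and labels are invented for illustration and are not results from the paper. Average precision is one common estimator of PR-AUC; the source does not state which estimator the authors used.

```python
def relative_gain(model_score: float, baseline_score: float) -> float:
    """Relative improvement of the model over the baseline."""
    return (model_score - baseline_score) / baseline_score


def average_precision(y_true, y_score) -> float:
    """Average precision: mean of the precision values measured at the rank
    of each positive example, after sorting by descending score."""
    order = sorted(range(len(y_score)), key=lambda i: y_score[i], reverse=True)
    hits = 0
    precisions = []
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


# Hypothetical PR-AUC values, chosen only to illustrate the arithmetic.
baseline_pr_auc = 0.20
model_pr_auc = 0.23
print(f"absolute gain: {model_pr_auc - baseline_pr_auc:.2f}")
print(f"relative gain: {relative_gain(model_pr_auc, baseline_pr_auc):.1%}")

# Toy labels and scores: positives sit at ranks 1 and 3 after sorting.
y_true = [1, 0, 1, 0]
y_score = [0.9, 0.8, 0.7, 0.1]
print(f"average precision: {average_precision(y_true, y_score):.4f}")
```

With these made-up numbers, an absolute difference of 0.03 corresponds to a 15% relative gain. A relative gain of that size can reflect a small absolute difference when the baseline PR-AUC is low, which is common for rare outcomes.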

Implications and Future Work

MiME advances the field of health informatics by providing a framework that not only improves predictive accuracy but also enhances the generalization capabilities of models dealing with hierarchical data structures. The implications extend to potentially improved patient outcomes, more efficient resource allocation in healthcare systems, and better-informed clinical decision-making processes.

Future directions for MiME include extending the model to incorporate additional layers of EHR data, such as demographic data and more granular medical events. These enhancements could broaden the scope and accuracy of predictions, aligning with the ongoing advancement of AI applications in healthcare.

Conclusion

The paper contributes a significant methodological innovation in the domain of healthcare AI applications by exploiting the hierarchy and multilevel interactions within EHR data. MiME's superior performance in predictive tasks with limited data availability marks an important step towards overcoming one of the fundamental challenges faced by deep learning models in healthcare settings. This work paves the way for future research aimed at improving model robustness and precision through deeper understanding and utilization of the structural properties inherent in EHRs.
