
Learning the Graphical Structure of Electronic Health Records with Graph Convolutional Transformer (1906.04716v3)

Published 11 Jun 2019 in cs.LG and stat.ML

Abstract: Effective modeling of electronic health records (EHR) is rapidly becoming an important topic in both academia and industry. A recent study showed that using the graphical structure underlying EHR data (e.g. relationship between diagnoses and treatments) improves the performance of prediction tasks such as heart failure prediction. However, EHR data do not always contain complete structure information. Moreover, when it comes to claims data, structure information is completely unavailable to begin with. Under such circumstances, can we still do better than just treating EHR data as a flat-structured bag-of-features? In this paper, we study the possibility of jointly learning the hidden structure of EHR while performing supervised prediction tasks on EHR data. Specifically, we discuss that Transformer is a suitable basis model to learn the hidden EHR structure, and propose Graph Convolutional Transformer, which uses data statistics to guide the structure learning process. The proposed model consistently outperformed previous approaches empirically, on both synthetic data and publicly available EHR data, for various prediction tasks such as graph reconstruction and readmission prediction, indicating that it can serve as an effective general-purpose representation learning algorithm for EHR data.

Authors (7)
  1. Edward Choi (90 papers)
  2. Zhen Xu (76 papers)
  3. Yujia Li (54 papers)
  4. Michael W. Dusenberry (11 papers)
  5. Gerardo Flores (22 papers)
  6. Yuan Xue (59 papers)
  7. Andrew M. Dai (40 papers)
Citations (222)

Summary


The paper presents an innovative approach to learning the latent structure of Electronic Health Records (EHR) using a model referred to as Graph Convolutional Transformer (GCT). By leveraging the graphical nature of EHR data, the authors aim to improve the performance on a variety of prediction tasks such as readmission prediction and graph reconstruction, especially in scenarios where explicit structure information is incomplete or unavailable.

Problem Context and Motivation

EHR data are inherently structured, with complex interrelations between diagnoses, treatments, and outcomes. Conventional methods often simplify these data by treating each encounter as a bag of features, which discards this relational information and can undermine the performance of predictive models. Previous efforts, such as the MiME model, highlighted the importance of leveraging encounter structures but assumed the availability of complete structural information, which is typically not the case in practice.
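To make the information loss concrete, here is a toy illustration (with hypothetical medical codes, not from the paper): two clinically different encounters collapse to the exact same vector once diagnosis-treatment pairings are flattened into a bag of features.

```python
import numpy as np

# Hypothetical vocabulary of medical codes (illustrative only)
vocab = ["dx:fever", "dx:fracture", "rx:antibiotic", "rx:cast"]
index = {code: i for i, code in enumerate(vocab)}

def bag_of_features(codes):
    """Multi-hot vector: records which codes occurred, discarding all relations."""
    v = np.zeros(len(vocab))
    for code in codes:
        v[index[code]] = 1.0
    return v

# Two encounters with different diagnosis-treatment structure...
e1 = [("dx:fever", "rx:antibiotic"), ("dx:fracture", "rx:cast")]
e2 = [("dx:fever", "rx:cast"), ("dx:fracture", "rx:antibiotic")]  # implausible pairing

flatten = lambda enc: [c for pair in enc for c in pair]

# ...become indistinguishable under the bag-of-features view:
assert np.array_equal(bag_of_features(flatten(e1)), bag_of_features(flatten(e2)))
```

A model that only sees the multi-hot vector cannot tell which treatment was given for which diagnosis, which is exactly the structure GCT tries to recover.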

Introducing Graph Convolutional Transformer (GCT)

The GCT aims to jointly learn the hidden structure of EHR from limited structural information while performing predictive tasks. The model is built upon the Transformer architecture, renowned for its prowess in sequence modeling, and extends it by integrating graph convolution techniques to better handle the structural properties of EHR data. The authors introduce several key modifications:

  1. Attention Masks: Masks derived from domain knowledge restrict the connections that self-attention may consider, so that medically implausible ones are ignored.
  2. Prior Conditional Probabilities: Conditional probabilities computed from data statistics (e.g., how often a treatment co-occurs with a diagnosis) guide the self-attention mechanism toward likely connections between medical codes.
  3. KL-Divergence Penalty: A KL-divergence term penalizes attention distributions that stray far from the conditional-probability guide, keeping the learned connections medically plausible.
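The three modifications above can be sketched together in one attention step. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function names, shapes, and the single-head formulation are simplifications. The mask forbids implausible edges, the prior can replace learned attention (as in the first block), and a KL term measures how far learned attention drifts from the guide.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def gct_attention(X, Wq, Wk, Wv, mask, prior, use_prior=False):
    """One attention step in the spirit of GCT (single head, one encounter).

    X:     (n, d) embeddings of the n medical codes in an encounter
    mask:  (n, n) 1 where a connection is allowed, 0 where implausible
    prior: (n, n) row-stochastic conditional-probability guide
    """
    if use_prior:
        A = prior                                    # e.g. first block: attend by the prior
    else:
        scores = (X @ Wq) @ (X @ Wk).T / np.sqrt(Wq.shape[1])
        scores = np.where(mask > 0, scores, -1e9)    # attention mask: forbid implausible edges
        A = softmax(scores, axis=-1)
    return A @ (X @ Wv), A

def kl_penalty(prior, attn, eps=1e-9):
    """Row-wise KL(prior || attn), summed -- regularizes attention toward the guide."""
    return float(np.sum(prior * (np.log(prior + eps) - np.log(attn + eps))))

# Tiny usage sketch with random weights
rng = np.random.default_rng(0)
n, d = 3, 4
X = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
mask = np.ones((n, n))
prior = np.full((n, n), 1.0 / n)

out, A = gct_attention(X, Wq, Wk, Wv, mask, prior)
loss_reg = kl_penalty(prior, A)  # added to the task loss during training
```

In training, the KL term would be weighted and added to the prediction loss, so gradient descent trades off task accuracy against staying close to the statistically derived structure.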

Experimental Evaluation

Experiments on both synthetic data and real-world data (the eICU dataset) demonstrate the superior performance of GCT over baselines that either disregard structural information or rely entirely on data statistics. Across tasks such as graph reconstruction and readmission prediction, GCT consistently outperforms shallow and deep neural networks as well as plain graph convolutional networks lacking GCT's enhancements.

  • Synthetic Data Evaluation: The synthetic dataset mimics the structured form of EHR, verifying GCT's ability to reconstruct the underlying graph accurately and to distinguish between subtle differences, as seen in diagnosis-treatment classification tasks.
  • Real-world Data Evaluation: On the eICU dataset, GCT improved performance on prediction tasks such as readmission and mortality prediction, showcasing its practical utility in clinical settings where structural information is sparse.

Practical and Theoretical Implications

The introduction of GCT opens up new avenues for enhancing EHR data processing. Practically, the model provides healthcare research with a more nuanced toolset for predicting patient outcomes by capitalizing on the implicit structures in EHR data. Theoretically, it bridges the gap between graph-based and sequence-based modeling methods and highlights the potential of dynamic structure learning in domains beyond healthcare.

Future Directions

The paper suggests integrating GCT with sequence aggregators like RNNs for patient-level predictions, paving the way for broader applications such as chronic disease modeling and acute event prediction. Further research may focus on refining attention mechanisms to extract more specific medical insights, potentially enhancing interpretability alongside predictive performance.

In conclusion, this work signifies a substantial step forward in medical informatics, enabling more effective utilization of EHR data by recognizing and exploiting their implicit graphical structure through advanced transformer-based techniques.