Representation Learning of EHR Data via Graph-Based Medical Entity Embedding (1910.02574v1)
Published 7 Oct 2019 in cs.LG, cs.IR, and stat.ML
Abstract: Automatic representation learning of key entities in electronic health record (EHR) data is a critical step for healthcare informatics that turns heterogeneous medical records into structured and actionable information. Here we propose ME2Vec, an algorithmic framework for learning low-dimensional vectors of the most common entities in EHR: medical services, doctors, and patients. ME2Vec leverages diverse graph embedding techniques to cater to the unique characteristics of each medical entity. Using real-world clinical data, we demonstrate the efficacy of ME2Vec over competitive baselines on disease diagnosis prediction.