MD-Manifold: A Medical-Distance-Based Representation Learning Approach for Medical Concept and Patient Representation (2305.00553v1)

Published 30 Apr 2023 in cs.LG

Abstract: Effectively representing medical concepts and patients is important for healthcare analytical applications. Representing medical concepts for healthcare analytical tasks requires incorporating medical domain knowledge and prior information from patient description data. Current methods, such as feature engineering and mapping medical concepts to standardized terminologies, have limitations in capturing the dynamic patterns from patient description data. Other embedding-based methods have difficulties in incorporating important medical domain knowledge and often require a large amount of training data, which may not be feasible for most healthcare systems. Our proposed framework, MD-Manifold, introduces a novel approach to medical concept and patient representation. It includes a new data augmentation approach, concept distance metric, and patient-patient network to incorporate crucial medical domain knowledge and prior data information. It then adapts manifold learning methods to generate medical concept-level representations that accurately reflect medical knowledge and patient-level representations that clearly identify heterogeneous patient cohorts. MD-Manifold also outperforms other state-of-the-art techniques in various downstream healthcare analytical tasks. Our work has significant implications in information systems research in representation learning, knowledge-driven machine learning, and using design science as middle-ground frameworks for downstream explorative and predictive analyses. Practically, MD-Manifold has the potential to create effective and generalizable representations of medical concepts and patients by incorporating medical domain knowledge and prior data information. It enables deeper insights into medical data and facilitates the development of new analytical applications for better healthcare outcomes.

Authors (3)
  1. Shaodong Wang
  2. Qing Li
  3. Wenli Zhang