ICU-TSB: A Benchmark for Temporal Patient Representation Learning for Unsupervised Stratification into Patient Cohorts (2506.06192v1)

Published 6 Jun 2025 in cs.LG

Abstract: Patient stratification identifying clinically meaningful subgroups is essential for advancing personalized medicine through improved diagnostics and treatment strategies. Electronic health records (EHRs), particularly those from intensive care units (ICUs), contain rich temporal clinical data that can be leveraged for this purpose. In this work, we introduce ICU-TSB (Temporal Stratification Benchmark), the first comprehensive benchmark for evaluating patient stratification based on temporal patient representation learning using three publicly available ICU EHR datasets. A key contribution of our benchmark is a novel hierarchical evaluation framework utilizing disease taxonomies to measure the alignment of discovered clusters with clinically validated disease groupings. In our experiments with ICU-TSB, we compared statistical methods and several recurrent neural networks, including LSTM and GRU, for their ability to generate effective patient representations for subsequent clustering of patient trajectories. Our results demonstrate that temporal representation learning can rediscover clinically meaningful patient cohorts; nevertheless, it remains a challenging task, with v-measure varying from up to 0.46 at the top level of the taxonomy to up to 0.40 at the lowest level. To further enhance the practical utility of our findings, we also evaluate multiple strategies for assigning interpretable labels to the identified clusters. The experiments and benchmark are fully reproducible and available at https://github.com/ds4dh/CBMS2025stratification.

Summary

  • The paper introduces an unsupervised evaluation framework that leverages temporal patient representation learning from electronic health records.
  • The study finds that recurrent neural networks, especially LSTMs, capture non-linear temporal patterns more effectively than statistical methods in many scenarios.
  • Among the label-assignment strategies tested (centroid, medoid, and majority vote), majority vote performs best, suggesting that the discovered cohorts align with the most common diagnostic categories.

ICU-TSB: A Benchmark for Temporal Patient Representation Learning

The paper "ICU-TSB: A Benchmark for Temporal Patient Representation Learning for Unsupervised Stratification into Patient Cohorts" introduces an evaluation framework designed specifically for patient stratification in intensive care units (ICUs), leveraging electronic health records (EHRs). The authors present ICU-TSB, a benchmark for assessing how well temporal patient representation models stratify patients into clinically relevant cohorts without labeled training data.

Background and Objectives

Patient stratification—segmentation of patient populations into subgroups with similar clinical profiles—is critical for advancing personalized medicine. The abundance of temporal data contained within EHRs provides a ripe opportunity for machine learning models to identify these subgroups. Prior work on such stratification often relies on supervised approaches, which necessitate labeled datasets. However, labeled data are not always available, limiting the application of these methods to broader populations.

The ICU-TSB benchmark sets out to evaluate unsupervised stratification specifically, using representation learning techniques to derive patient representations from raw EHR data. It draws on three publicly available ICU datasets and uses hierarchical disease taxonomies to measure how well the discovered clusters match clinically validated groupings.

Methodology

The authors use three ICU datasets: MIMIC-IV, eICU, and SiC, which include features such as vitals, demographics, and clinical assessments. The data were preprocessed with normalization and other standard steps to ensure uniformity across datasets. Patient representations were learned with both statistical methods and deep learning architectures, specifically LSTMs and GRUs.
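To make the idea of a statistical representation concrete, here is a minimal sketch that collapses one stay's time series into a fixed-length vector of per-feature summary statistics. The choice of statistics (mean, standard deviation, minimum, maximum), the function name `summarize_stay`, and the feature names are assumptions for illustration; the summary above does not specify which statistical features the paper actually uses.

```python
from statistics import mean, pstdev


def summarize_stay(series):
    """Collapse one stay's time series into a fixed-length vector.

    `series` maps a feature name to its list of measurements over time.
    Each feature contributes (mean, std, min, max). Features are visited
    in sorted name order so every stay yields the same layout.
    NOTE: this feature set is an illustrative assumption, not the paper's.
    """
    vector = []
    for name in sorted(series):
        values = series[name]
        vector.extend([mean(values), pstdev(values), min(values), max(values)])
    return vector


# Hypothetical toy stay with two vitals sampled three times each.
stay = {"heart_rate": [80, 90, 100], "spo2": [97, 95, 96]}
embedding = summarize_stay(stay)
print(len(embedding))  # 4 statistics for each of the 2 features
```

A vector like this discards the ordering of measurements, which is one plausible reason a recurrent model that reads the sequence step by step could capture patterns that summary statistics miss.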

Each baseline generates patient embeddings, which are then clustered to rediscover disease categories according to the ICD and CCS taxonomies. The rediscovery is hierarchical, spanning four taxonomy levels, with clusters and their assigned labels evaluated against the existing medical classifications at each level.
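Evaluating at several taxonomy levels amounts to rolling each patient's most specific diagnosis up to its ancestor at the level of interest. The sketch below shows that roll-up on a toy four-level taxonomy. The `TAXONOMY` table, its category names, and the helper `labels_at_level` are hypothetical placeholders, not real ICD or CCS entries and not the benchmark's code.

```python
# Hypothetical 4-level taxonomy: each leaf code maps to its path from the
# top level (index 0) down to the leaf itself (index 3). Names are made up.
TAXONOMY = {
    "code_a": ("circulatory", "heart", "ischemic", "code_a"),
    "code_b": ("circulatory", "heart", "rhythm", "code_b"),
    "code_c": ("circulatory", "vessels", "arterial", "code_c"),
    "code_d": ("respiratory", "lower_tract", "infection", "code_d"),
}


def labels_at_level(leaf_codes, level):
    """Map each patient's leaf code to its ancestor at `level` (0 = top)."""
    return [TAXONOMY[code][level] for code in leaf_codes]


patients = ["code_a", "code_b", "code_c", "code_d"]
for level in range(4):
    labels = labels_at_level(patients, level)
    print(level, len(set(labels)), labels)
```

The number of distinct reference classes grows as the level gets deeper, so the same clustering is scored against progressively finer groupings.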

Experimental Results

A key finding of the paper is that temporal representations learned by recurrent neural networks, particularly LSTMs, outperform statistical methods in many scenarios, highlighting their ability to capture the non-linear temporal patterns inherent in EHR data. Even so, unsupervised stratification remains challenging: v-measure scores reach at most 0.46 at the top level of the taxonomy and at most 0.40 at the lowest level.
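For readers unfamiliar with the metric, v-measure is the harmonic mean of homogeneity (each cluster contains a single class) and completeness (each class falls in a single cluster), both defined through conditional entropy. The following is a minimal standard-library sketch of the standard definition with equal weighting; it is not the benchmark's own evaluation code, and the function names are chosen here for illustration.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy H(labels) in nats."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def conditional_entropy(targets, given):
    """Conditional entropy H(targets | given) in nats."""
    n = len(targets)
    joint = Counter(zip(given, targets))
    given_counts = Counter(given)
    return -sum((c / n) * log(c / given_counts[g]) for (g, _), c in joint.items())


def v_measure(classes, clusters):
    """Harmonic mean of homogeneity and completeness (equal weighting)."""
    h_c, h_k = entropy(classes), entropy(clusters)
    homogeneity = 1.0 if h_c == 0 else 1.0 - conditional_entropy(classes, clusters) / h_c
    completeness = 1.0 if h_k == 0 else 1.0 - conditional_entropy(clusters, classes) / h_k
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)


# Clusters that match the classes up to renaming score 1.
perfect = v_measure([0, 0, 1, 1], [1, 1, 0, 0])
# Clusters that carry no information about the classes score (about) 0.
unrelated = v_measure([0, 1, 0, 1], [0, 0, 1, 1])
print(perfect, unrelated)
```

On this scale a score of 1 means the clusters reproduce the reference classes exactly and a score near 0 means they carry no information about them, which puts the reported maxima of 0.46 and 0.40 in context.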

Further analysis compared three label assignment strategies (centroid, medoid, and majority vote). Majority voting outperformed the others, suggesting that patient cohorts naturally align with the most common diagnostic categories. The authors also discuss avenues for improving finer-grained stratification and potential hybrid methods that build on these frameworks.
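Majority-vote labeling can be sketched in a few lines: each cluster takes the diagnosis that occurs most often among its members. The function name `majority_vote_labels` and the toy diagnoses below are illustrative assumptions, not taken from the paper's repository.

```python
from collections import Counter, defaultdict


def majority_vote_labels(cluster_ids, diagnoses):
    """Label each cluster with the most frequent diagnosis among its members.

    Ties go to the diagnosis seen first within the cluster, because
    Counter.most_common keeps first-encountered order for equal counts.
    """
    members = defaultdict(list)
    for cluster_id, diagnosis in zip(cluster_ids, diagnoses):
        members[cluster_id].append(diagnosis)
    return {cid: Counter(dxs).most_common(1)[0][0] for cid, dxs in members.items()}


# Hypothetical toy data: five patients in two clusters.
cluster_ids = [0, 0, 0, 1, 1]
diagnoses = ["sepsis", "sepsis", "stroke", "stroke", "stroke"]
cluster_labels = majority_vote_labels(cluster_ids, diagnoses)
print(cluster_labels)
```

Unlike centroid or medoid labeling, this strategy ignores the geometry of the embedding space and depends only on cluster membership.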

Implications and Future Directions

The work presented in this paper has significant theoretical implications for unsupervised learning in medicine, demonstrating the capacity of temporal representation models to derive clinically pertinent patient subgroups from unlabeled data. Practically, this can lead to more effective stratification approaches in ICU settings, with potential benefits such as optimized resource allocation and improved patient outcomes.

For future research, the authors propose investigating transformer-based models and contrastive learning to better capture intricate clinical sequences, as well as semi-supervised learning methods that exploit partially labeled data. Extending the framework to multimodal data, such as genomics and imaging, could improve the granularity and accuracy with which subgroups are identified.

The ICU-TSB benchmark lays a foundation for scalable, interpretable patient stratification and encourages further research into time-series representation learning, although such methods will need continued refinement and validation in clinical settings.
