CURENet: Multimodal Chronic Risk Prediction

Updated 21 November 2025

CURENet is a multimodal predictive architecture that integrates unstructured clinical notes, lab test results, and temporal visit data for chronic disease risk prediction.
It employs a fine-tuned large language model and a transformer-based encoder to capture intricate cross-modal relationships, achieving over 94% accuracy in multi-label classification.
Evaluation on MIMIC-III and FEMH datasets demonstrates its superior performance over baselines while highlighting the need for effective multimodal EHR integration.

CURENet is a multimodal predictive architecture designed to efficiently model chronic disease risk using electronic health records (EHRs) that encompass unstructured clinical notes, lab test results, and the time-series patterns of patient visits. By leveraging advances in LLMs alongside transformer encoders specialized for temporal healthcare data, CURENet captures intricate cross-modal relationships crucial for high-fidelity risk estimation in clinical contexts. Evaluated on both public (MIMIC-III) and private (FEMH) datasets, it achieves over 94% accuracy for multi-label prediction of the top 10 chronic conditions, robustly outperforming existing baselines including general and domain-adapted LLMs (Dao et al., 14 Nov 2025).

1. Architectural Framework

CURENet’s architecture comprises a dual-stream patient representation extractor. Unstructured textual data, including concatenated clinical notes and lab test summaries, are processed by a fine-tuned LLM (Medical-LLaMA3-8B). Concurrently, longitudinal visit signals (duration and inter-visit gap) are encoded by a transformer-based time-series visit encoder. The semantic embedding $z_a \in \mathbb{R}^d$ from the LLM and the temporal embedding $z_b \in \mathbb{R}^d$ from the transformer are concatenated and passed through a multilayer perceptron (MLP) to yield a unified patient representation $z$. Disease probabilities $\hat{y}$ are then produced by a sigmoid-activated linear layer in a multi-label setting.

2. Clinical Text and Lab Test Encoding

Clinical Note Processing

Raw EHR text, including Chief Complaint, History of Present Illness, Medical History, and Admission Medications, is concatenated with the templated output of structured lab tests. The resulting sequence $x_{\text{text}}$ is tokenized and fed into Medical-LLaMA3-8B, which is fine-tuned using 4-bit NF4 quantization and LoRA adapters on medical Q&A corpora. The output hidden state $h_a \in \mathbb{R}^{d_{\text{LLM}}}$ is linearly projected into the shared $d$-dimensional embedding space:

$z_a = W_{\text{LLM}} h_a + b_{\text{LLM}}, \quad W_{\text{LLM}} \in \mathbb{R}^{d \times d_{\text{LLM}}}$

Lab Test Representation

Structured lab results are converted to a standardized text template (e.g., “ITEMID<<ITEMID>>: <<VALUE>><<VALUEUOM>>; ...”). Abnormal values are explicitly flagged. The lab test text is concatenated with the clinical notes and embedded by the same LLM pipeline, which keeps the two modalities aligned. A separate lab encoder could in principle be written as $h^{lab}_i = g_{\text{lab}}(x^{lab}_i; W_{lab})$, but CURENet instead uses a single unified encoder and projection.
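The templating step can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the reference-range arguments, the exact wording of the abnormal flag, and the section labels are all assumptions, since the source only states that a template of this shape is used and that abnormal values are flagged.

```python
def format_lab_row(itemid, value, unit, ref_low=None, ref_high=None):
    """Render one lab result as 'ITEMID<id>: <value><unit>' and flag abnormal values."""
    text = f"ITEMID{itemid}: {value}{unit}"
    # The flag wording is hypothetical; the source does not specify its format.
    if ref_low is not None and value < ref_low:
        text += " (ABNORMAL LOW)"
    elif ref_high is not None and value > ref_high:
        text += " (ABNORMAL HIGH)"
    return text


def build_text_input(note_sections, lab_rows):
    """Concatenate clinical note sections with the templated lab text."""
    notes = "\n".join(f"{name}: {body}" for name, body in note_sections.items())
    labs = "; ".join(format_lab_row(*row) for row in lab_rows)
    # 'Lab Tests:' is an illustrative separator, not taken from the source.
    return f"{notes}\nLab Tests: {labs}"


# Illustrative values only (item IDs and reference ranges are made up for the demo).
example = build_text_input(
    {"Chief Complaint": "shortness of breath", "Medical History": "hypertension"},
    [(50912, 2.4, "mg/dL", 0.6, 1.3), (50971, 4.1, "mEq/L", 3.5, 5.1)],
)
print(example)
```

The resulting string plays the role of $x_{\text{text}}$ and would be tokenized and passed to the LLM encoder.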

3. Temporal Signal Encoding

Each patient’s time-series data consists of visit-level signals: visit duration $(d_t - a_t)$ and inter-visit gap $(a_t - d_{t-1})$, with $a_t$ and $d_t$ read here as the admission and discharge times of visit $t$. These are combined as $x_t \in \mathbb{R}^2$, min–max scaled, and mapped linearly into embeddings. Positional encodings are added, after which the sequence is input to a three-layer encoder-only transformer (8 attention heads per layer, model dimension $d=64$, FFN inner dimension 256) employing batch normalization. After the transformer, a global representation $\tilde{z}$ is computed (mean or CLS token) and projected:

$z_b = W_o \tilde{z} + b_o, \quad W_o \in \mathbb{R}^{d \times d}$
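As a rough illustration of how the two visit-level features might be built before they reach the encoder, the sketch below computes duration and gap in days, applies min–max scaling, and pads to 16 visits (the maximum length given in the experimental setup). The helper names, the unit (days), the zero gap for the first visit, and the choice to keep the most recent visits when truncating are assumptions, not details from the source.

```python
from datetime import datetime

MAX_VISITS = 16  # maximum sequence length stated in the experimental setup


def visit_features(visits):
    """visits: list of (admit, discharge) datetimes sorted by admission time.
    Returns [duration_days, gap_days] per visit."""
    feats, prev_discharge = [], None
    for admit, discharge in visits:
        duration = (discharge - admit).total_seconds() / 86400.0
        # Assumption: the first visit has no predecessor, so its gap is set to 0.
        gap = 0.0 if prev_discharge is None else (admit - prev_discharge).total_seconds() / 86400.0
        feats.append([duration, gap])
        prev_discharge = discharge
    return feats


def fit_min_max(rows):
    """Column-wise minima and maxima (in practice fitted on training data only)."""
    cols = list(zip(*rows))
    return [min(c) for c in cols], [max(c) for c in cols]


def apply_min_max(rows, lo, hi):
    """Scale each column to [0, 1]; constant columns map to 0."""
    return [[(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
            for row in rows]


def pad_sequence(rows, max_len=MAX_VISITS):
    """Keep the last max_len visits, right-pad with zeros, and return a validity mask."""
    rows = rows[-max_len:]
    n_pad = max_len - len(rows)
    mask = [1] * len(rows) + [0] * n_pad
    return rows + [[0.0, 0.0] for _ in range(n_pad)], mask


visits = [(datetime(2020, 1, 1), datetime(2020, 1, 3)),
          (datetime(2020, 1, 10), datetime(2020, 1, 11))]
raw = visit_features(visits)
lo, hi = fit_min_max(raw)
x, mask = pad_sequence(apply_min_max(raw, lo, hi))
```

Each row of `x` corresponds to one $x_t \in \mathbb{R}^2$; the mask marks which positions are real visits rather than padding.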

The transformer self-attention mechanism follows $\text{Attention}(Q, K, V) = \mathrm{softmax}(Q K^{\top}/\sqrt{d_k}) V$, and outputs are aggregated via feedforward blocks with non-linear activation.
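The attention formula above can be written out directly. The following is a dependency-free sketch of single-head scaled dot-product attention, intended only to make the formula concrete; it is not the encoder used in CURENet. Under the usual multi-head convention, 8 heads with $d = 64$ would give $d_k = 8$ per head, but the source does not state $d_k$, so that figure is an assumption.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.
    Q, K, V are lists of row vectors; K and V must have the same number of rows."""
    d_k = len(K[0])
    out = []
    for q in Q:
        # Similarity of this query with every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out.append([sum(w * v[c] for w, v in zip(weights, V)) for c in range(len(V[0]))])
    return out


# A zero query attends uniformly, so the output is the mean of the value rows.
result = attention([[0.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 2.0]])
```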

4. Multimodal Fusion and Prediction

The patient representation is the concatenation $z_{\text{in}} = [z_a \| z_b] \in \mathbb{R}^{2d}$, fused via a two-layer MLP with ReLU activations into $z \in \mathbb{R}^d$. This is used as input to a sigmoid layer producing disease probabilities for each of $K=10$ chronic conditions:

$\hat{y}_i^{(k)} = \sigma(w_k^\top z + b_k), \quad k = 1,\dots, 10$
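A shape-level sketch of the fusion and prediction head is given below, using plain Python lists so that it runs without any numerical library. The hidden width, the random initialization, the small demo dimension $d = 4$, and the placement of a ReLU after the second MLP layer are assumptions; the source specifies only a two-layer MLP with ReLU activations followed by a sigmoid layer over $K = 10$ labels.

```python
import math
import random


def linear(W, b, x):
    """Affine map W x + b for a weight matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def relu(x):
    return [max(0.0, v) for v in x]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (illustrative initialization only)."""
    W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out


def init_params(d, hidden, num_labels, rng):
    return {
        "fc1": init_layer(hidden, 2 * d, rng),   # R^{2d} -> R^{hidden}
        "fc2": init_layer(d, hidden, rng),       # R^{hidden} -> R^{d}
        "out": init_layer(num_labels, d, rng),   # R^{d} -> R^{K}
    }


def predict(z_a, z_b, params):
    """Concatenate both embeddings, fuse with a two-layer MLP, apply per-label sigmoid."""
    z_in = z_a + z_b                      # list concatenation = [z_a || z_b]
    W1, b1 = params["fc1"]
    W2, b2 = params["fc2"]
    Wo, bo = params["out"]
    z = relu(linear(W2, b2, relu(linear(W1, b1, z_in))))
    return [sigmoid(v) for v in linear(Wo, bo, z)]


rng = random.Random(0)
d = 4                                     # demo size; the temporal encoder above uses d = 64
params = init_params(d, hidden=8, num_labels=10, rng=rng)
z_a = [rng.uniform(-1, 1) for _ in range(d)]
z_b = [rng.uniform(-1, 1) for _ in range(d)]
probs = predict(z_a, z_b, params)
```

Because each label has its own sigmoid, the ten probabilities are independent and need not sum to one, which is what makes the head multi-label rather than multi-class.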

The loss function is a convex combination of binary cross-entropy and multilabel hinge ranking:

$\mathcal{L} = \alpha \mathcal{L}_{\mathrm{BCE}} + (1 - \alpha)\mathcal{L}_{\mathrm{hinge}}, \quad \alpha = 0.95$

where

$\mathcal{L}_{\mathrm{BCE}} = -\frac{1}{N} \sum_{i=1}^N \sum_{k=1}^K \left[y_i^{(k)} \log \hat{y}_i^{(k)} + (1 - y_i^{(k)}) \log (1 - \hat{y}_i^{(k)})\right]$

and

$\mathcal{L}_{\mathrm{hinge}} = \frac{1}{N} \sum_{i=1}^N \sum_{p: y_i^{(p)} = 1,\ q: y_i^{(q)} = 0} \max\left(0, 1 - (\hat{y}_i^{(p)} - \hat{y}_i^{(q)})\right)$
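The combined objective can be computed term by term as in the sketch below, which follows the two formulas above with $\alpha = 0.95$ and a margin of 1. The function names and the clipping constant `eps` (added to avoid $\log 0$) are assumptions made for the illustration.

```python
import math

ALPHA = 0.95  # weight on the BCE term, as stated above


def bce(y_true, y_prob, eps=1e-12):
    """Binary cross-entropy summed over labels and averaged over the N samples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        for yk, pk in zip(y, p):
            pk = min(max(pk, eps), 1.0 - eps)  # clip to keep log finite
            total -= yk * math.log(pk) + (1 - yk) * math.log(1.0 - pk)
    return total / len(y_true)


def hinge_rank(y_true, y_prob, margin=1.0):
    """Pairwise hinge ranking: every positive label should outscore every negative one."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        pos = [pk for yk, pk in zip(y, p) if yk == 1]
        neg = [pk for yk, pk in zip(y, p) if yk == 0]
        total += sum(max(0.0, margin - (pp - pq)) for pp in pos for pq in neg)
    return total / len(y_true)


def combined_loss(y_true, y_prob, alpha=ALPHA):
    return alpha * bce(y_true, y_prob) + (1.0 - alpha) * hinge_rank(y_true, y_prob)


# One sample, two labels, uninformative predictions.
y = [[1, 0]]
p = [[0.5, 0.5]]
loss = combined_loss(y, p)  # 0.95 * 2*ln(2) + 0.05 * 1
```

Since the scores are probabilities in $[0, 1]$, the difference $\hat{y}^{(p)} - \hat{y}^{(q)}$ never exceeds 1, so with a margin of 1 each pairwise term is zero only when the positive label is scored 1 and the negative label 0.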

5. Experimental Setup and Performance Metrics

CURENet is evaluated on the MIMIC-III (public) and FEMH (private Taiwan hospital) datasets, with patients selected for having at least two visits. An 80:20 train/test split at the patient level is used to avoid leakage. All clinical notes are de-identified; lab values standardized as described; time-series features padded to a maximum length of 16.
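A patient-level split of the kind described above can be sketched as follows. The record layout (one dictionary per visit with a `patient_id` key), the function names, and the fixed seed are assumptions for the example; the source specifies only the two-visit inclusion criterion and the 80:20 split by patient.

```python
import random
from collections import Counter


def filter_min_visits(records, min_visits=2):
    """Keep only records of patients with at least min_visits visits
    (assumes one record per visit)."""
    counts = Counter(r["patient_id"] for r in records)
    return [r for r in records if counts[r["patient_id"]] >= min_visits]


def patient_level_split(records, test_frac=0.2, seed=0):
    """Split by patient ID so that no patient appears in both train and test."""
    ids = sorted({r["patient_id"] for r in records})
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_test = max(1, round(len(ids) * test_frac))
    test_ids = set(ids[:n_test])
    train = [r for r in records if r["patient_id"] not in test_ids]
    test = [r for r in records if r["patient_id"] in test_ids]
    return train, test


# Ten synthetic patients with two visits each, plus one single-visit patient.
records = [{"patient_id": pid, "visit": v} for pid in range(10) for v in range(2)]
records.append({"patient_id": 99, "visit": 0})
eligible = filter_min_visits(records)
train, test = patient_level_split(eligible)
```

Splitting on patient IDs rather than on individual visits is what prevents leakage: all visits of a given patient land on the same side of the split.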

Metrics employed are: accuracy, precision ($P = \frac{TP}{TP + FP}$), recall ($R = \frac{TP}{TP + FN}$), macro-F1 (mean across classes), weighted F1 (support-weighted), Recall@$k$, and NDCG@$k$ for top-$k$ ranking. Recall@$k$ is defined as

$\mathrm{Recall@}k = \frac{1}{|\mathcal{P}|} \sum_{i \in \mathcal{P}} \frac{|D_i \cap \hat{D}_i^{(k)}|}{|D_i|}$

and NDCG@$k$ follows standard DCG/IDCG conventions.
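Both ranking metrics can be computed from per-patient score vectors as sketched below. The sketch assumes binary relevance with the common $1/\log_2(\text{rank}+1)$ discount for NDCG, and it skips patients with no positive labels; the source does not spell out either convention, so both are assumptions, as are the function names.

```python
import math


def top_k(scores, k):
    """Indices of the k highest-scoring labels."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


def recall_at_k(true_sets, score_rows, k):
    """Mean over patients of |D_i ∩ top-k predictions| / |D_i|."""
    vals = []
    for D, s in zip(true_sets, score_rows):
        if not D:
            continue  # assumption: patients without positive labels are skipped
        vals.append(len(D & set(top_k(s, k))) / len(D))
    return sum(vals) / len(vals)


def ndcg_at_k(true_sets, score_rows, k):
    """Binary-relevance NDCG@k with a 1 / log2(rank + 1) discount (ranks start at 1)."""
    vals = []
    for D, s in zip(true_sets, score_rows):
        if not D:
            continue
        dcg = sum(1.0 / math.log2(r + 2) for r, j in enumerate(top_k(s, k)) if j in D)
        idcg = sum(1.0 / math.log2(r + 2) for r in range(min(k, len(D))))
        vals.append(dcg / idcg)
    return sum(vals) / len(vals)


# One patient with true labels {0, 2}; the model ranks labels 0, 1, 2 in that order.
true_sets = [{0, 2}]
scores = [[0.9, 0.8, 0.1]]
r2 = recall_at_k(true_sets, scores, k=2)   # one of two true labels in the top 2
n2 = ndcg_at_k(true_sets, scores, k=2)
```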

6. Result Highlights and Ablation Analysis

CURENet demonstrates consistent superiority over previously established baselines, including BERT, Llama-2, Mistral, and LoRA-adapted Medical-LLaMA3-8B, across all key multi-label metrics.

Dataset	Accuracy Gain	Macro-F1 Gain	Notable Ablation Impact
MIMIC-III	+1.5 pp	+1.8 pp	Without clinical notes: -19.2 pp macro-F1, -10.8 pp accuracy
FEMH	+1.7 pp	+2.0 pp	Without lab text: -4.3 pp macro-F1, -5.0 pp accuracy

All gains and drops are given in percentage points (pp).

Omitting the clinical notes or the lab-text input causes substantial performance drops (e.g., on MIMIC-III, removing clinical notes lowers macro-F1 from 85.5% to 66.3% and accuracy from 91.7% to 80.9%), indicating that cross-modal EHR integration is necessary. Similar drops are reported for heart-failure prediction, which supports the generalizability of this finding. Systematic gains are also observed in the Recall@$k$ and NDCG@$k$ curves for $k = 1$ to $5$.

7. Discussion, Implications, and Limitations

CURENet’s design enables deep cross-modal fusion between semantic content (via the LLM) and irregularly sampled temporal signals (via the transformer encoder), avoiding the pitfalls of naive concatenation. Explicit modeling of visit duration and inter-visit gap is judged critical for capturing the temporal patterns of chronic disease trajectories. The hybrid loss function improves both per-label calibration and margin-based disease ranking.

Limitations include the use of data from only two hospital systems, the lack of imaging or continuous vital-sign modalities, and the need for further prospective validation. Real-world deployment is contingent on the development of privacy-preserving techniques and richer model explainability (such as attention heatmaps). Potential future directions include the addition of concept-bottleneck modules, integration of imaging and genomic modalities, and exploration of federated or differential-privacy-based training, which may extend the model’s applicability and trust in clinical settings. Overall, the results suggest that CURENet is a robust framework for multimodal EHR modeling of chronic disease risk, with promising avenues for further research and validation (Dao et al., 14 Nov 2025).

References

Dao et al. (14 Nov 2025). CURENet: Combining Unified Representations for Efficient Chronic Disease Prediction.
