
CURENet: Multimodal Chronic Risk Prediction

Updated 21 November 2025
  • CURENet is a multimodal predictive architecture that integrates unstructured clinical notes, lab test results, and temporal visit data for chronic disease risk prediction.
  • It employs a fine-tuned large language model and a transformer-based encoder to capture intricate cross-modal relationships, achieving over 94% accuracy in multi-label classification.
  • Evaluation on MIMIC-III and FEMH datasets demonstrates its superior performance over baselines while highlighting the need for effective multimodal EHR integration.

CURENet is a multimodal predictive architecture designed to efficiently model chronic disease risk using electronic health records (EHRs) that encompass unstructured clinical notes, lab test results, and the time-series patterns of patient visits. By leveraging advances in LLMs alongside transformer encoders specialized for temporal healthcare data, CURENet captures intricate cross-modal relationships crucial for high-fidelity risk estimation in clinical contexts. Evaluated on both public (MIMIC-III) and private (FEMH) datasets, it achieves over 94% accuracy for multi-label prediction of the top 10 chronic conditions, robustly outperforming existing baselines including general and domain-adapted LLMs (Dao et al., 14 Nov 2025).

1. Architectural Framework

CURENet’s architecture comprises a dual-stream patient representation extractor. Unstructured textual data, including concatenated clinical notes and lab test summaries, are processed via a fine-tuned LLM (Medical-LLaMA3-8B). Concurrently, longitudinal visit signals (duration and inter-visit gap) are encoded through a transformer-based time-series visit encoder. The semantic embedding $z_a \in \mathbb{R}^d$ from the LLM and the temporal embedding $z_b \in \mathbb{R}^d$ from the transformer are concatenated and passed through a multilayer perceptron (MLP) to yield a unified patient representation $z$. Disease probabilities $\hat{y}$ are finally produced via a sigmoid-activated linear layer in a multi-label inference framework.

2. Clinical Text and Lab Test Encoding

Clinical Note Processing

Raw EHR text, including Chief Complaint, History of Present Illness, Medical History, and Admission Medications, is concatenated with the templated output of structured lab tests. The sequence $x_{\text{text}}$ is tokenized and fed into Medical-LLaMA3-8B, which is fine-tuned using 4-bit NF4 quantization and LoRA adapters on medical Q&A corpora. The output hidden state $h_a \in \mathbb{R}^{d_{\text{LLM}}}$ is linearly projected to the shared $d$-dimensional embedding space:

$$z_a = W_{\text{LLM}} h_a + b_{\text{LLM}}, \quad W_{\text{LLM}} \in \mathbb{R}^{d \times d_{\text{LLM}}}$$

Lab Test Representation

Structured lab results are converted to a standardized text template (e.g., “ITEMID<<ITEMID>>: <<VALUE>><<VALUEUOM>>; ...”). Abnormal values are explicitly flagged. The lab test text is concatenated with clinical notes and embedded using the same LLM pipeline, ensuring modality alignment. If desired, lab embeddings can be written as $h^{lab}_i = g_{\text{lab}}(x^{lab}_i; W_{lab})$, but CURENet utilizes a unified encoder and projection.
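The templating step can be illustrated with a minimal sketch. The exact separator, spacing, and abnormal-value marker are not specified above, so the `ITEMID <id>: <value> <unit>` layout, the `(abnormal)` suffix, and the function names `render_lab_row`, `render_labs`, and `build_llm_input` are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# One lab result: (itemid, value, unit, flag). The field layout is an assumption.
LabRow = Tuple[int, float, str, Optional[str]]

def render_lab_row(itemid: int, value: float, unit: str, flag: Optional[str] = None) -> str:
    """Render one lab result as text; the '(abnormal)' marker is a hypothetical format."""
    text = f"ITEMID {itemid}: {value} {unit}"
    if flag == "abnormal":
        text += " (abnormal)"
    return text

def render_labs(rows: List[LabRow]) -> str:
    """Join all rendered lab results into a single templated string."""
    return "; ".join(render_lab_row(*row) for row in rows)

def build_llm_input(note_sections: List[str], rows: List[LabRow]) -> str:
    """Concatenate clinical note sections with the templated lab text."""
    return "\n".join(note_sections + ["Lab tests: " + render_labs(rows)])

# Illustrative rows only; these are not values from the paper.
labs = [(50912, 1.8, "mg/dL", "abnormal"), (50971, 4.1, "mEq/L", None)]
print(build_llm_input(["Chief Complaint: shortness of breath"], labs))
```

Rendering labs as text keeps a single tokenizer and encoder for both modalities, which is the alignment argument made in the paragraph above.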

3. Temporal Signal Encoding

Each patient’s time-series data consists of visit-level signals: visit duration $(d_t - a_t)$ and inter-visit gap $(a_t - d_{t-1})$. These are combined as $x_t \in \mathbb{R}^2$, min–max scaled, and mapped linearly into embeddings. Positional encodings are added, after which the sequence is input to a three-layer encoder-only transformer (8 attention heads per layer, model dimension $d=64$, FFN inner dimension 256) employing batch normalization. After the transformer, a global representation $\tilde{z}$ is computed (mean or CLS token) and projected:

$$z_b = W_o \tilde{z} + b_o, \quad W_o \in \mathbb{R}^{d \times d}$$
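The visit-level inputs to this encoder can be sketched as follows. The sketch assumes that $a_t$ and $d_t$ are the admission and discharge times of visit $t$ (in days), that the first visit has a gap of 0, and that scaling is done per sequence; in practice the scaling statistics would more plausibly come from the training set. The padding length of 16 is the one given in the experimental setup, and all function names are hypothetical.

```python
MAX_VISITS = 16  # padding length stated in the experimental setup

def visit_features(visits):
    """visits: list of (admit, discharge) times in days, sorted by admission.
    Returns [duration, gap] per visit; the first gap is set to 0 (assumption)."""
    feats = []
    prev_discharge = None
    for admit, discharge in visits:
        duration = discharge - admit
        gap = 0.0 if prev_discharge is None else admit - prev_discharge
        feats.append([float(duration), float(gap)])
        prev_discharge = discharge
    return feats

def min_max_scale(feats):
    """Scale each feature column to [0, 1]; constant columns map to 0."""
    cols = list(zip(*feats))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in feats
    ]

def pad(feats, max_len=MAX_VISITS):
    """Keep the most recent max_len visits, right-pad with zeros, and return a mask."""
    feats = feats[-max_len:]
    n_pad = max_len - len(feats)
    mask = [1] * len(feats) + [0] * n_pad
    return feats + [[0.0, 0.0] for _ in range(n_pad)], mask

visits = [(0, 5), (35, 37), (100, 110)]
raw = visit_features(visits)            # [[5.0, 0.0], [2.0, 30.0], [10.0, 63.0]]
padded, mask = pad(min_max_scale(raw))  # 16 rows of 2 features plus a validity mask
```

The mask lets a downstream encoder ignore padded positions when pooling or attending.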

The transformer self-attention mechanism follows $\text{Attention}(Q, K, V) = \mathrm{softmax}(Q K^{\top}/\sqrt{d_k}) V$, and outputs are aggregated via feedforward blocks with non-linear activation.
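As a worked illustration of the attention formula, here is a single-head scaled dot-product attention in plain Python. It omits the learned query/key/value projections, multi-head splitting, and padding masks that a full encoder layer would use, so it is a sketch of the formula only, not of CURENet's encoder.

```python
import math

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def attention(q, k, v):
    """softmax(Q K^T / sqrt(d_k)) V; returns (output, attention weights)."""
    d_k = len(k[0])
    scores = matmul(q, transpose(k))
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights

# Two identical keys receive equal weight, so the output is the mean of the values.
out, w = attention([[1.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]], [[1.0, 2.0], [3.0, 4.0]])
```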

4. Multimodal Fusion and Prediction

The patient representation is the concatenation $z_{\text{in}} = [z_a \| z_b] \in \mathbb{R}^{2d}$, fused via a two-layer MLP with ReLU activations into $z \in \mathbb{R}^d$. This is used as input to a sigmoid layer producing disease probabilities for each of $K=10$ chronic conditions:

$$\hat{y}_i^{(k)} = \sigma(w_k^\top z + b_k), \quad k = 1,\dots, 10$$
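The fusion and prediction steps can be sketched with randomly initialized weights. Here $d = 64$ is taken from the temporal encoder and $K = 10$ from the text, while the hidden width, the initialization range, and the ReLU after the second MLP layer are assumptions; the helper names are hypothetical.

```python
import math
import random

D, HIDDEN, K = 64, 64, 10  # d and K from the text; HIDDEN is an assumed width

def init_linear(rng, out_dim, in_dim):
    """Small random weights and zero bias (illustrative initialization only)."""
    w = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    return w, [0.0] * out_dim

def linear(x, layer):
    w, b = layer
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]

def relu(x):
    return [max(0.0, v) for v in x]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def init_params(seed=0):
    rng = random.Random(seed)
    return {
        "fc1": init_linear(rng, HIDDEN, 2 * D),
        "fc2": init_linear(rng, D, HIDDEN),
        "head": init_linear(rng, K, D),
    }

def fuse_and_predict(z_a, z_b, params):
    """Concatenate both embeddings, fuse with a two-layer MLP, apply the sigmoid head."""
    z_in = z_a + z_b                      # list concatenation: [z_a || z_b] in R^{2d}
    h = relu(linear(z_in, params["fc1"]))
    z = relu(linear(h, params["fc2"]))    # unified representation z in R^d
    return [sigmoid(v) for v in linear(z, params["head"])]

rng = random.Random(1)
z_a = [rng.uniform(-1, 1) for _ in range(D)]  # stand-in for the LLM embedding
z_b = [rng.uniform(-1, 1) for _ in range(D)]  # stand-in for the temporal embedding
probs = fuse_and_predict(z_a, z_b, init_params())
```

Each of the 10 outputs is an independent probability, so several conditions can be predicted for the same patient.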

The loss function is a convex combination of binary cross-entropy and multilabel hinge ranking:

$$\mathcal{L} = \alpha \mathcal{L}_{\mathrm{BCE}} + (1 - \alpha)\mathcal{L}_{\mathrm{hinge}}, \quad \alpha = 0.95$$

where

$$\mathcal{L}_{\mathrm{BCE}} = -\frac{1}{N} \sum_{i=1}^N \sum_{k=1}^K \left[y_i^{(k)} \log \hat{y}_i^{(k)} + (1 - y_i^{(k)}) \log (1 - \hat{y}_i^{(k)})\right]$$

and

$$\mathcal{L}_{\mathrm{hinge}} = \frac{1}{N} \sum_{i=1}^N \sum_{p: y_i^{(p)} = 1,\ q: y_i^{(q)} = 0} \max\left(0, 1 - (\hat{y}_i^{(p)} - \hat{y}_i^{(q)})\right)$$
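The combined objective can be written out directly from the two definitions. The sketch below follows the formulas as stated (sum over labels, mean over patients, unit margin, $\alpha = 0.95$); the clipping constant `eps` is an added numerical safeguard and not part of the source.

```python
import math

def bce(y_true, y_prob, eps=1e-12):
    """Binary cross-entropy: summed over labels, averaged over patients."""
    total = 0.0
    for labels, probs in zip(y_true, y_prob):
        for t, p in zip(labels, probs):
            p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0); an added safeguard
            total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(y_true)

def hinge_rank(y_true, y_prob, margin=1.0):
    """Pairwise hinge over every (positive, negative) label pair of each patient."""
    total = 0.0
    for labels, probs in zip(y_true, y_prob):
        pos = [p for t, p in zip(labels, probs) if t == 1]
        neg = [p for t, p in zip(labels, probs) if t == 0]
        total += sum(max(0.0, margin - (p - q)) for p in pos for q in neg)
    return total / len(y_true)

def hybrid_loss(y_true, y_prob, alpha=0.95):
    """Convex combination alpha * BCE + (1 - alpha) * hinge."""
    return alpha * bce(y_true, y_prob) + (1.0 - alpha) * hinge_rank(y_true, y_prob)

y_true = [[1, 0]]
y_prob = [[0.9, 0.2]]
loss = hybrid_loss(y_true, y_prob)  # 0.95 * BCE + 0.05 * hinge for this one patient
```

Because each predicted probability lies strictly between 0 and 1, the difference between a positive and a negative score is always below 1, so with the unit margin as written every positive/negative pair contributes a nonzero hinge term.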

5. Experimental Setup and Performance Metrics

CURENet is evaluated on the MIMIC-III (public) and FEMH (private Taiwan hospital) datasets, with patients selected for having at least two visits. An 80:20 train/test split at the patient level is used to avoid leakage. All clinical notes are de-identified, lab values are standardized as described, and time-series features are padded to a maximum length of 16.
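A patient-level split of this kind can be sketched as follows. The record layout (a dict with a `patient_id` key), the seed, and the function names are assumptions; the point is only that patients, not individual visits, are assigned to train or test, so no patient appears on both sides.

```python
import random
from collections import Counter

def eligible(records, min_visits=2):
    """Keep only records of patients with at least min_visits visits."""
    counts = Counter(r["patient_id"] for r in records)
    return [r for r in records if counts[r["patient_id"]] >= min_visits]

def patient_level_split(records, test_frac=0.2, seed=0):
    """Shuffle patient ids and assign whole patients to the train or test side."""
    patient_ids = sorted({r["patient_id"] for r in records})
    rng = random.Random(seed)
    rng.shuffle(patient_ids)
    n_test = max(1, round(len(patient_ids) * test_frac))
    test_ids = set(patient_ids[:n_test])
    train = [r for r in records if r["patient_id"] not in test_ids]
    test = [r for r in records if r["patient_id"] in test_ids]
    return train, test

# Ten patients with two visits each, plus one single-visit patient that is filtered out.
records = [{"patient_id": p, "visit": v} for p in range(10) for v in range(2)]
records.append({"patient_id": 99, "visit": 0})
train, test = patient_level_split(eligible(records))
```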

Metrics employed are: accuracy, precision ($P = \frac{TP}{TP + FP}$), recall ($R = \frac{TP}{TP + FN}$), macro-F1 (mean across classes), weighted F1 (support-weighted), Recall@$k$, and NDCG@$k$ for top-$k$ ranking. Recall@$k$ is defined as

$$\mathrm{Recall@}k = \frac{1}{|\mathcal{P}|} \sum_{i \in \mathcal{P}} \frac{|D_i \cap \hat{D}_i^{(k)}|}{|D_i|}$$

and NDCG@$k$ follows standard DCG/IDCG conventions.
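Both ranking metrics can be computed in a few lines. The sketch assumes binary relevance and a $\log_2$ rank discount for NDCG, which is a common convention but is not spelled out above, and it skips patients with no true conditions to avoid division by zero (also an assumption).

```python
import math

def top_k(scores, k):
    """Indices of the k highest-scoring conditions."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]

def recall_at_k(true_sets, score_rows, k):
    """Mean over patients of |D_i intersect top-k| / |D_i|."""
    vals = [len(d & set(top_k(s, k))) / len(d) for d, s in zip(true_sets, score_rows) if d]
    return sum(vals) / len(vals)

def ndcg_at_k(true_sets, score_rows, k):
    """Binary-relevance NDCG@k with a log2 rank discount (assumed convention)."""
    vals = []
    for d, s in zip(true_sets, score_rows):
        if not d:
            continue
        dcg = sum(1.0 / math.log2(r + 2) for r, j in enumerate(top_k(s, k)) if j in d)
        idcg = sum(1.0 / math.log2(r + 2) for r in range(min(k, len(d))))
        vals.append(dcg / idcg)
    return sum(vals) / len(vals)

# One patient with true conditions {0, 2}; the model ranks condition 0 first, 2 last.
true_sets = [{0, 2}]
scores = [[0.9, 0.8, 0.1]]
r2 = recall_at_k(true_sets, scores, 2)  # one of the two true conditions is in the top 2
n2 = ndcg_at_k(true_sets, scores, 2)
```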

6. Result Highlights and Ablation Analysis

CURENet demonstrates consistent superiority over previously established baselines, including BERT, Llama-2, Mistral, and LoRA-adapted Medical-LLaMA3-8B, across all key multi-label metrics.

| Dataset | Accuracy Gain | F1 Macro Gain | Notable Ablation Impact |
|---|---|---|---|
| MIMIC-III | +1.5 pp | +1.8 pp | Clinical notes removed: -19.2 pp F1 macro, -10.8 pp accuracy |
| FEMH | +1.7 pp | +2.0 pp | Lab-text removed: -4.3 pp F1 macro, -5.0 pp accuracy |

Omitting clinical text or lab-text input causes substantial performance drops (e.g., removing clinical notes on MIMIC-III lowers F1 macro from 85.5% to 66.3% and accuracy from 91.7% to 80.9%), indicating the necessity of cross-modal EHR integration. Similar losses in heart-failure prediction support the generalizability claims. Systematic gains are observed in Recall@$k$ and NDCG@$k$ curves for $k = 1$ to $5$.

7. Discussion, Implications, and Limitations

CURENet’s design enables deep cross-modal fusion between semantic content (via LLM) and irregularly sampled temporal signals (via transformer encoder), avoiding the pitfalls of naive concatenation. Explicit modeling of visit duration and gap is judged critical for temporal patterns of chronic disease trajectories. The hybrid loss function improves both per-label calibration and margin-based disease ranking.

Limitations include the use of data from only two hospital systems, lack of imaging or continuous vital sign modalities, and the need for further prospective validation. Real-world deployment is contingent on the development of privacy-preserving techniques and richer model explainability (such as attention heatmaps). Potential future directions include the addition of concept-bottleneck modules, integration of imaging/genomic modalities, and exploration of federated or differential-privacy based training, which may extend the model’s applicability and trust in clinical settings. This suggests CURENet serves as a robust framework for multimodal EHR modeling in chronic disease risk, with promising avenues for further research and validation (Dao et al., 14 Nov 2025).
