TCKAN: A Novel Integrated Network Model for Predicting Mortality Risk in Sepsis Patients (2407.06560v2)

Published 9 Jul 2024 in stat.AP and cs.AI

Abstract: Sepsis poses a major global health threat, accounting for millions of deaths annually and significant economic costs. Accurately predicting the risk of mortality in sepsis patients enables early identification, promotes the efficient allocation of medical resources, and facilitates timely interventions, thereby improving patient outcomes. Current methods typically utilize only one type of data--either constant, temporal, or ICD codes. This study introduces a novel approach, the Time-Constant Kolmogorov-Arnold Network (TCKAN), which uniquely integrates temporal data, constant data, and ICD codes within a single predictive model. Unlike existing methods that typically rely on one type of data, TCKAN leverages a multi-modal data integration strategy, resulting in superior predictive accuracy and robustness in identifying high-risk sepsis patients. Validated against the MIMIC-III and MIMIC-IV datasets, TCKAN surpasses existing machine learning and deep learning methods in accuracy, sensitivity, and specificity. Notably, TCKAN achieved AUCs of 87.76% and 88.07%, demonstrating superior capability in identifying high-risk patients. Additionally, TCKAN effectively combats the prevalent issue of data imbalance in clinical settings, improving the detection of patients at elevated risk of mortality and facilitating timely interventions. These results confirm the model's effectiveness and its potential to transform patient management and treatment optimization in clinical practice. Although the TCKAN model has already incorporated temporal, constant, and ICD code data, future research could include more diverse medical data types, such as imaging and laboratory test results, to achieve a more comprehensive data integration and further improve predictive accuracy.

The paper introduces the Time-Constant KAN Integrated Network (TCKIN; the title and abstract of this version call the model TCKAN), a model for predicting mortality risk in sepsis patients in intensive care units (ICUs). TCKIN integrates temporal data, constant data, and diagnostic International Classification of Diseases (ICD) codes to improve the accuracy of sepsis mortality risk predictions. The paper validates the model on the Medical Information Mart for Intensive Care III (MIMIC-III) and IV (MIMIC-IV) datasets, where it outperforms existing machine learning and deep learning methods in accuracy, sensitivity, and specificity.

The central contributions of the paper are:

* Integration of diagnostic ICD codes with the Clinical Classifications Software (CCS) medical ontology using graph networks to leverage coding systems fully.
* Fusion of temporal data, handled via the Gated Recurrent Unit with Decay (GRU-D), with constant data, analyzed by the Kolmogorov-Arnold Networks (KAN), to improve predictive capabilities.
* Demonstration of superior performance in prediction accuracy and robustness through the integration of multiple data sources and processing techniques.

The paper uses the MIMIC-III and MIMIC-IV datasets, which contain detailed health records from ICUs at Beth Israel Deaconess Medical Center. Preprocessing selects sepsis patients according to the Sepsis-3 definition and excludes patients under 16 years of age, patients with corrupted data, and patients whose ICU stay was shorter than 24 hours. Three kinds of data were extracted: constant data (demographic information), temporal data (physiological signs and laboratory test results), and diagnostic ICD codes. The ICD codes were converted into CCS code sequences to simplify complex diagnostic information.
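
The exclusion criteria above can be sketched as a simple filter. This is only an illustration: the `IcuStay` record and its field names are hypothetical and do not correspond to actual MIMIC table columns, and the Sepsis-3 check is assumed to have been computed elsewhere.

```python
from dataclasses import dataclass


@dataclass
class IcuStay:
    # Hypothetical record layout; these are not MIMIC column names.
    age_years: float
    icu_hours: float
    meets_sepsis3: bool
    data_corrupted: bool


def include_in_cohort(stay: IcuStay) -> bool:
    """Keep Sepsis-3 patients aged 16+ with clean data and an ICU stay of at least 24 h."""
    return (
        stay.meets_sepsis3
        and stay.age_years >= 16
        and not stay.data_corrupted
        and stay.icu_hours >= 24
    )


stays = [
    IcuStay(age_years=67, icu_hours=72, meets_sepsis3=True, data_corrupted=False),
    IcuStay(age_years=15, icu_hours=80, meets_sepsis3=True, data_corrupted=False),   # too young
    IcuStay(age_years=54, icu_hours=12, meets_sepsis3=True, data_corrupted=False),   # stay too short
    IcuStay(age_years=70, icu_hours=48, meets_sepsis3=False, data_corrupted=False),  # not sepsis
]
cohort = [s for s in stays if include_in_cohort(s)]
```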

The TCKIN model architecture comprises three primary components:

A GRU-D model processes temporal data to generate hidden representations. The temporal input is formulated as:

$D = (D_1, D_2, \dots, D_t, \dots, D_T) \in \mathbb{R}^{T \times N}$

* $D$ is the temporal data
* $D_t$ is the temporal data at time $t$
* $T$ is the number of time steps
* $N$ is the number of features

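The GRU-D network manages missing values using a mask $I$, which marks the entries that were actually observed, and time intervals $\Delta$, which record how long ago each feature was last observed. The decay mechanism adjusts imputation values based on these intervals, as shown in the following equations: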

$\gamma_t = \exp\left(-\max(0, W_{\Delta} \Delta_t + b_{\Delta})\right)$

* $\gamma_t$ is the decay factor
* $W_{\Delta}$ is the weight matrix
* $\Delta_t$ is the time interval
* $b_{\Delta}$ is the bias term
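
A minimal sketch of the decay computation for a single feature, using scalar stand-ins `w` and `b` for $W_{\Delta}$ and $b_{\Delta}$. The way $\Delta_t$ is accumulated here follows the original GRU-D formulation (the gap keeps growing while the previous step was missing); the summary does not state whether the paper computes it in exactly this way, so treat that part as an assumption. The timestamps, mask, and parameter values are toy numbers.

```python
import math


def time_since_last_observation(timestamps, mask):
    """Per-step gap Delta_t for one feature (assumed GRU-D convention).

    mask[t] is 1 if the feature was observed at step t, else 0.
    """
    deltas = [0.0]
    for t in range(1, len(timestamps)):
        step = timestamps[t] - timestamps[t - 1]
        # If the previous step was missing, the gap keeps accumulating.
        deltas.append(step if mask[t - 1] == 1 else step + deltas[t - 1])
    return deltas


def decay_factor(delta, w, b):
    """gamma_t = exp(-max(0, w * delta + b)) for a single feature."""
    return math.exp(-max(0.0, w * delta + b))


timestamps = [0.0, 1.0, 2.0, 3.0]  # hours since ICU admission (toy values)
mask = [1, 0, 0, 1]                # observed, missing, missing, observed
deltas = time_since_last_observation(timestamps, mask)
gammas = [decay_factor(d, w=0.5, b=0.0) for d in deltas]
```

With these toy values the decay factor starts at 1 and shrinks as the gap since the last observation grows, so older observations contribute less.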

$Z_t = \sigma\left(W_z (D_t \odot I_t) + U_z (\gamma_t \odot h_{t-1}) + b_z\right)$

* $Z_t$ is the update gate
* $\sigma$ is the sigmoid activation function
* $W_z$ is the input weight matrix
* $D_t$ is the input feature vector at time step $t$
* $I_t$ is the mask
* $U_z$ is the recurrent weight matrix
* $\gamma_t$ is the decay factor
* $h_{t-1}$ is the hidden state at time $t-1$
* $b_z$ is the bias term
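
The gate equation can be evaluated directly for a toy case with three input features and two hidden units. All weights and inputs below are made-up numbers chosen only to show the shapes involved; this is a sketch of the formula as written in the summary, not the paper's implementation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def gate(W_z, U_z, b_z, d_t, i_t, gamma_t, h_prev):
    """Z_t = sigmoid(W_z (D_t * I_t) + U_z (gamma_t * h_{t-1}) + b_z)."""
    masked_input = [d * m for d, m in zip(d_t, i_t)]          # D_t ⊙ I_t
    decayed_state = [g * h for g, h in zip(gamma_t, h_prev)]  # gamma_t ⊙ h_{t-1}
    from_input = matvec(W_z, masked_input)
    from_state = matvec(U_z, decayed_state)
    return [sigmoid(a + r + b) for a, r, b in zip(from_input, from_state, b_z)]


# Toy parameters: 3 input features, 2 hidden units.
W_z = [[0.2, -0.1, 0.4], [0.0, 0.3, -0.2]]
U_z = [[0.5, 0.1], [-0.3, 0.2]]
b_z = [0.0, 0.1]

d_t = [1.0, 99.0, 0.5]  # the second value is a placeholder for a missing reading
i_t = [1, 0, 1]         # mask: the second feature is missing
gamma_t = [0.9, 0.6]    # decay applied to the previous hidden state
h_prev = [0.2, -0.4]

z_t = gate(W_z, U_z, b_z, d_t, i_t, gamma_t, h_prev)
```

Because the mask zeroes out the missing feature, the placeholder value in `d_t` has no effect on the result.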

An attention mechanism analyzes ICD diagnostic codes and CCS codes to capture complex relationships and semantic information. The similarity between codes is calculated to determine their relative importance, and attention weights are assigned accordingly:

$h_{\text{icd}} = \text{ScaledDotProductAttention}(Q, K, V) = \text{softmax}\left(\frac{Q K^T}{\sqrt{d_k}}\right) V$

* $h_{\text{icd}}$ is the ICD hidden state
* $Q$ is the query
* $K$ is the key
* $V$ is the value
* $d_k$ is the dimensionality of the keys
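
A small standard-library sketch of scaled dot-product attention follows. The summary does not say how the paper derives $Q$, $K$, and $V$ from the ICD and CCS codes, so the example assumes plain self-attention over three toy code embeddings; the embedding values are invented for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(Q, K, V):
    """Return softmax(Q K^T / sqrt(d_k)) V and the attention weights."""
    d_k = len(K[0])
    outputs, weights = [], []
    for q in Q:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value vectors, one output dimension at a time.
        outputs.append([sum(wi * v[j] for wi, v in zip(w, V)) for j in range(len(V[0]))])
    return outputs, weights


# Toy embeddings for three diagnosis codes (assumed values, 2 dimensions each).
code_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
h_icd, attn = scaled_dot_product_attention(code_embeddings, code_embeddings, code_embeddings)
```

Each row of `attn` sums to 1 and gives the relative importance of every code when building the representation for one query code.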

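A KAN processes constant data to extract high-level features. KAN places learnable B-spline activation functions on the edges rather than fixed activations on the nodes, which makes the network more flexible and adaptable. Each KAN layer can be viewed as a matrix of univariate functions, and the activation at each output node is the sum of its incoming edge functions: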

$x_{l+1,j} = \sum_{i=1}^{n_l} \tilde{x}_{l,j,i} = \sum_{i=1}^{n_l} \phi_{l,j,i}(x_{l,i})$

* $x_{l+1,j}$ is the activation value at node $(l+1, j)$
* $\tilde{x}_{l,j,i}$ is the post-activation value on the edge from input node $i$ to output node $j$ in layer $l$
* $\phi_{l,j,i}$ is the activation function on that edge, with trainable parameters
* $x_{l,i}$ is the pre-activation value, i.e., the activation of input node $i$ in layer $l$
* $n_l$ is the number of nodes in layer $l$
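
The layer equation can be illustrated with a deliberately simplified edge function. In this sketch each $\phi_{l,j,i}$ is a linear combination of degree-1 B-splines (hat functions) on a fixed uniform grid, and the coefficients play the role of the trainable parameters. This is an assumption made for brevity: KAN implementations typically use higher-order splines plus a base activation, and the summary does not give the paper's spline order or grid.

```python
def hat_basis(x, knots):
    """Degree-1 B-spline (hat) basis values at x on a uniform knot grid."""
    h = knots[1] - knots[0]
    return [max(0.0, 1.0 - abs(x - k) / h) for k in knots]


def edge_function(x, coeffs, knots):
    """phi(x) = sum_k c_k * B_k(x); the c_k are the trainable parameters."""
    return sum(c * b for c, b in zip(coeffs, hat_basis(x, knots)))


def kan_layer(x, coeffs, knots):
    """x_{l+1,j} = sum_i phi_{l,j,i}(x_{l,i}); coeffs[j][i] parameterizes phi_{l,j,i}."""
    return [
        sum(edge_function(x_i, coeffs[j][i], knots) for i, x_i in enumerate(x))
        for j in range(len(coeffs))
    ]


knots = [-1.0, 0.0, 1.0]
# Toy layer with 2 inputs and 2 outputs (coefficients chosen by hand, not trained).
coeffs = [
    [[-1.0, 0.0, 1.0], [1.0, 0.0, 1.0]],    # output 0: phi(x) = x and phi(x) = |x| on [-1, 1]
    [[0.0, 1.0, 0.0], [0.25, 0.25, 0.25]],  # output 1: a bump at 0 and a constant 0.25
]
x = [0.5, -0.5]
y = kan_layer(x, coeffs, knots)
```

Each output is just the sum of its two edge functions evaluated at the corresponding inputs, which is exactly the node equation above.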

The operational flow within the network involves pre-activation, post-activation, and node activation. The hidden features from the three components are concatenated and processed through a final KAN network to predict sepsis mortality risk.

The model was implemented and trained using TensorFlow, with the Adam optimizer and a learning rate decay strategy. Oversampling techniques were used to address the imbalance between positive and negative samples. Five-fold cross-validation was adopted to validate the model's stability and generalizability. Performance metrics included sensitivity, specificity, area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve (AUPRC), and the Brier Score (BS).
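
The summary does not say which oversampling technique was used, so the sketch below assumes plain random oversampling of the minority class, together with straightforward implementations of sensitivity, specificity, and the Brier score. The labels and predicted probabilities are toy values, and the 0.5 decision threshold is likewise an assumption.

```python
import random


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until both classes are the same size."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    idx = pos + neg + extra
    rng.shuffle(idx)
    return [samples[i] for i in idx], [labels[i] for i in idx]


def sensitivity_specificity(y_true, y_prob, threshold=0.5):
    """True positive rate and true negative rate at a decision threshold."""
    tp = fn = tn = fp = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif pred == 0:
            tn += 1
        else:
            fp += 1
    return tp / (tp + fn), tn / (tn + fp)


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


y_true = [1, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.1, 0.4, 0.3]
sens, spec = sensitivity_specificity(y_true, y_prob)
bs = brier_score(y_true, y_prob)

features = ["a", "b", "c", "d", "e", "f"]  # stand-ins for patient feature vectors
balanced_x, balanced_y = random_oversample(features, y_true)
```

When oversampling is combined with cross-validation, it is usually applied to the training folds only, so that duplicated patients do not leak into the held-out fold.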

The TCKIN model was compared with seven established baseline models: XGBoost, SVM, Random Forest, LGBM, LSTM, ISeeU, and GRU-D. On the MIMIC-IV dataset, TCKIN achieved an AUROC of 0.8807 and an AUPRC of 0.5470. The paper also ran parameter sensitivity experiments on the learning rate and batch size, and ablation experiments that removed or replaced key modules to assess their impact on overall performance. Replacing GRU-D with a standard GRU lowered the AUROC from 0.8807 to 0.8547 (a drop of 0.0260), and replacing the KAN modules with multilayer perceptrons (MLPs) lowered it from 0.8807 to 0.8693 (a drop of 0.0114).

The paper identifies key temporal features, including pH value, alanine aminotransferase, red blood cell count, and monocyte count, and constant features, including age, race, weight, and type of admission, as significant influencers of the model’s predictive accuracy. Certain ICD codes related to severe conditions like diabetes and malignancies are also strongly associated with increased mortality risk.

The paper notes limitations, including the datasets originating from a single medical center and opportunities for enhancement in processing specific types of patient data. Future research should consider including a broader array of features, such as genetic markers and imaging data.

Authors (1)

Fanglin Dong (1 paper)
