CoxKAN Survival Analysis Model

Updated 24 December 2025
  • CoxKAN is a survival analysis model that uses Kolmogorov–Arnold Networks to parameterize the log-partial hazard function for rich, nonlinear risk modeling.
  • It employs a compact network architecture with learnable univariate functions and B-spline bases, balancing expressivity and clear symbolic interpretability.
  • Empirical evaluations show that CoxKAN outperforms classical Cox models and matches or exceeds deep neural networks, while revealing biologically plausible nonlinear interactions in clinical and genomics data.

CoxKAN is a survival analysis model that leverages Kolmogorov–Arnold Networks (KANs) to provide a high-performance, interpretable alternative to traditional and deep learning-based survival models. CoxKAN directly parameterizes the log-partial hazard function of the Cox proportional hazards model using the compositional Kolmogorov–Arnold representation, allowing for rich, nonlinear modeling while maintaining explicit symbolic interpretability and performing inherent feature selection. Empirical studies show that CoxKAN consistently outperforms classical Cox proportional hazards models and is competitive with, or superior to, state-of-the-art deep neural network methods, especially in discovering complex multivariate dependencies in clinical and high-dimensional genomics data (Knottenbelt et al., 2024).

1. Mathematical Foundations and Model Definition

CoxKAN models the hazard function in the Cox proportional hazards framework as

$$h_{\text{CoxKAN}}(t \mid x) = h_0(t)\,\exp(\theta(x)),$$

where $h_0(t)$ is the baseline hazard and $\theta(x) = \text{KAN}(x)$ is a real-valued function learned by a Kolmogorov–Arnold Network, mapping covariates $x \in \mathbb{R}^D$ to a log-risk score. The baseline hazard is left unspecified and handled via the partial-likelihood framework inherent to the Cox model; learning focuses entirely on the nonparametric risk function $\theta(x)$.

The Kolmogorov–Arnold representation used in CoxKAN is

$$\theta(x) = \sum_{q=1}^{M} \phi_q\!\left( \sum_{p=1}^{D} \psi_{q,p}(x_p) \right),$$

where the $\psi_{q,p} : \mathbb{R} \to \mathbb{R}$ are univariate “inner” functions (one per feature per hidden unit) and the $\phi_q : \mathbb{R} \to \mathbb{R}$ are univariate “outer” functions. This summation structure ensures universal approximation capability for continuous multivariate functions, per the Kolmogorov–Arnold representation theorem (Knottenbelt et al., 2024).
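As a concrete illustration, the double-sum structure above can be sketched in plain NumPy with stand-in univariate functions. In CoxKAN these functions are learnable splines; the toy callables below are assumptions chosen only to show the wiring:

```python
import numpy as np

# Illustrative sketch of the Kolmogorov-Arnold sum: theta(x) is a sum over M
# outer functions phi_q, each applied to a sum of D inner functions psi_{q,p}.
def theta(x, inner_funcs, outer_funcs):
    """x: shape (D,); inner_funcs: M x D list of callables; outer_funcs: M callables."""
    M = len(outer_funcs)
    D = len(x)
    total = 0.0
    for q in range(M):
        s = sum(inner_funcs[q][p](x[p]) for p in range(D))  # inner sum over features
        total += outer_funcs[q](s)                          # outer univariate map
    return total

# Toy instantiation: theta(x) = tanh(x0 + x1) + sin(x0 - x1)
inner = [[lambda v: v, lambda v: v],
         [lambda v: v, lambda v: -v]]
outer = [np.tanh, np.sin]
x = np.array([0.3, 0.5])
score = theta(x, inner, outer)
```

The key architectural point the sketch makes concrete: all multivariate structure comes from additive composition of univariate functions, which is what later makes each edge individually plottable and symbolically fittable.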

2. Network Architecture and Parameterization

CoxKAN typically employs a compact architecture (often one or two hidden layers) where each connection is a learnable univariate function rather than a scalar weight. The canonical (single hidden layer) architecture is:

  • Inputs: $D$ features
  • Hidden: $M$ units, each receiving all $D$ features via $M \times D$ parallel univariate inner functions $\psi_{q,p}$
  • Output: $\theta(x)$, a sum over $M$ outer univariate functions $\phi_q$

Each univariate function (inner or outer) is parameterized by a small B-spline basis plus an optional residual basis term:

$$\varphi(x) = w_b\, b(x) + w_s \sum_{i=0}^{G+k-1} c_i\, B_{i,k}(x),$$

where the $B_{i,k}$ are degree-$k$ B-spline basis functions, the $c_i$ are trainable coefficients, and $w_b$, $w_s$ are trainable scalars. By using low-order splines (typically $k = 3$) on a small grid ($G = 3$–$5$), the architecture balances expressivity against interpretability (Knottenbelt et al., 2024).
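A minimal sketch of one such edge function, assuming a SiLU residual basis $b(x)$ (as in the original KAN formulation) and SciPy's B-spline evaluation. The knot placement, grid range, and initialization below are illustrative choices, not the paper's exact implementation:

```python
import numpy as np
from scipy.interpolate import BSpline

def silu(x):
    """Assumed residual basis b(x) = x * sigmoid(x)."""
    return x / (1.0 + np.exp(-x))

k = 3                                  # spline degree (cubic)
G = 5                                  # grid size (number of intervals)
# Open uniform knot vector on [-1, 1]: repeated boundary knots plus G+1 grid points.
knots = np.concatenate([[-1.0] * k, np.linspace(-1.0, 1.0, G + 1), [1.0] * k])
n_coef = len(knots) - k - 1            # = G + k coefficients, matching the sum above
rng = np.random.default_rng(0)
c = rng.normal(size=n_coef)            # trainable spline coefficients c_i
w_b, w_s = 1.0, 1.0                    # trainable scalars

def phi(x):
    """One learnable univariate edge function: residual term + spline expansion."""
    spline = BSpline(knots, c, k, extrapolate=True)
    return w_b * silu(x) + w_s * spline(x)
```

In training, `c`, `w_b`, and `w_s` would be optimized jointly across all edges; the small coefficient count per edge is what keeps the fitted curves simple enough to match symbolic templates later.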

3. Training Objective, Regularization, and Feature Selection

CoxKAN is trained to minimize the regularized negative Cox partial log-likelihood $\ell_{\rm total} = \ell_{\rm Cox} + \lambda R$, where

$$\ell_{\rm Cox} = -\sum_{i:\, \delta_i = 1} \left[ \text{KAN}(x_i) - \log \sum_{j \in \mathcal{R}(t_i)} \exp\big( \text{KAN}(x_j) \big) \right]$$

and $\mathcal{R}(t_i) = \{ j : t_j \geq t_i \}$ is the risk set at event time $t_i$; $\delta_i$ is the event indicator (1 if subject $i$'s event was observed, 0 if censored).
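The partial-likelihood sum can be sketched directly from these definitions. This is a naive O(n²) NumPy version assuming no tied event times; an efficient implementation would sort by time and use cumulative sums:

```python
import numpy as np

def neg_partial_log_likelihood(theta, t, delta):
    """Negative Cox partial log-likelihood.

    theta: log-risk scores KAN(x_i), shape (n,)
    t:     event/censoring times, shape (n,)
    delta: event indicators (1 = event observed, 0 = censored), shape (n,)
    """
    theta = np.asarray(theta, dtype=float)
    loss = 0.0
    for i in np.where(delta == 1)[0]:
        risk_set = t >= t[i]                           # R(t_i) = {j : t_j >= t_i}
        log_sum = np.log(np.sum(np.exp(theta[risk_set])))
        loss -= theta[i] - log_sum
    return loss

# Toy example: three subjects, events for subjects 0 and 2.
theta_scores = np.array([0.2, -0.1, 0.5])
times = np.array([2.0, 3.0, 1.0])
events = np.array([1, 0, 1])
loss = neg_partial_log_likelihood(theta_scores, times, events)
```

Note that the baseline hazard $h_0(t)$ never appears: it cancels inside each risk-set ratio, which is exactly why training can focus on $\theta(x)$ alone.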

The regularizer RR combines:

  • $\ell_1$-norm of activation magnitudes (encourages edge/neuron sparsity)
  • Entropy of the activation-magnitude distribution (promotes focused, sparse connectivity)
  • $\ell_1$-norm on spline coefficients (encourages function simplicity)

Optimization is performed using the Adam algorithm with early stopping on the validation concordance index (C-index). After training, a threshold parameter $\tau$ prunes low-activation edges and neurons, yielding automatic feature selection and topology simplification (Knottenbelt et al., 2024).
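A hedged sketch of the three regularization terms, applied to a matrix of mean absolute edge activations and the spline coefficients. The equal weighting and exact normalizations here are assumptions for illustration:

```python
import numpy as np

def kan_regularizer(A, c, eps=1e-12):
    """Sketch of the combined regularizer R.

    A: (M, D) mean |activation| per (hidden unit, feature) edge
    c: flat array of spline coefficients across all edges
    """
    a = np.abs(A).ravel()
    l1_act = a.sum()                          # L1 of activation magnitudes
    p = a / (a.sum() + eps)                   # normalized magnitude distribution
    entropy = -np.sum(p * np.log(p + eps))    # low entropy -> few dominant edges
    l1_coef = np.abs(c).sum()                 # L1 on spline coefficients
    return l1_act + entropy + l1_coef

A = np.array([[1.0, 0.0],
              [0.0, 1.0]])
c = np.array([0.5, -0.5])
R = kan_regularizer(A, c)
```

The entropy term is what distinguishes this from plain lasso-style shrinkage: it rewards concentrating activation mass on a few edges rather than spreading small weights everywhere.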

4. Symbolic Formula Extraction and Interpretability

After pruning, each remaining univariate function $\varphi(x)$ is fitted to a small symbolic template

$$\hat{\varphi}(x) = c\, f(ax + b) + d, \quad f \in \{ \sin, \exp, \tanh, \arctan, x^n, \ldots \},$$

with the best-fitting template chosen by maximizing $R^2$ over the empirical activations. If no template matches well ($R^2 < 0.99$), symbolic regression tools such as PySR are invoked to recover a closed-form expression.
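The template-fitting step can be sketched with SciPy's `curve_fit`, trying each candidate $f$ and scoring by $R^2$. The template set and optimizer settings below are illustrative, not the paper's exact configuration:

```python
import numpy as np
from scipy.optimize import curve_fit

# Candidate templates f for the form c * f(a*x + b) + d (illustrative subset).
TEMPLATES = {"sin": np.sin, "tanh": np.tanh, "exp": np.exp, "arctan": np.arctan}

def fit_best_template(x, y):
    """Fit each template to the empirical activations (x, y); return best by R^2."""
    best = (None, -np.inf, None)
    for name, f in TEMPLATES.items():
        def model(x, a, b, c, d):
            return c * f(a * x + b) + d
        try:
            params, _ = curve_fit(model, x, y, p0=[1.0, 0.0, 1.0, 0.0], maxfev=5000)
        except RuntimeError:
            continue                                    # this template failed to converge
        resid = y - model(x, *params)
        r2 = 1.0 - resid.var() / y.var()
        if r2 > best[1]:
            best = (name, r2, params)
    return best

# Pretend "empirical activations" sampled from a tanh-shaped learned edge.
x = np.linspace(-2.0, 2.0, 200)
y = 2.0 * np.tanh(1.5 * x + 0.2) - 0.5
name, r2, params = fit_best_template(x, y)
```

In the CoxKAN workflow, a fit with $R^2 \geq 0.99$ replaces the spline edge with its symbolic form; otherwise the edge is handed to a symbolic-regression search instead.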

The final risk score $\theta(x)$ is the sum of these explicitly discovered symbolic curves. This provides direct insight into both the overall hazard model and the effect of individual covariates or interactions, differentiating CoxKAN from “black-box” neural competitors (Knottenbelt et al., 2024).

5. Empirical Evaluation and Benchmarking

CoxKAN was evaluated on four synthetic datasets (where ground-truth hazard formulas were known) and nine real-world datasets (comprising five standard clinical and four high-dimensional genomics cohorts). Performance was measured by the Harrell C-Index and, when applicable, the Integrated Brier Score.

| Dataset type | Comparator models | CoxKAN performance |
| --- | --- | --- |
| Synthetic | CoxPH, DeepSurv | Matches/exceeds true hazard in 3/4 cases |
| Clinical | CoxPH, DeepSurv | Outperforms CoxPH; matches/exceeds DeepSurv on 4/5 |
| Genomics (TCGA) | CoxPH+Lasso, DeepSurv | Competitive with CoxPH+Lasso; beats DeepSurv on 2/4 |

On synthetic benchmarks, CoxKAN exactly recovered the generating hazard function when expressible by the model. On clinical datasets, CoxKAN symbolic models achieved higher or comparable C-Index versus CoxPH and DeepSurv, with non-overlapping confidence intervals in several cases. In high-dimensional genomics, CoxKAN remained robust where unregularized CoxPH failed due to multicollinearity (Knottenbelt et al., 2024).

6. Discovery of Nonlinear Interactions and Biological Plausibility

CoxKAN demonstrated a unique capacity to discover and symbolize previously unrecognized nonlinear and interaction effects among covariates. For instance, in the SUPPORT dataset, the learned interaction subnetworks between age and metastatic cancer status revealed biologically plausible, cohort-specific risk trajectories. In the GBSG breast cancer dataset, CoxKAN rediscovered nonlinear “sweet-spot” biomarker effects, and in high-dimensional glioma genomics data, it uncovered clear genetic prognostic signatures matching known molecular pathology (Knottenbelt et al., 2024).

7. Practical Implementation and Usage Workflow

CoxKAN’s practical usage involves:

  1. Selecting the KAN architecture and regularization strength.
  2. Training the network using the regularized Cox partial-likelihood with early stopping.
  3. Pruning low-activation edges to yield a minimal feature set.
  4. Running symbolic fitting or symbolic regression on the remaining activations to produce a final, human-readable hazard model.
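Step 3, pruning, can be sketched as simple thresholding of mean absolute edge activations at $\tau$, with a feature dropped once all of its edges are pruned. The array shapes and threshold value here are illustrative assumptions:

```python
import numpy as np

def prune(edge_activations, tau=0.01):
    """Prune low-activation edges and derive the surviving feature set.

    edge_activations: (M, D) mean |activation| per (hidden unit, feature) edge
    Returns a boolean edge mask and the indices of features that survive.
    """
    keep_edges = np.abs(edge_activations) >= tau      # edge survives if above tau
    keep_features = keep_edges.any(axis=0)            # feature survives if any edge does
    return keep_edges, np.where(keep_features)[0]

# Toy activations for M=2 hidden units over D=3 features.
A = np.array([[0.8, 0.001, 0.2],
              [0.5, 0.0005, 0.0]])
mask, selected = prune(A, tau=0.01)
# Feature 1's edges all fall below tau, so it is dropped; features 0 and 2 remain.
```

The surviving subnetwork is what the symbolic-fitting step (step 4) then converts into a human-readable hazard formula, so pruning directly controls how sparse the final expression is.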

This enables practitioners to derive a sparse, accurate, and interpretable survival model that aligns with regulatory and scientific requirements for transparency in biomedical applications (Knottenbelt et al., 2024).


This summary synthesizes results and methodologies as presented in "CoxKAN: Kolmogorov-Arnold Networks for Interpretable, High-Performance Survival Analysis" (Knottenbelt et al., 2024).
