Flexible Deep Neural Networks for Partially Linear Survival Data

Published 11 Dec 2025 in stat.ML and cs.LG | (2512.10570v1)

Abstract: We propose a flexible deep neural network (DNN) framework for modeling survival data within a partially linear regression structure. The approach preserves interpretability through a parametric linear component for covariates of primary interest, while a nonparametric DNN component captures complex time-covariate interactions among nuisance variables. We refer to the method as FLEXI-Haz, a flexible hazard model with a partially linear structure. In contrast to existing DNN approaches for partially linear Cox models, FLEXI-Haz does not rely on the proportional hazards assumption. We establish theoretical guarantees: the neural network component attains minimax-optimal convergence rates based on composite Hölder classes, and the linear estimator is root-n consistent, asymptotically normal, and semiparametrically efficient. Extensive simulations and real-data analyses demonstrate that FLEXI-Haz provides accurate estimation of the linear effect, offering a principled and interpretable alternative to modern methods based on proportional hazards. Code for implementing FLEXI-Haz, as well as scripts for reproducing data analyses and simulations, is available at: https://github.com/AsafBanana/FLEXI-Haz


Authors (2)

Summary

The paper introduces FLEXI-Haz, a semiparametric neural method that provides valid inference for primary covariates by modeling the hazard function without relying on proportional hazards assumptions.
It pairs a DNN that captures complex, time-varying nuisance effects with a linear component for the primary covariate effects; the DNN component attains minimax-optimal nonparametric rates, and the linear estimator is root-n consistent and semiparametrically efficient.
In simulations, FLEXI-Haz shows negligible bias and near-nominal coverage, whereas a PH-constrained partially linear neural Cox model is biased and undercovers when nuisance effects vary over time.


Introduction

The discussed work, "Flexible Deep Neural Networks for Partially Linear Survival Data" (2512.10570), proposes a semiparametric neural approach (FLEXI-Haz) for time-to-event (survival) analysis, targeting inference for primary covariates while flexibly controlling for complex, time-varying effects of nuisance features. By situating the model outside the restrictive proportional hazards (PH) assumption, FLEXI-Haz generalizes previous DNN-empowered Cox models, which are constrained by baseline hazard specification and time-invariant covariate effects. The framework addresses critical theoretical and practical gaps: it preserves interpretability, guarantees valid inference for prespecified covariate effects, and achieves minimax-optimal nonparametric rates for the nuisance function estimation.

Model and Theoretical Guarantees

FLEXI-Haz specifies the hazard function as

$h(t \mid \mathbf{X}, \mathbf{Z}) = \exp\left\{\theta_o^{\top} \mathbf{Z} + g_o(t,\mathbf{X})\right\}$

where $\mathbf{Z}$ denotes the covariates of primary interest and $\mathbf{X}$ the high-dimensional nuisance features, possibly time-dependent; $g_o$ is modeled as a DNN. This formulation removes the PH restriction and the need for separate baseline hazard estimation, leaving full flexibility for temporal and covariate interactions. Importantly, FLEXI-Haz uses the full likelihood rather than the Cox partial likelihood, which introduces nontrivial analytical and computational issues because time-dependent integrals appear in the likelihood and interact with the DNN approximation.
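For reference, the full log-likelihood implied by this hazard under independent right censoring takes the standard form below. The notation $(T_i, \Delta_i)$ for the observed time and event indicator of subject $i$ is an assumption introduced here for illustration, and $\mathbf{X}_i$ is written as time-fixed for simplicity:

$\ell_n(\theta, g) = \sum_{i=1}^{n} \left[ \Delta_i \left\{ \theta^{\top} \mathbf{Z}_i + g(T_i, \mathbf{X}_i) \right\} - \int_0^{T_i} \exp\left\{ \theta^{\top} \mathbf{Z}_i + g(t, \mathbf{X}_i) \right\} dt \right]$

The first term is the log-hazard at each observed event, and the second is the cumulative hazard up to the observed time. That integral over $t$ is what distinguishes the full likelihood from the Cox partial likelihood and must be approximated numerically when $g$ is a neural network.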

The theoretical analysis establishes:

Semiparametric Efficiency: $\hat{\theta}$ is $\sqrt{n}$-consistent, asymptotically normal, and achieves the semiparametric efficiency bound, even in the general non-PH setting.
Minimax Rates: The DNN estimator of the nuisance component $g_o$ converges at the minimax-optimal rate (up to logarithmic factors) governed by the composite Hölder class structure, matching lower bounds derived via Fano-type arguments.
Estimation Robustness: When the parametric component is omitted, the model reduces to a fully nonparametric, baseline-free neural hazard estimator, and the proof delivers the first asymptotic guarantees for this class.
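Asymptotic normality of $\hat{\theta}$ is what licenses standard Wald-type confidence intervals for each primary coefficient. The following is a minimal sketch of that generic construction, not code from the FLEXI-Haz repository; the function name `wald_interval` and the example numbers are hypothetical.

```python
from statistics import NormalDist


def wald_interval(theta_hat, std_error, level=0.95):
    """Two-sided Wald interval: theta_hat +/- z * std_error.

    Assumes theta_hat is approximately normal with the given standard error,
    which is the generic consequence of asymptotic normality.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # standard normal quantile
    return theta_hat - z * std_error, theta_hat + z * std_error


if __name__ == "__main__":
    # Hypothetical estimate and standard error, for illustration only.
    lower, upper = wald_interval(theta_hat=0.5, std_error=0.1)
    print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

The interval is only as good as the standard error fed into it, which is why a consistent variance estimator matters for valid inference.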

The primary technical novelties lie in dealing with functions $g_o(t,x)$ (joint in time and covariates), and controlling bias induced by DNN approximation via empirical process techniques adapted for the event-driven structure of survival data.

Implementation and Estimation

Estimation maximizes a numerically approximated likelihood, with the DNN and linear components learned simultaneously. The cumulative hazard is approximated by a Riemann sum over an expanded counting-process dataset, which allows efficient mini-batch training. The linear branch that estimates the primary covariate effects is initialized from a classical Cox fit and then fine-tuned under the joint likelihood. For inference, the covariance estimator is built from event-weighted residuals, with cross-fitted neural networks approximating the projection onto the nuisance tangent space; this yields consistent variance estimation for the efficient score.
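To make the Riemann approximation of the cumulative hazard concrete, here is a minimal sketch of a numerically approximated negative log-likelihood. It is an illustration under stated assumptions, not the authors' implementation: the function name, the per-subject midpoint grid with `n_grid` cells, and the use of a plain Python callable `g` in place of a trained DNN are all hypothetical choices, and the nuisance covariates are treated as time-fixed.

```python
import math


def neg_log_likelihood(theta, g, times, events, Z, X, n_grid=50):
    """Average negative full log-likelihood for h(t|X,Z) = exp(theta'Z + g(t, X)).

    The cumulative hazard of each subject is approximated by a midpoint
    Riemann sum on n_grid equal cells over [0, T_i] (an assumed grid choice).
    """
    total = 0.0
    for t_i, d_i, z_i, x_i in zip(times, events, Z, X):
        linear = sum(th * z for th, z in zip(theta, z_i))
        # Log-hazard at the observed time contributes only for events.
        event_term = d_i * (linear + g(t_i, x_i))
        # Midpoint Riemann sum for the integral of the hazard over [0, T_i].
        width = t_i / n_grid
        cum_hazard = width * sum(
            math.exp(linear + g((k + 0.5) * width, x_i)) for k in range(n_grid)
        )
        total += cum_hazard - event_term
    return total / len(times)


if __name__ == "__main__":
    # Toy check with a constant hazard of 0.5 and no linear effect.
    constant_g = lambda t, x: math.log(0.5)
    value = neg_log_likelihood(
        theta=[0.0],
        g=constant_g,
        times=[1.0, 2.0, 4.0],
        events=[1, 0, 1],
        Z=[[1.0], [0.0], [1.0]],
        X=[[0.0], [0.0], [0.0]],
    )
    print(value)
```

In practice each grid cell would become one row of an expanded dataset so that rows can be shuffled into mini-batches; the explicit loop here only shows what quantity those rows add up to.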

Computationally, the data expansion and fine-grained integration increase cost, which scales as $O(n^2)$ in the default regime; sub-sampling or quadrature schemes can reduce this burden.
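As one illustration of how a quadrature scheme could cut the number of hazard evaluations per subject, the sketch below applies a composite 3-point Gauss-Legendre rule to the cumulative hazard. This is a generic numerical technique swapped in for illustration, not a scheme taken from the paper; the function name `cumulative_hazard_gl` and the panel count are hypothetical.

```python
import math

# 3-point Gauss-Legendre nodes and weights on the reference interval [-1, 1].
_GL_NODES = (-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0))
_GL_WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)


def cumulative_hazard_gl(log_hazard, t_end, n_panels=4):
    """Approximate the integral of exp(log_hazard(t)) over [0, t_end].

    Splits [0, t_end] into n_panels equal panels and applies the 3-point
    Gauss-Legendre rule on each, using 3 * n_panels hazard evaluations.
    """
    width = t_end / n_panels
    total = 0.0
    for k in range(n_panels):
        midpoint = (k + 0.5) * width
        for node, weight in zip(_GL_NODES, _GL_WEIGHTS):
            t = midpoint + 0.5 * width * node  # map node into the panel
            total += weight * math.exp(log_hazard(t))
    return 0.5 * width * total


if __name__ == "__main__":
    # Log-linear hazard exp(a + b t), whose integral has a closed form.
    a, b, t_end = -1.0, 0.7, 2.0
    approx = cumulative_hazard_gl(lambda t: a + b * t, t_end)
    exact = math.exp(a) * (math.exp(b * t_end) - 1.0) / b
    print(approx, exact)
```

For smooth hazards, a handful of quadrature nodes can match the accuracy of a much finer Riemann grid. Whether that holds for a fitted DNN depends on how rough the learned function is in $t$.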

Empirical Results

Simulations assess estimation and coverage in settings where nuisance covariates have nonlinear, time-dependent effects, and compare FLEXI-Haz to the partially linear neural Cox model [zhong2022deep]. For sample size $n=8000$, FLEXI-Haz shows negligible bias and coverage rates close to the nominal level, with empirical standard deviations aligned with the estimated standard errors. In contrast, the PH-constrained method exhibits substantial bias in the parametric estimates and severe undercoverage in all settings with temporal effect heterogeneity, consistent with the theoretical claims about the need to model time-covariate interactions. FLEXI-Haz thus achieves consistency and valid inference in scenarios where the semiparametric Cox approach does not apply.
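The quantities reported in such a simulation study (bias, empirical standard deviation, average estimated standard error, and coverage) can be computed from replicate-level output as sketched below. This is a generic summary routine under assumed inputs, not the paper's evaluation script; the function name `summarize_replicates` and the toy numbers are hypothetical and do not reproduce any reported result.

```python
from statistics import NormalDist, mean, stdev


def summarize_replicates(estimates, std_errors, truth, level=0.95):
    """Summarize Monte Carlo replicates of a scalar coefficient estimate.

    estimates[r] and std_errors[r] come from replicate r; truth is the
    data-generating value. Coverage uses Wald intervals at the given level.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    covered = [abs(est - truth) <= z * se for est, se in zip(estimates, std_errors)]
    return {
        "bias": mean(estimates) - truth,
        "empirical_sd": stdev(estimates),  # spread of estimates across replicates
        "mean_se": mean(std_errors),       # should track empirical_sd if SEs are valid
        "coverage": sum(covered) / len(covered),
    }


if __name__ == "__main__":
    # Toy replicate values, for illustration only.
    summary = summarize_replicates(
        estimates=[0.9, 1.0, 1.1, 1.5],
        std_errors=[0.1, 0.1, 0.1, 0.1],
        truth=1.0,
    )
    print(summary)
```

Agreement between `empirical_sd` and `mean_se` indicates that the variance estimator is well calibrated, while coverage far below the nominal level is the typical symptom of a biased point estimate.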

Implications and Future Directions

FLEXI-Haz is a practical and statistically rigorous procedure for high-dimensional, time-to-event datasets where interpretability and valid inference for a finite covariate set are required, while accounting for arbitrary, potentially nonlinear nuisance effects that may interact with time. Its theoretical construction offers a template for non-PH, interpretable survival analysis, filling a crucial gap between black-box neural approaches and restrictive parametric models.

Practically, the method enables interpretable effect estimation even in genomics, digital health, and other application domains where high-dimensional, structured nuisance features and time-varying confounding predominate. The modularity of the architecture (DNN backbone for nuisance, linear for effect estimation) aligns with broader double machine learning and orthogonalization strategies, ensuring robust inference.

Theoretically, extending minimax and efficiency analyses to convolutional and structured neural architectures (such as CNNs or transformers) is a nontrivial but vital next step for widespread applicability, as current approximation bounds do not directly transfer. Additionally, improvements in the computational tractability for large-scale datasets are required, potentially via informed subsampling, adaptive quadrature, or stochastic integral approximations tailored to survival objectives. Finally, formal asymptotics for DNN-based survival function estimation (rather than only parametric coefficients) and post-hoc inference (e.g., nonparametric bootstrap for survival curves) remain open.

Conclusion

FLEXI-Haz (2512.10570) generalizes DNN-based survival modeling to a theoretically supported, interpretable, efficiency-attaining regime without baseline or PH constraints. It tightly integrates modern learning-theoretic advances with classical survival inference, and serves as a reference model for future work in flexible, high-dimensional, time-to-event analysis with formal inferential guarantees.
