COALA-PG Algorithm for Bayesian Cox Models

Updated 22 October 2025

The paper introduces a flexible Bayesian inference method that effectively handles monotonic baseline hazard constraints in multilevel Cox models.
It leverages Poisson process reformulation, negative binomial approximations, and Polya–Gamma data augmentation to yield near-Gaussian full conditionals and efficient Gibbs sampling.
By employing sufficient dimension reduction and ensuring uniform ergodicity, the algorithm provides scalable and robust inference for complex survival modeling applications.

The Cox-Polya-Gamma (Cox-PG) algorithm—also referenced as the COALA-PG algorithm—enables flexible Bayesian inference for multilevel Cox semiparametric regression models. It addresses the computational and modeling challenges associated with monotonicity-constrained, nonparametric estimation of the baseline cumulative hazard function in survival analysis, while accommodating case weights, frailties, and smooth nonparametric covariate effects. The approach leverages Poisson process connections and negative binomial (NB) approximations, together with Polya–Gamma data augmentation and sufficient dimension reduction, to yield a Gibbs sampler with near-Gaussian full conditionals and provable uniform ergodicity. The algorithm is instantiated with open-source R software, supporting practical usage in a range of survival modeling applications (Ren et al., 23 Feb 2024).

1. Model Formulation and Problem Structure

The Cox-PG algorithm is designed for Bayesian Cox proportional hazards (PH) models with multilevel extensions. Traditional semiparametric Cox models express the hazard function for subject $i$ at time $t$ as:

$h_i(t) = h_0(t) \exp(x_i^\top \beta + \cdots),$

where $h_0(t)$ is the (unknown) baseline hazard, $\beta$ denotes covariate effects, and “ $\cdots$ ” denotes additional structures such as frailties, case weights, or smoothing splines.

Key modeling challenges arise from:

The need for nonparametric, monotonic estimation of $H_0(t)$ , the baseline cumulative hazard.
The inclusion of hierarchical or grouped data via frailty terms.
Direct accommodation of smooth or nonparametric effects (e.g., via GAMs).

The core innovation is a hierarchical representation that renders the difficult Cox likelihood tractable for efficient Bayesian inference, while strictly enforcing the positivity and monotonicity constraints of the cumulative hazard.

2. Likelihood Augmentation: Poisson–Negative Binomial and Polya–Gamma

The first major strategy utilizes the equivalence between the Cox model partial likelihood and a discretized Poisson process. The Cox-PG approach reformulates the likelihood

$L_\mathrm{PH}(\eta | y, M) \propto \prod_{i=1}^N \left\{ D_t[z_a(t_i)] u_a \right\}^{y_i} \mathbf{1}\{\eta \in \mathcal{C}_0\} \exp(y^\top M \eta) \exp\left\{ - \sum_{i=1}^N \exp(m_i^\top \eta) \right\},$

where $u_a$ are spline coefficients for the log cumulative hazard, $z_a(t)$ is a basis vector, and $M$ combines spline/covariate design.

A vague gamma prior (frailty) with precision parameter $\epsilon$ is placed on the cumulative hazard increments, representing the Cox kernel as a Poisson–NB mixture. This allows expression of the likelihood kernel in negative binomial form, which is then further augmented via Polya–Gamma latent variables:

$\frac{e^{\psi_i a}}{(1 + e^{\psi_i})^b} = 2^{-b} \exp(\kappa_i \psi_i) \int_0^\infty \exp\left(-\frac{1}{2} \omega_i \psi_i^2 \right) \pi(\omega_i|b, 0) d\omega_i,$

with $\kappa_i = a - b/2$ , $b = y_i + \epsilon$ , and $\omega_i \sim \mathrm{PG}(b, \psi_i)$ . This transformation produces an augmented Gaussian likelihood in $\psi_i = m_i^\top \eta$ , facilitating Gaussian updates in the Gibbs sampler.

3. Dimension Reduction for Nonparametric Baseline Hazard

The second pivotal strategy concerns efficient and constrained estimation of the nonparametric baseline hazard, a traditional bottleneck in Bayesian Cox modeling. The baseline log cumulative hazard $\alpha(t)$ is represented by monotonic splines:

$\alpha(t) = \alpha_0 + z_a(t)^\top u_a$

with

$D_t[z_a(t)] = \delta_j(t) = \mathbb{I}(s_{j-1} \le t < s_j)$

over $J$ non-overlapping intervals (partitions).

To satisfy positivity constraints on $u_{a,j}$ , slice variables $\nu_i \sim \mathrm{Uniform}(0, D_t[z_a(t_i)] u_a)$ are introduced. Critically, it is shown that only the maximal slice in each partition (denote $v_j = \max\{\nu_i: i \text{ in partition } j \text{ with event}\}$ ) constrains the coefficient via $v_j/u_{a,j} \sim \mathrm{Beta}(n_{a,j}, 1)$ , where $n_{a,j}$ is the number of events. This sufficient dimension reduction lowers the constraint complexity from $N$ to $J$ , substantially accelerating truncated Gaussian sampling for monotonic splines.

4. Algorithmic Framework and Computational Advances

The full Gibbs sampling scheme cycles through latent and model parameters, drawing:

Polya–Gamma variables $\omega_i$ ,
Spline coefficients $u_a$ and covariate effects $\beta$ as blocks using truncated normal proposals,
Slice variables and their maxima,
Other random effects (frailties, case weights, etc.) as required.

The full conditional for $\eta = [u_a^\top, \beta^\top]^\top$ becomes:

$\eta | - \sim \mathrm{TN}(\mu, \Sigma, \mathcal{C}),$

where $\Sigma = (M^\top \Omega M)^{-1}$ , $\mu = \Sigma M^\top(\kappa - \eta_\epsilon \omega)$ , $\Omega = \operatorname{diag}(\omega_i)$ , and $\mathcal{C}$ is the intersection of monotonicity and beta-induced positivity constraints.

To correct for the NB approximation bias from the gamma frailty, a Metropolis–Hastings (MH) accept/reject step is introduced, with acceptance probability:

$\min\left\{1, \frac{L_\mathrm{PH}(\eta) \pi_0(\eta) q(\eta'|\eta)}{L_\mathrm{PH}(\eta') \pi_0(\eta') q(\eta|\eta')}\right\}.$

This step targets the true Cox model posterior, ensuring unbiased inference when strict fidelity to the partial likelihood is required.

5. Uniform Ergodicity and Statistical Guarantees

Uniform ergodicity is established for the Cox-PG Gibbs chain under the following conditions:

The gamma prior parameters for random effects (frailties) satisfy $a_0 + M/2 \geq 1$ and the support is bounded away from zero ( $\tau_B \geq \tau_0$ ).
Monotonic spline basis functions for the baseline hazard are Lipschitz and coefficients bounded, $u_a \in (0, u_a^+)$ .

These ensure that the transition kernel $k(\eta|\eta')$ admits a minorization condition:

$k(\eta|\eta') \geq \delta h(\eta)$

for some $\delta > 0$ and density $h(\eta)$ , which yields geometric convergence and validates application of a central limit theorem to Monte Carlo averages over sampled chains. As a consequence, the accuracy of finite-sample Bayesian inference using the Cox-PG algorithm is strongly justified.

6. Applications and Empirical Demonstrations

The methodology is illustrated through several real and simulated datasets:

Kaplan–Meier Approximation: Applied to the “lung” dataset (survival R package), the baseline cumulative hazard estimated with $J=5$ partitions yielded a monotonic posterior mean aligning closely with the nonparametric Kaplan–Meier survival curve.
Simulation Studies: Data generated from the Weibull PH model (with extensions such as frailties, case weights, nonparametric effects, stratified hazards) were modeled by Cox-PG and compared to Bayesian INLA and HMC-based Weibull approaches. Cox-PG matched or outperformed these approaches, particularly when richer structure (GAM, frailty) was present.
Leukemia Dataset: In modeling leukemia survival with risk group stratification, smooth socioeconomic (Townsend score) effects, and site random effects, Cox-PG successfully estimated stratified and nonparametric hazard components, illustrating the algorithm’s flexibility beyond standard Cox frameworks.

7. Software Implementation and Accessibility

An open-source R implementation of the Cox-PG algorithm is provided in supplementary materials, encapsulated in the function posterior_draw(). This leverages the ‘BayesLogit’ package for efficient generation of Polya–Gamma distributed variables and the ‘relliptical’ package for truncated Gaussian sampling under convex constraints. The requirement for only standard R infrastructure, rather than specialized platforms such as Stan or INLA, underscores practical accessibility for applied statisticians and biomedical researchers. The software design, based on direct and efficient data augmentation, supports streamlined model fitting for a variety of complex, multilevel survival analysis scenarios.

In summary, the Cox-PG algorithm synthesizes Poisson—negative binomial approximation, Polya–Gamma augmentation, and sufficient dimension reduction to yield a practical, theoretically guaranteed, and extensible framework for Bayesian inference in Cox-type survival models with intricate baseline and multilevel structure (Ren et al., 23 Feb 2024).

PDF Markdown Chat (Pro)

References (1)

The Cox-Polya-Gamma Algorithm for Flexible Bayesian Inference of Multilevel Survival Models (2024)

Follow Topic

Get notified by email when new papers are published related to COALA-PG Algorithm.