Censored Graphical Horseshoe (CGHS)
- Censored Graphical Horseshoe is a Bayesian method that extends the Graphical Horseshoe to estimate sparse precision matrices in Gaussian graphical models with censored and missing observations.
- It employs a latent-variable strategy combined with global-local Horseshoe shrinkage to enable efficient posterior inference in complex, high-dimensional data.
- Empirical studies show that CGHS achieves lower estimation errors, higher true positive rates, and near-zero false discovery rates across varied censoring regimes.
The Censored Graphical Horseshoe (CGHS) is a Bayesian framework for sparse precision matrix estimation in Gaussian graphical models, designed to accommodate data subject to censoring and arbitrary missingness. CGHS generalizes the Graphical Horseshoe (GHS) method, extending its sparse Bayesian regression capabilities to cases where some variables are only partially observed due to detection limits or absences in the measurement process. By introducing a latent variable augmentation scheme and leveraging the adaptive global-local shrinkage properties of the Horseshoe prior, CGHS enables efficient posterior inference even under incomplete data modalities prevalent in biomedical, environmental, and other data-rich scientific domains (Mai et al., 10 Jan 2026).
1. Problem Formulation
CGHS addresses inference for precision matrices in mean-zero Gaussian graphical models, given i.i.d. samples $x_1, \dots, x_n \sim \mathcal{N}_p(0, \Omega^{-1})$. In many domains, such as qPCR, environmental assays, and single-cell studies, not all coordinates $x_{ij}$ are fully observed: measurements may be left-censored at known thresholds $l_j$ (e.g., detection limits), or may be entirely missing. Formally, each recorded datum is defined by

$$y_{ij} = \begin{cases} x_{ij}, & x_{ij} > l_j, \\ l_j, & x_{ij} \le l_j, \end{cases}$$
with additional missingness. Observed, censored, and missing indices for sample $i$ are collected in the sets $O_i$, $C_i$, and $M_i$. The observed-data likelihood for left censoring,

$$L(\Omega) \;=\; \prod_{i=1}^{n} \int_{\prod_{j \in C_i} (-\infty,\, l_j]} \int_{\mathbb{R}^{|M_i|}} \phi_p\big((y_{O_i}, x_{C_i}, x_{M_i});\, 0,\, \Omega^{-1}\big)\, dx_{M_i}\, dx_{C_i},$$
comprehensively models the incomplete observation structure. This likelihood recovers the fully-observed model when censoring and missingness are absent.
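As a deliberately simplified, univariate illustration of this likelihood (the function name and toy data are hypothetical, not the package's API), the sketch below scores fully observed points by their Gaussian density and left-censored points by the probability mass below the detection limit:

```python
import numpy as np
from scipy.stats import norm

def censored_loglik_1d(y, censored, mu=0.0, sigma=1.0, limit=-1.0):
    """Log-likelihood of left-censored Gaussian data (univariate sketch).

    Observed entries contribute the density phi(y_i); entries censored at the
    detection limit contribute the tail mass Phi(limit) below the limit.
    """
    y = np.asarray(y, dtype=float)
    censored = np.asarray(censored, dtype=bool)
    ll = norm.logpdf(y[~censored], loc=mu, scale=sigma).sum()
    ll += censored.sum() * norm.logcdf(limit, loc=mu, scale=sigma)
    return ll

y = np.array([0.3, -1.0, 1.2, -1.0])      # -1.0 entries sit at the detection limit
cens = np.array([False, True, False, True])
print(censored_loglik_1d(y, cens))
```

In the multivariate case the censored tail mass becomes the integral over the censored block of the joint normal, which is exactly what the latent-variable scheme below avoids computing directly.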
2. Latent-Variable Representation
To facilitate posterior inference under censoring and missingness, CGHS employs a latent-variable strategy, introducing complete latent vectors $z_i \in \mathbb{R}^p$ with $z_i \sim \mathcal{N}_p(0, \Omega^{-1})$. Observed data arise via deterministic or truncated mappings from $z_i$: $y_{ij} = z_{ij}$ for $j \in O_i$; $y_{ij} = l_j$ with $z_{ij} \le l_j$ for $j \in C_i$; and $y_{ij}$ is unrecorded for $j \in M_i$. The joint density factors as a product of multivariate normals and indicator functions truncating censored and missing entries. Posterior computation alternates between imputation (sampling censored/missing $z_{ij}$) and precision matrix updates, enabling inference even when entire data blocks are unobserved.
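One coordinate of the imputation step can be sketched with the Gaussian full conditional implied by the precision matrix; the function name and the two-variable example below are illustrative, not the package's interface:

```python
import numpy as np
from scipy.stats import truncnorm

rng = np.random.default_rng(0)

def impute_entry(z, j, Omega, status, limit):
    """One coordinate of the latent-variable update (illustrative sketch).

    Conditional on the other coordinates, z_j is Gaussian with
    mean -(1/omega_jj) * sum_{k != j} omega_jk z_k and variance 1/omega_jj.
    Censored entries are drawn from this conditional truncated to (-inf, limit];
    missing entries are drawn from the full conditional; observed entries are fixed.
    """
    omega_jj = Omega[j, j]
    mu = -(Omega[j, :] @ z - omega_jj * z[j]) / omega_jj
    sd = 1.0 / np.sqrt(omega_jj)
    if status == "observed":
        return z[j]                       # keep the recorded value
    if status == "censored":
        b = (limit - mu) / sd             # standardized upper bound
        return truncnorm.rvs(-np.inf, b, loc=mu, scale=sd, random_state=rng)
    return rng.normal(mu, sd)             # missing: untruncated full conditional
```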
3. Horseshoe Prior and Model Specification
For inducing sparsity, CGHS places a global-local Horseshoe prior on the off-diagonal elements of $\Omega$:

$$\omega_{jk} \mid \lambda_{jk}, \tau \;\sim\; \mathcal{N}(0, \lambda_{jk}^2 \tau^2), \qquad \lambda_{jk} \sim \mathrm{C}^{+}(0,1), \qquad \tau \sim \mathrm{C}^{+}(0,1), \qquad 1 \le j < k \le p.$$

Diagonal elements are assigned weakly informative priors or estimated via nodewise residual variances, subject to positive-definiteness of $\Omega$. This prior structure yields adaptive shrinkage, favoring near-zero estimates for non-edges while preserving large signals, and is robust to the inclusion of censored or missing measurements.
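A quick simulation illustrates the shrinkage profile of the Horseshoe prior described above; the helper below is a sketch (half-Cauchy local scales obtained as the absolute value of a standard Cauchy draw), not code from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

def horseshoe_draws(n_draws, tau=1.0):
    """Draw entries from the Horseshoe prior:
    omega ~ N(0, lam^2 * tau^2), lam ~ half-Cauchy(0, 1)."""
    lam = np.abs(rng.standard_cauchy(n_draws))   # half-Cauchy local scales
    return rng.normal(0.0, lam * tau)

draws = horseshoe_draws(100_000)
# The pole at zero concentrates mass near the origin, while the heavy
# Cauchy tails leave large signals essentially unshrunk.
print(np.mean(np.abs(draws) < 0.1), np.mean(np.abs(draws) > 10))
```

For comparison, a standard normal puts only about 8% of its mass in $(-0.1, 0.1)$ and essentially none beyond $\pm 10$; the Horseshoe places substantially more mass in both regions at once, which is the behavior that drives near-zero FDR alongside high TPR.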
4. Posterior Computation via Gibbs Sampling
CGHS exploits nodewise regression for efficient Gibbs sampling. The reparameterization

$$z_{ij} \mid z_{i,-j} \;\sim\; \mathcal{N}\big(z_{i,-j}^{\top} \beta_j,\; \sigma_j^2\big)$$

yields $\beta_j = -\omega_{-j,j} / \omega_{jj}$ and $\sigma_j^2 = 1/\omega_{jj}$. Sampling proceeds iteratively:
- Latent Imputation: If $j \in O_i$, set $z_{ij}$ to the observed value; if $j \in C_i$, sample $z_{ij}$ from its truncated-normal full conditional; if $j \in M_i$, sample from the full (untruncated) normal conditional.
- Regression Coefficients $\beta_j$: Sample as $\beta_j \sim \mathcal{N}(\mu_j, \Sigma_j)$, with $\mu_j$ and $\Sigma_j$ computed from the design matrix and the Horseshoe scales.
- Residual Variance $\sigma_j^2$: Inverse-Gamma update according to the nodewise residuals.
- Local/Global Scales $\{\lambda_{jk}, \tau\}$: Sample via the latent inverse-Gamma parameterization, enabling efficient mixing.
- Precision Matrix Reconstruction: Aggregate the nodewise estimates, symmetrize by averaging $\omega_{jk}$ and $\omega_{kj}$, and ensure positive-definiteness.
This block Gibbs sampler efficiently explores the posterior, with per-iteration cost dominated by the nodewise regression updates.
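The regression-coefficient step is a standard conjugate Gaussian draw. The sketch below assumes the usual Horseshoe-regression posterior $\beta \sim \mathcal{N}(A^{-1} Z^\top y,\ \sigma^2 A^{-1})$ with $A = Z^\top Z + \mathrm{diag}(1/(\tau^2 \lambda^2))$, and uses one Cholesky factor of $A$ both to solve for the mean and to generate a correlated draw; names and dimensions are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

def sample_beta(Z, y, lam2, tau2, sigma2):
    """Conjugate draw of nodewise regression coefficients (sketch).

    Posterior: beta ~ N(A^{-1} Z'y, sigma2 * A^{-1}), where
    A = Z'Z + diag(1 / (tau2 * lam2)), handled via a Cholesky factor of A.
    """
    A = Z.T @ Z + np.diag(1.0 / (tau2 * lam2))
    L = np.linalg.cholesky(A)                  # A = L L^T, L lower triangular
    # Mean: solve A mu = Z'y in two triangular solves.
    mu = np.linalg.solve(L.T, np.linalg.solve(L, Z.T @ y))
    # Draw: mu + sqrt(sigma2) * L^{-T} eps has covariance sigma2 * A^{-1}.
    eps = rng.standard_normal(len(mu))
    return mu + np.sqrt(sigma2) * np.linalg.solve(L.T, eps)

n, p = 200, 5
Z = rng.standard_normal((n, p))
beta_true = np.array([1.0, 0.0, 0.0, -0.5, 0.0])
y = Z @ beta_true + 0.1 * rng.standard_normal(n)
print(sample_beta(Z, y, lam2=np.ones(p), tau2=1.0, sigma2=0.01))
```

Reusing a single Cholesky factor for both the mean solve and the correlated noise is the standard trick that keeps this step cheap inside the Gibbs loop.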
5. Theoretical Properties
Under standard high-dimensional conditions (bounded eigenvalues, sparsity of the true precision matrix $\Omega_0$, and a mild curvature assumption), the tempered posterior

$$\pi_{n,\alpha}(\Omega \mid y) \;\propto\; L_n(\Omega)^{\alpha}\, \pi(\Omega), \qquad \alpha \in (0,1),$$

concentrates around the true precision matrix at a rate $\varepsilon_n$ identical to rates obtained in the uncensored graphical model regime. Precisely, for any $\alpha \in (0,1)$,

$$\mathbb{E}\left[\int D_{\alpha}(\Omega, \Omega_0)\, \pi_{n,\alpha}(d\Omega \mid y)\right] \;\lesssim\; \varepsilon_n^2,$$

where $D_{\alpha}$ is the $\alpha$-Rényi divergence between the corresponding Gaussian models. This result extends to arbitrary missingness as well. Proofs leverage Taylor expansions of the likelihood, control over Kullback–Leibler neighborhoods under Horseshoe priors, and general concentration results for tempered posteriors.
6. Empirical Studies and Methodological Comparisons
CGHS has been benchmarked against the penalized censored graphical lasso (cglasso, Augugliaro et al.) in two canonical precision matrix regimes:
- Tridiagonal (chain) structure: nonzero entries confined to the main diagonal and the first off-diagonals; all other off-diagonal entries zero.
- Block structure: a fully connected block of variables, zeros elsewhere.
Studies varied the dimension $p$, sample size $n$, and censoring/missingness rates. Metrics included the squared Frobenius error $\|\hat{\Omega} - \Omega_0\|_F^2$, true positive rate (TPR), and false discovery rate (FDR). CGHS consistently achieved lower estimation error and higher TPR with near-zero FDR, especially as dimensionality or missingness increased. In the chain-graph regime, cglasso frequently failed to recover the target sparsity pattern.
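These metrics can be computed as follows; the thresholding rule for declaring an edge and the small example matrices are illustrative choices, not taken from the study:

```python
import numpy as np

def graph_metrics(Omega_hat, Omega_true, tol=1e-8):
    """Squared Frobenius error, TPR, and FDR for off-diagonal edge recovery."""
    frob2 = np.sum((Omega_hat - Omega_true) ** 2)
    off = ~np.eye(Omega_true.shape[0], dtype=bool)     # off-diagonal mask
    est = np.abs(Omega_hat[off]) > tol                  # estimated edges
    true = np.abs(Omega_true[off]) > tol                # true edges
    tpr = (est & true).sum() / max(true.sum(), 1)
    fdr = (est & ~true).sum() / max(est.sum(), 1)
    return frob2, tpr, fdr

Omega_true = np.array([[1.0, 0.5, 0.0],
                       [0.5, 1.0, 0.5],
                       [0.0, 0.5, 1.0]])   # chain (tridiagonal) graph
Omega_hat = np.array([[1.1, 0.4, 0.0],
                      [0.4, 0.9, 0.6],
                      [0.0, 0.6, 1.0]])
print(graph_metrics(Omega_hat, Omega_true))
```

Note that for a posterior-mean estimate one would first threshold or use credible intervals to declare edges; the hard `tol` cutoff here only keeps the example short.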
7. Implementation Details
- Sampling Algorithms: Truncated-normal values for censored entries are sampled via inverse-CDF methods, with exponential tilting for accuracy in extreme tails.
- Linear Algebra Optimizations: Nodewise regressions employ Cholesky decompositions for computational efficiency.
- Priors: Residual variance hyperparameters ensure weak informativeness.
- Convergence Diagnostics: Empirical traces, autocorrelation, and effective sample size diagnostics show rapid mixing with burn-in periods of approximately 1,000 iterations out of 5,000.
- Computational Complexity: The dominant per-iteration cost arises from the nodewise regression steps.
- Software: The R package GHScenmis (https://github.com/tienmt/ghscenmis) supports both censored (cenGHS_censored) and missing (cenGHS_missing) data, providing practical tools for application.
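The basic inverse-CDF step mentioned under Sampling Algorithms can be sketched as below; plain inversion loses accuracy deep in the tails, which is exactly where the exponential-tilting refinement matters. The function name is illustrative:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)

def rtruncnorm_left(mu, sd, upper, size):
    """Sample N(mu, sd^2) truncated to (-inf, upper] by inversion:
    draw uniforms over the CDF mass below the bound, then apply Phi^{-1}."""
    p_upper = norm.cdf((upper - mu) / sd)   # mass below the truncation point
    u = rng.uniform(0.0, p_upper, size)
    return mu + sd * norm.ppf(u)

x = rtruncnorm_left(mu=0.0, sd=1.0, upper=-2.0, size=1000)
print(x.max())
```

Every draw respects the bound by construction; for bounds many standard deviations into the tail, `p_upper` underflows and a tilted-exponential proposal becomes necessary.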
CGHS thus offers a principled, efficient, and theoretically robust approach for Bayesian graphical model estimation under censoring and missingness, extending shrinkage benefits of the Horseshoe prior to settings where frequentist Lasso-based methods are inadequate (Mai et al., 10 Jan 2026).