
Censored Graphical Horseshoe (CGHS)

Updated 14 January 2026
  • Censored Graphical Horseshoe is a Bayesian method that extends the Graphical Horseshoe to estimate sparse precision matrices in Gaussian graphical models with censored and missing observations.
  • It employs a latent-variable strategy combined with global-local Horseshoe shrinkage to enable efficient posterior inference in complex, high-dimensional data.
  • Empirical studies show that CGHS achieves lower estimation errors, higher true positive rates, and near-zero false discovery rates across varied censoring regimes.

The Censored Graphical Horseshoe (CGHS) is a Bayesian framework for sparse precision matrix estimation in Gaussian graphical models, designed to accommodate data subject to censoring and arbitrary missingness. CGHS generalizes the Graphical Horseshoe (GHS) method, extending its sparse Bayesian regression capabilities to cases where some variables are only partially observed due to detection limits or absences in the measurement process. By introducing a latent variable augmentation scheme and leveraging the adaptive global-local shrinkage properties of the Horseshoe prior, CGHS enables efficient posterior inference even under incomplete data modalities prevalent in biomedical, environmental, and other data-rich scientific domains (Mai et al., 10 Jan 2026).

1. Problem Formulation

CGHS addresses inference for precision matrices $\Omega \succ 0$ in mean-zero Gaussian graphical models, given $n$ i.i.d. samples $Y_i \in \mathbb{R}^p$. In many domains, such as qPCR, environmental assays, and single-cell studies, not all $Y_{ij}$ are fully observed: measurements may be left-censored at thresholds $c_j$ (e.g., detection limits), or may be entirely missing. Formally, each recorded datum $\widetilde{Y}_{ij}$ is defined by

$$\widetilde{Y}_{ij} = \begin{cases} Y_{ij}, & Y_{ij} > c_j, \\ c_j, & Y_{ij} \le c_j, \end{cases}$$

with additional entries possibly missing. Observed, censored, and missing indices for sample $i$ are collected in the sets $\mathcal{O}_i$, $\mathcal{C}_i$, and $\mathcal{M}_i$. The observed-data likelihood for left censoring,

$$L(\Omega; \widetilde{Y}) = \prod_{i=1}^{n} \phi_{|\mathcal{O}_i|}\left( \widetilde{Y}_{i,\mathcal{O}_i};\, 0,\, \Sigma_{\mathcal{O}_i,\mathcal{O}_i} \right) \times \Phi_{|\mathcal{C}_i|}\left( c_{\mathcal{C}_i} \mid \mu_{i,\mathcal{C}_i|\mathcal{O}_i},\, \Sigma_{\mathcal{C}_i|\mathcal{O}_i} \right),$$

comprehensively models the incomplete observation structure. This likelihood recovers the fully-observed model when censoring and missingness are absent.
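The observation model above can be illustrated with a small simulation. The sketch below is a hypothetical example (the thresholds, missingness rate, and chain-graph precision matrix are illustrative choices, not from the paper): it applies left-censoring at detection limits $c_j$ plus random missingness, and recovers the index sets $\mathcal{O}_i$, $\mathcal{C}_i$, $\mathcal{M}_i$ for one sample.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 4

# Tridiagonal (chain-graph) precision matrix, as in the later simulation studies.
Omega = np.eye(p) + 0.3 * (np.eye(p, k=1) + np.eye(p, k=-1))
Y = rng.multivariate_normal(np.zeros(p), np.linalg.inv(Omega), size=n)

c = np.full(p, -0.5)                       # left-censoring thresholds (detection limits)
miss = rng.random((n, p)) < 0.10           # ~10% entries missing completely at random

Y_tilde = np.where(Y > c, Y, c)            # censored entries are recorded at the limit
Y_tilde[miss] = np.nan                     # missing entries are recorded as NA

i = 0                                      # index sets for the first sample
O_i = np.where(~miss[i] & (Y[i] > c))[0]   # fully observed coordinates
C_i = np.where(~miss[i] & (Y[i] <= c))[0]  # censored coordinates
M_i = np.where(miss[i])[0]                 # missing coordinates
```

The three index sets partition $\{1, \dots, p\}$ for each sample, which is exactly the decomposition the observed-data likelihood factorizes over.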

2. Latent-Variable Representation

To facilitate posterior inference under censoring and missingness, CGHS employs a latent-variable strategy, introducing $Z \in \mathbb{R}^{n \times p}$ with $Z_i \sim \mathcal{N}_p(0, \Sigma)$. Observed data arise via deterministic or truncated mappings from $Z_{ij}$:

$$\widetilde{Y}_{ij} = \begin{cases} Z_{ij}, & Z_{ij} > c_j~\text{and observed}, \\ c_j, & Z_{ij} \le c_j, \\ \text{NA}, & Z_{ij}~\text{missing}. \end{cases}$$

The joint density $p(Z, \widetilde{Y} \mid \Omega)$ factors as a product of multivariate normals and indicator functions truncating the censored and missing entries. Posterior computation alternates between imputation (sampling censored and missing $Z_{ij}$) and precision-matrix updates, enabling inference even when entire blocks of the data are unobserved.
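For a censored coordinate, the imputation step draws $Z_{ij}$ from its conditional normal truncated to $(-\infty, c_j]$. A minimal sketch, with hypothetical conditional moments (the values of `mu_cond` and `sd_cond` are placeholders for the conditional mean and standard deviation given the other coordinates), using the inverse-CDF method:

```python
import numpy as np
from scipy.special import ndtr, ndtri  # standard-normal CDF and quantile function

rng = np.random.default_rng(1)

def sample_left_censored(mu, sigma, c, rng):
    """Draw Z ~ N(mu, sigma^2) conditioned on the censoring event Z <= c."""
    # Inverse-CDF: draw U ~ Uniform(0, F((c - mu)/sigma)), map through the quantile.
    u = rng.uniform(0.0, ndtr((c - mu) / sigma))
    return mu + sigma * ndtri(u)

# Hypothetical conditional moments for one censored entry Z_ij | Z_{i,-j}:
mu_cond, sd_cond, c_j = 0.4, 0.8, -0.5
draws = np.array([sample_left_censored(mu_cond, sd_cond, c_j, rng)
                  for _ in range(5000)])
```

Every draw lands below the detection limit, as required by the truncation; missing entries would instead be drawn from the untruncated conditional normal.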

3. Horseshoe Prior and Model Specification

For inducing sparsity, CGHS places a global-local Horseshoe prior on the off-diagonal elements $\omega_{jk}$ $(j < k)$ of $\Omega$:

$$\omega_{jk} \mid \lambda_{jk}, \tau \sim \mathcal{N}(0, \lambda_{jk}^2 \tau^2), \qquad \lambda_{jk} \sim \mathcal{C}^+(0, 1), \qquad \tau \sim \mathcal{C}^+(0, 1).$$

Diagonal elements $\omega_{jj}$ are assigned weakly informative priors or estimated via nodewise residual variances, subject to $\Omega \succ 0$. This prior structure yields adaptive shrinkage, favoring near-zero estimates for non-edges while preserving large signals, and is robust to the inclusion of censored or missing measurements.
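The spike-and-heavy-tail behavior of this hierarchy can be seen by drawing from the marginal prior. The sketch below samples the half-Cauchy scales via the absolute value of standard Cauchy draws (one independent global scale per draw, purely to visualize the marginal; in the model $\tau$ is shared across elements):

```python
import numpy as np

rng = np.random.default_rng(2)

# Marginal prior draws for one off-diagonal element omega_jk:
# lambda ~ C+(0,1), tau ~ C+(0,1), omega | lambda, tau ~ N(0, lambda^2 tau^2).
m = 100_000
lam = np.abs(rng.standard_cauchy(m))    # local half-Cauchy scales
tau = np.abs(rng.standard_cauchy(m))    # global half-Cauchy scale (independent here)
omega = rng.normal(0.0, lam * tau)

# The marginal concentrates mass near zero yet retains very heavy tails,
# which is what lets the prior zero out non-edges without shrinking true signals.
frac_tiny = np.mean(np.abs(omega) < 0.01)
frac_huge = np.mean(np.abs(omega) > 100.0)
```

Both fractions are non-negligible, unlike a Gaussian prior, which would assign essentially no mass beyond a few standard deviations.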

4. Posterior Computation via Gibbs Sampling

CGHS exploits nodewise regression for efficient Gibbs sampling. The reparameterization

$$Z_{\cdot j} \mid Z_{\cdot,-j} \sim \mathcal{N}(Z_{\cdot,-j}\, \theta_j,\ \sigma_j^2 I_n)$$

yields $\Omega_{jj} = 1/\sigma_j^2$ and $\Omega_{-j,j} = -\theta_j / \sigma_j^2$. Sampling proceeds iteratively:

  • Latent $Z$ Imputation: If $\widetilde{Y}_{ij} > c_j$, set $Z_{ij}$ to the observed value; if $\widetilde{Y}_{ij} = c_j$, sample $Z_{ij}$ from a truncated normal; if missing, sample from the full normal conditional.
  • Regression Coefficients $\theta_j$: Sample $\theta_j \mid \cdots \sim \mathcal{N}(\mu_{\theta_j}, \sigma_j^2 V_j)$, with $V_j$ and $\mu_{\theta_j}$ computed from the design matrix and the Horseshoe scales.
  • Residual Variance $\sigma_j^2$: Inverse-Gamma update based on the nodewise residuals.
  • Local/Global Scales $(\lambda_{jk}^2, \nu_{jk}, \tau_j^2, \xi_j)$: Sample via the latent inverse-Gamma parameterization, enabling efficient mixing.
  • Precision Matrix Reconstruction: Aggregate the nodewise estimates, average $\Omega_{jk}$ and $\Omega_{kj}$, and enforce positive-definiteness.

This block Gibbs sampler efficiently explores the posterior, with update complexity scaling as $O(p^3)$ per iteration.
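The reconstruction step relies on the identities $\Omega_{jj} = 1/\sigma_j^2$ and $\Omega_{-j,j} = -\theta_j/\sigma_j^2$. As a sanity-check sketch (not the sampler itself), the code below derives the nodewise quantities from a known precision matrix and confirms the mapping rebuilds it exactly after the symmetrizing average:

```python
import numpy as np

p = 4
Omega_true = np.eye(p) + 0.3 * (np.eye(p, k=1) + np.eye(p, k=-1))

Omega_hat = np.zeros((p, p))
for j in range(p):
    mask = np.arange(p) != j
    sigma2_j = 1.0 / Omega_true[j, j]            # residual variance of node j
    theta_j = -sigma2_j * Omega_true[mask, j]    # regression coefficients of node j
    # Reconstruction identities used by the sampler:
    Omega_hat[j, j] = 1.0 / sigma2_j
    Omega_hat[mask, j] = -theta_j / sigma2_j

Omega_hat = 0.5 * (Omega_hat + Omega_hat.T)      # average Omega_jk and Omega_kj
```

In the sampler, $\theta_j$ and $\sigma_j^2$ are posterior draws rather than exact functions of $\Omega$, so the averaging step resolves the (small) asymmetry between the two nodewise estimates of each $\omega_{jk}$.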

5. Theoretical Properties

Under standard high-dimensional conditions (bounded eigenvalues, sparsity of $\Omega_0$, and a mild curvature assumption), the tempered posterior

$$\pi_{n,\alpha}(\Omega) \propto L(\Omega)^{\alpha}\, \pi_{HS}(\Omega)$$

concentrates around the true precision matrix at rate $\varepsilon_n \asymp n^{-1} s \log p$, identical to the rates obtained in the uncensored graphical model regime. Precisely, for any $\alpha \in (0,1)$,

$$\mathbb{E}\left[ \int D_\alpha(P_{\Omega}, P_{\Omega_0})\ \pi_{n,\alpha}(d\Omega) \right] \leq \frac{1+\alpha}{1-\alpha}\, K\, \frac{s\log p}{n},$$

where $D_\alpha$ is the Rényi divergence. This result extends to arbitrary missingness as well. Proofs leverage Taylor expansions of the likelihood, control over Kullback–Leibler neighborhoods under Horseshoe priors, and general concentration results for tempered posteriors.

6. Empirical Studies and Methodological Comparisons

CGHS has been benchmarked against the penalized censored graphical lasso (cglasso, Augugliaro et al.) in two canonical precision matrix regimes:

  • Tridiagonal (chain) structure: $\omega_{j,j\pm 1} = 0.3$, all other off-diagonals zero.
  • Block structure: a $3 \times 3$ fully connected block, zeros elsewhere.

Studies varied $p \in \{10, 20, 30\}$, sample size $n \in \{200, 500, 1000\}$, and censoring/missingness rates in $\{10\%, 20\%, 30\%\}$. Metrics included the squared Frobenius error $\|\widehat{\Omega} - \Omega_{\rm true}\|_F^2$, the true positive rate (TPR), and the false discovery rate (FDR). CGHS consistently achieved lower estimation error and higher TPR with near-zero FDR, especially as dimensionality or missingness increased. In the chain-graph regime, cglasso frequently failed to recover the target sparsity pattern.
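These three metrics are straightforward to compute from an estimate and the ground truth. A small sketch, applied to a hypothetical estimate that recovers the chain graph but adds one spurious edge (the edge location and magnitude are illustrative):

```python
import numpy as np

def graph_metrics(Omega_hat, Omega_true, tol=1e-8):
    """Squared Frobenius error, TPR, and FDR for off-diagonal edge recovery."""
    p = Omega_true.shape[0]
    iu = np.triu_indices(p, k=1)                  # each edge counted once
    true_edge = np.abs(Omega_true[iu]) > tol
    est_edge = np.abs(Omega_hat[iu]) > tol
    tp = np.sum(est_edge & true_edge)
    fp = np.sum(est_edge & ~true_edge)
    tpr = tp / max(true_edge.sum(), 1)
    fdr = fp / max(est_edge.sum(), 1)
    frob2 = np.sum((Omega_hat - Omega_true) ** 2)
    return frob2, tpr, fdr

p = 10
Omega_true = np.eye(p) + 0.3 * (np.eye(p, k=1) + np.eye(p, k=-1))
Omega_hat = Omega_true + 0.0                      # copy of the truth...
Omega_hat[0, 5] = Omega_hat[5, 0] = 0.05          # ...plus one false discovery
frob2, tpr, fdr = graph_metrics(Omega_hat, Omega_true)
# All 9 chain edges recovered (TPR = 1.0); 1 of 10 reported edges is false (FDR = 0.1).
```

In practice the edge indicator for a Bayesian estimate would come from posterior summaries (e.g., credible intervals excluding zero) rather than a hard threshold on point estimates.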

7. Implementation Details

  • Sampling Algorithms: Truncated-normal values for censored $Z_{ij}$ are sampled via inverse-CDF methods, with exponential tilting for accuracy in the extreme tails.
  • Linear Algebra Optimizations: Nodewise regressions employ Cholesky decompositions for computational efficiency.
  • Priors: Residual-variance hyperparameters $(a_0, b_0) = (10^{-2}, 10^{-2})$ ensure weak informativeness.
  • Convergence Diagnostics: Empirical traces, autocorrelation, and effective sample size diagnostics show rapid mixing with burn-in periods of approximately 1,000 iterations out of 5,000.
  • Computational Complexity: The dominant cost is $O(p^3)$ per iteration, arising from the $p$ nodewise regression steps.
  • Software: The R package GHScenmis (https://github.com/tienmt/ghscenmis) supports both censored (cenGHS_censored) and missing (cenGHS_missing) data, providing practical tools for application.
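The Cholesky-based coefficient update can be sketched as follows. This is a hedged illustration, assuming the standard conjugate form $V_j = (X^\top X + D^{-1})^{-1}$, $\mu_{\theta_j} = V_j X^\top z$ with $D$ holding the Horseshoe scales $\lambda_{jk}^2 \tau_j^2$; the data and scale values are placeholders:

```python
import numpy as np

rng = np.random.default_rng(3)
n, q = 200, 5
X = rng.normal(size=(n, q))            # stand-in for Z_{.,-j} (other columns)
z = rng.normal(size=n)                 # stand-in for Z_{.j} (response column)
sigma2_j = 1.0
d = np.full(q, 0.5)                    # hypothetical lambda_jk^2 * tau_j^2 values

# Posterior precision of theta_j (up to sigma_j^2), factored once per update.
A = X.T @ X + np.diag(1.0 / d)
L = np.linalg.cholesky(A)
# mu = A^{-1} X'z via two triangular solves instead of an explicit inverse.
mu = np.linalg.solve(L.T, np.linalg.solve(L, X.T @ z))
# theta_j = mu + sigma_j * L^{-T} eps has covariance sigma_j^2 * A^{-1}.
eps = rng.normal(size=q)
theta_j = mu + np.sqrt(sigma2_j) * np.linalg.solve(L.T, eps)
```

Factoring $A$ once and reusing the triangular factor for both the mean and the noise term is what keeps the per-node cost at a single $O(q^3)$ decomposition.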

CGHS thus offers a principled, efficient, and theoretically robust approach for Bayesian graphical model estimation under censoring and missingness, extending shrinkage benefits of the Horseshoe prior to settings where frequentist Lasso-based methods are inadequate (Mai et al., 10 Jan 2026).
