Bayesian Pliable Lasso with Horseshoe Prior
- The methodology extends the frequentist pliable lasso by incorporating a horseshoe prior that enforces strong heredity between main-effect and interaction selection.
- It employs a hierarchical global-local shrinkage framework with Gaussian and inverse-gamma priors, facilitating efficient Gibbs sampling and robust posterior inference.
- Empirical results demonstrate improved variable recovery and prediction accuracy compared to traditional lasso methods in high-dimensional regression contexts.
The Bayesian Pliable Lasso with Horseshoe Prior is a hierarchical, global–local shrinkage framework designed to enable sparse estimation and uncertainty quantification for both main and interaction effects in high-dimensional regression and generalized linear models (GLMs). The methodology extends the frequentist pliable lasso, which is noted for its ability to model interactions under strong heredity constraints, by introducing explicit probabilistic modeling of sparsity and effect selection using the horseshoe prior, with extensions to handle missing responses via integrated data augmentation and efficient Gibbs sampling (Mai, 9 Sep 2025).
1. Model Formulation and Motivating Principles
The Bayesian pliable lasso is built to identify a small set of predictors and their interactions with modifying variables while maintaining statistical interpretability and robust uncertainty assessment. The fundamental regression setting supposes observations $(y_i, x_i, z_i)$, $i = 1, \dots, n$, where $y_i$ is the scalar response, $x_i \in \mathbb{R}^p$ are predictors, and $z_i \in \mathbb{R}^K$ are modifiers (e.g., categorical covariates, environmental factors, etc.).
The linear predictor for subject $i$ is given by
$$\eta_i = \beta_0 + z_i^\top \theta_0 + \sum_{j=1}^{p} x_{ij}\left(\beta_j + z_i^\top \theta_j\right).$$
This structure decomposes into main effects ($\beta_j$), modifier-specific intercepts ($\theta_0$), and interaction/heterogeneity effects ($\theta_j$). The model extends naturally to exponential family likelihoods for GLMs.
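Concretely, the linear predictor is a few matrix operations. A minimal numpy sketch (array names and sizes are illustrative, not the `hspliable` interface):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, K = 5, 3, 2                      # observations, predictors, modifiers

X = rng.normal(size=(n, p))            # predictors x_i
Z = rng.normal(size=(n, K))            # modifiers z_i
beta0 = 0.5                            # global intercept
theta0 = rng.normal(size=K)            # modifier-specific intercepts theta_0
beta = np.array([1.0, 0.0, -2.0])      # main effects beta_j
Theta = rng.normal(size=(p, K))        # interaction effects theta_j (rows)

# eta_i = beta0 + z_i' theta0 + sum_j x_ij * (beta_j + z_i' theta_j)
eta = beta0 + Z @ theta0 + np.sum(X * (beta[None, :] + Z @ Theta.T), axis=1)
print(eta.shape)  # (5,)
```

Note that `Z @ Theta.T` collects all $z_i^\top \theta_j$ terms at once, so the full predictor costs two matrix products rather than a double loop.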
The horseshoe prior is placed on both main effects and their corresponding interaction vectors by coupling their shrinkage scales:
- For each predictor $j = 1, \dots, p$:
$$\beta_j \mid \lambda_j, \tau \sim \mathcal{N}(0, \lambda_j^2 \tau^2), \qquad \theta_j \mid \lambda_j, \tau \sim \mathcal{N}(0, \lambda_j^2 \tau^2 I_K), \qquad \lambda_j \sim C^{+}(0, 1).$$
A single global scale parameter $\tau \sim C^{+}(0, 1)$ further regularizes the collective magnitude of all effects.
The strong heredity constraint is enforced by assigning both main and interaction effects the same local scale $\lambda_j$, ensuring that an inactive (i.e., heavily shrunk) main effect automatically suppresses its associated interaction terms.
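The coupling can be seen by sampling from the prior. In this hypothetical sketch each group $(\beta_j, \theta_j)$ shares one half-Cauchy local scale $\lambda_j$, so a tiny $\lambda_j$ shrinks the main effect and its entire interaction vector together:

```python
import numpy as np

rng = np.random.default_rng(1)
p, K, tau = 4, 3, 0.1

# One shared half-Cauchy local scale lambda_j per predictor group
lam = np.abs(rng.standard_cauchy(p))

# Main effect and its interactions use the SAME lambda_j (strong heredity)
beta = rng.normal(scale=tau * lam)                            # beta_j
Theta = rng.normal(scale=(tau * lam)[:, None], size=(p, K))   # theta_j rows

for j in range(p):
    print(f"lam_{j}={lam[j]:.3f}  |beta_{j}|={abs(beta[j]):.3f}  "
          f"||theta_{j}||={np.linalg.norm(Theta[j]):.3f}")
```

Groups with small $\lambda_j$ produce both a near-zero main effect and a near-zero interaction block, which is exactly the heredity behavior the shared scale encodes.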
2. Hierarchical Prior Specification and Heredity
The hierarchical global–local prior, central to the horseshoe approach, possesses several defining characteristics:
- Very strong spike at zero: induces aggressive shrinkage on noise effects and enforces sparsity.
- Extremely heavy tails: allows nonzero effects to "escape" shrinkage and prevents over-suppression of important predictors or interactions.
- Hierarchical coupling: the shared local scale $\lambda_j$ for a group (main effect and its interactions) enforces group shrinkage and naturally encodes the strong heredity principle.
This structure is formalized through the inverse-gamma scale-mixture representation of the half-Cauchy:
$$\lambda_j^2 \mid \nu_j \sim \mathrm{IG}\!\left(\tfrac{1}{2}, \tfrac{1}{\nu_j}\right), \quad \nu_j \sim \mathrm{IG}\!\left(\tfrac{1}{2}, 1\right), \qquad \tau^2 \mid \xi \sim \mathrm{IG}\!\left(\tfrac{1}{2}, \tfrac{1}{\xi}\right), \quad \xi \sim \mathrm{IG}\!\left(\tfrac{1}{2}, 1\right).$$
The inverse-gamma representation allows efficient conjugate Gibbs updates. A corresponding prior is placed on $\theta_0$, the modifying-variable intercept vector, while the global intercept $\beta_0$ is left unpenalized.
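The decomposition can be checked empirically: composing $\nu \sim \mathrm{IG}(1/2, 1)$ with $\lambda^2 \mid \nu \sim \mathrm{IG}(1/2, 1/\nu)$ recovers a half-Cauchy $C^{+}(0,1)$ draw for $\lambda$, whose median is 1. A simulation sketch (not the package's code):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# nu ~ InvGamma(1/2, 1): reciprocal of a Gamma(shape=1/2, rate=1) draw
nu = 1.0 / rng.gamma(shape=0.5, scale=1.0, size=n)
# lambda^2 | nu ~ InvGamma(1/2, 1/nu): Gamma rate 1/nu means numpy scale nu
lam2 = 1.0 / rng.gamma(shape=0.5, scale=nu, size=n)
lam = np.sqrt(lam2)

# Reference draws directly from the half-Cauchy C+(0, 1)
ref = np.abs(rng.standard_cauchy(n))

print(np.median(lam), np.median(ref))  # both close to 1
```

The payoff is that every full conditional in the hierarchy is inverse-gamma, so no rejection or slice steps are needed for the scales.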
3. Posterior Computation and Gibbs Sampling
A blockwise Gibbs sampler is constructed to exploit the conjugacy of the Gaussian likelihood and the normal/inverse-gamma prior components. At each iteration, updates proceed as follows:
- Main effects: $\beta_j \mid \cdot \sim \mathcal{N}(m_j, v_j)$, with $v_j$ and $m_j$ the conditional variance and mean given the other effects, current data residuals, and prior parameters.
- Interaction effects: $\theta_j \mid \cdot \sim \mathcal{N}(m_{\theta_j}, V_{\theta_j})$, with analogous parameter updates reflecting the shared-scale penalty and the design structure.
- Local scale updates: $\lambda_j^2$ and auxiliary variables $\nu_j$ using their inverse-gamma full conditionals.
- Global scale and its auxiliary: $\tau^2$ and $\xi$ via inverse-gamma forms, ensuring adaptation to overall sparsity.
- Missing responses: when $y$ has missing values, these are imputed conditionally given predictors and regression effects, $y_i^{\mathrm{mis}} \sim \mathcal{N}(\eta_i, \sigma^2)$ (in the Gaussian case) at each iteration.
- Intercepts and error variance: sampled from their conjugate distributions.
This blockwise approach maintains computational tractability even for high-dimensional main and modifier spaces (Mai, 9 Sep 2025).
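To make the update cycle concrete, here is a simplified sketch of a Makalic–Schmidt-style Gibbs sampler for a plain horseshoe regression (main effects only; the pliable version additionally shares each $\lambda_j$ with the interaction block $\theta_j$, and this sketch uses the parameterization in which $\sigma^2$ also scales the prior). All names and the toy data are assumptions, not the `hspliable` implementation:

```python
import numpy as np

rng = np.random.default_rng(3)

def ig(shape, scale):
    """Draw InvGamma(shape, scale) as the reciprocal of a Gamma draw."""
    return 1.0 / rng.gamma(shape, 1.0 / scale)

# Simulated sparse data (illustration only)
n, p = 100, 10
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:2] = [3.0, -3.0]
y = X @ beta_true + rng.normal(size=n)

# State: coefficients, error variance, global/local scales, auxiliaries
beta, sig2, tau2, xi = np.zeros(p), 1.0, 1.0, 1.0
lam2, nu = np.ones(p), np.ones(p)
draws = []

for it in range(1000):
    # beta | rest: Gaussian block, prior beta_j ~ N(0, sig2*tau2*lam2_j)
    A = X.T @ X + np.diag(1.0 / (tau2 * lam2))
    L = np.linalg.cholesky(A)
    mean = np.linalg.solve(A, X.T @ y)
    beta = mean + np.sqrt(sig2) * np.linalg.solve(L.T, rng.normal(size=p))
    # Local scales lam2_j and auxiliaries nu_j: conjugate IG updates
    lam2 = ig(1.0, 1.0 / nu + beta**2 / (2.0 * tau2 * sig2))
    nu = ig(1.0, 1.0 + 1.0 / lam2)
    # Global scale tau2 and its auxiliary xi
    tau2 = ig((p + 1) / 2.0, 1.0 / xi + np.sum(beta**2 / lam2) / (2.0 * sig2))
    xi = ig(1.0, 1.0 + 1.0 / tau2)
    # Error variance: conjugate inverse-gamma update
    resid = y - X @ beta
    sig2 = ig((n + p) / 2.0,
               resid @ resid / 2.0 + np.sum(beta**2 / (tau2 * lam2)) / 2.0)
    if it >= 500:                       # keep draws after burn-in
        draws.append(beta.copy())

post_mean = np.mean(draws, axis=0)
print(np.round(post_mean, 2))
```

On this toy problem the posterior mean recovers the two large coefficients while shrinking the eight null coefficients toward zero, illustrating the spike-and-heavy-tail behavior described above.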
4. Extension to Generalized Linear Models and Missing Data
The model can be generalized for arbitrary exponential family outcomes. For each $i = 1, \dots, n$:
$$y_i \mid \eta_i \sim p(y_i \mid \eta_i) \propto \exp\!\left\{ \frac{y_i \eta_i - b(\eta_i)}{\phi} \right\},$$
with the linear predictor as defined above. The Gibbs sampler is modified so that, where possible, Pólya–Gamma data augmentation or other latent variable approaches are used to maintain conjugacy (as in logistic regression), or Metropolis–Hastings updates are used otherwise.
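For logistic outcomes, the Pólya–Gamma trick introduces latents $\omega_i \sim \mathrm{PG}(1, \eta_i)$ so that the conditional for the regression coefficients is again Gaussian. The sketch below draws approximate $\mathrm{PG}(1, c)$ variates via the truncated infinite convolution of gammas from Polson, Scott and Windle's representation; the truncation level `K` is an assumption, and production samplers use exact (Devroye-type) methods instead:

```python
import numpy as np

rng = np.random.default_rng(4)

def polya_gamma(c, size, K=200):
    """Approximate PG(1, c) draws: truncated sum of scaled Gamma(1, 1)
    variates, omega = (1/(2*pi^2)) * sum_k g_k / ((k-1/2)^2 + c^2/(4*pi^2))."""
    k = np.arange(1, K + 1)
    g = rng.gamma(1.0, 1.0, size=(size, K))              # g_k ~ Gamma(1, 1)
    denom = (k - 0.5) ** 2 + c ** 2 / (4.0 * np.pi ** 2)
    return (g / denom).sum(axis=1) / (2.0 * np.pi ** 2)

# Sanity check against the known mean E[PG(1, c)] = tanh(c/2) / (2c)
c = 2.0
omega = polya_gamma(c, size=50_000)
print(omega.mean(), np.tanh(c / 2.0) / (2.0 * c))
```

Given such $\omega_i$, the logistic likelihood contributes a Gaussian term with working response $(y_i - 1/2)/\omega_i$, which is what restores conjugacy for the coefficient block.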
When response data contain missingness, integrative data augmentation is used: missing $y_i$ are imputed inside the MCMC, using either their full data likelihood (when possible) or conditional predictive draws. This approach naturally yields posterior inference under the observed-data likelihood.
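In the Gaussian case the imputation step reduces to one conditional normal draw per missing entry within each sweep. A schematic sketch (the mask and values are invented):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 8
eta = rng.normal(size=n)               # current linear predictor values
sig2 = 0.5                             # current error variance draw
y = rng.normal(eta, np.sqrt(sig2))     # fully observed responses, then...
miss = np.array([False, True, False, False, True, False, False, False])
y = np.where(miss, np.nan, y)          # ...mask two entries as missing

# Data-augmentation step: draw each missing response from its conditional,
# y_i^mis | rest ~ N(eta_i, sig2), then continue with the complete vector
y_aug = y.copy()
y_aug[miss] = rng.normal(eta[miss], np.sqrt(sig2))
print(np.isnan(y_aug).any())  # False
```

Because the imputed values are redrawn at every iteration, downstream posterior summaries automatically average over the missing-data uncertainty.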
5. Theoretical Properties and Practical Implications
The global–local horseshoe prior structure is supported by a substantial theoretical foundation:
- The posterior under the horseshoe prior contracts at a near-minimax rate, adaptively, in ultra-sparse and high-dimensional regimes (Pas et al., 2017), and the MMLE or hierarchical-Bayes construction on $\tau$ yields adaptive contraction without requiring prior knowledge of the sparsity level.
- Aggressive shrinkage at zero ensures the exclusion of irrelevant predictors and interactions, with the heavy-tailed prior safeguarding against shrinkage of large signals.
- The conjugate hierarchical structure admits direct computation of the marginal likelihood for model selection or hyperparameter tuning (e.g., via Chib's algorithm) (Makalic et al., 2015).
- The approach enables interpretable, hierarchically constrained interaction modeling, uncertainty quantification through credible intervals, and full Bayesian inference for both coefficient sets and derived functionals.
In simulation and real-world studies (e.g., neuroimaging and clinical data), the Bayesian pliable horseshoe consistently outperforms standard lasso, frequentist pliable lasso, and classical horseshoe models in variable selection, recovery of true main and interaction structure, and prediction error. Notably, when modifying variables are binary or interaction complexity is high, the hereditary shrinkage mechanism brings substantial advantages for both estimation accuracy and parsimony (Mai, 9 Sep 2025).
6. Computational and Implementation Aspects
The model and inference procedure are implemented in the hspliable R package, which relies on Rcpp and RcppArmadillo for efficient matrix operations and large-scale computation. All essential posterior components are updated in blocks using conjugacy and efficient linear algebra, and the package supports both complete and missing outcome data. Simulation studies and real data examples illustrate the scalability and interpretability of the method, with point estimates and credible intervals highlighting the model's capacity for meaningful uncertainty quantification and interaction recovery.
7. Contextualization within the Shrinkage and Sparse Estimation Literature
The Bayesian pliable lasso with horseshoe prior is positioned as a probabilistic generalization of lasso-type and global–local shrinkage methods:
- The horseshoe prior is shown to possess regular variation and polynomial tails, unlike the exponentially tailed Laplace (lasso) prior; these heavy tails reduce bias and improve large-signal recovery in sparse settings (Bhadra et al., 2015, Bhadra et al., 2017).
- Variants such as horseshoe+ or deeper product mixtures may yield even tighter concentration and lower MSE in ultra-sparse applications (Bhadra et al., 2015).
- Compared to point-mass mixture priors, the horseshoe and its generalizations achieve strong sparsity and computational feasibility in high dimensions without explicit variable selection indicators.
- The hierarchical coupling of group shrinkage (heredity constraints) and local adaptivity could, in principle, be extended to even more flexible structures such as groupings, latent hierarchical layers, or graph-based penalties, as is done in some contemporary shrinkage frameworks.
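The tail contrast with the Laplace prior has a classic visual counterpart: in the normal-means setting with unit scales, the horseshoe shrinkage factor $\kappa = 1/(1+\lambda^2)$ follows a $\mathrm{Beta}(1/2, 1/2)$ law, piling mass near 0 (signals kept) and near 1 (noise removed). A quick empirical sketch:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 100_000
lam = np.abs(rng.standard_cauchy(n))      # lambda ~ C+(0, 1)
kappa = 1.0 / (1.0 + lam ** 2)            # shrinkage factor in [0, 1]

# kappa ~ Beta(1/2, 1/2): U-shaped ("horseshoe") histogram with mean 1/2,
# so edge bins dominate the middle bins
hist, _ = np.histogram(kappa, bins=10, range=(0.0, 1.0))
print(kappa.mean(), hist[0] > hist[4], hist[9] > hist[4])
```

This U-shape is precisely the "selection-like" behavior that lets the horseshoe mimic point-mass mixtures without explicit inclusion indicators.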
A plausible implication is that the modeling framework of the Bayesian pliable lasso with horseshoe prior can serve as a prototype for generalized structured sparsity and interaction modeling—inheriting the strong theoretical guarantees, efficient posterior computation, and interpretability arising from the global–local shrinkage architecture.
In summary, the Bayesian Pliable Lasso with Horseshoe Prior provides a comprehensive, theoretically justified approach to sparse effect and interaction selection in regression and GLMs, with fully probabilistic treatment of heredity constraints, scalable Gibbs sampling via conjugate hierarchical modeling, and demonstrable improvement over classical and regularized alternatives (Mai, 9 Sep 2025).