Hierarchical Bayesian Extension
- Hierarchical Bayesian extension is a framework that generalizes classical Bayesian models by incorporating multilevel dependency structures to model correlated proportions.
- It employs reduced parametrization and copula-based approximations to efficiently capture first and second order statistics in high-dimensional, overdispersed or underdispersed settings.
- The approach enables construction of multivariate credible regions and extends to the Generalised Score Distribution, facilitating practical inference with finite-support discrete data.
A hierarchical Bayesian extension generalizes classical Bayesian models to incorporate hierarchical or multilevel dependency structures in the underlying parameters. In the context of discrete support and binomial-type data, such extensions enable the modeling of correlated proportions, non-exchangeable groupings, and flexible mean-variance relations. Key frameworks include multivariate and hierarchical beta-binomial models, as well as their mean-variance and copula-based parametrizations. Recent work extends these constructions to both the overdispersed and underdispersed regimes and leverages reduced parametrization and copula theory to enable computation in high dimensions.
1. Foundations: Classical and Multivariate Beta-Binomial Frameworks
The starting point for hierarchical Bayesian modeling of proportions is the beta-binomial model. For binary trials, the probability of success is treated as a latent variable with a beta prior:

$$p \sim \mathrm{Beta}(\alpha, \beta), \qquad X \mid p \sim \mathrm{Binomial}(n, p).$$

The marginal likelihood and posterior are given by:

$$P(X = x) = \binom{n}{x}\,\frac{B(x + \alpha,\, n - x + \beta)}{B(\alpha, \beta)}, \qquad p \mid X = x \sim \mathrm{Beta}(\alpha + x,\, \beta + n - x),$$

where $B(\cdot,\cdot)$ denotes the beta function.
This model captures overdispersion relative to the binomial: with $\pi = \alpha/(\alpha+\beta)$, the beta-binomial variance $n\pi(1-\pi)\,\frac{\alpha+\beta+n}{\alpha+\beta+1}$ always exceeds the binomial variance $n\pi(1-\pi)$ unless the precision parameter $\alpha+\beta$ diverges (Westphal, 2019, Ćmiel et al., 2022, Cheraghchi, 2017).
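As a minimal numerical sketch of this building block (assuming the standard Beta–Binomial setup above; parameter values are purely illustrative), the conjugate update and the overdispersion check can be written as:

```python
# Minimal sketch of the classical beta-binomial building block
# (Beta(alpha, beta) prior on the success probability p; values illustrative).
import numpy as np
from scipy import stats

alpha, beta, n = 2.0, 3.0, 20          # illustrative prior and trial count
x = 7                                   # observed number of successes

# Conjugate posterior: Beta(alpha + x, beta + n - x)
posterior = stats.beta(alpha + x, beta + n - x)
print("posterior mean:", posterior.mean())

# Overdispersion relative to the binomial with the same mean:
pi = alpha / (alpha + beta)
var_binomial = n * pi * (1 - pi)
var_betabinom = stats.betabinom(n, alpha, beta).var()
print(var_betabinom > var_binomial)    # True unless alpha + beta -> infinity
```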
The multivariate extension maps the joint distribution of $m$ Bernoulli variables to their marginal success probabilities via a linear transformation of a $2^m$-dimensional Dirichlet vector:

$$\vartheta = H p, \qquad p \sim \mathrm{Dirichlet}(\gamma).$$

Here, the binary matrix $H$ encodes the mapping from latent joint outcomes (cells) to Bernoulli marginals, inducing an $m$-dimensional marginal distribution over $[0,1]^m$ (Westphal, 2019).
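A small simulation sketch of this construction follows; the names `H` and `gamma` are illustrative choices rather than the cited paper's notation, and the asymmetric Dirichlet prior is used only to make the induced dependence visible.

```python
# Sketch of the multivariate construction: a 2^m-dimensional Dirichlet on the
# joint cell probabilities, mapped to m Bernoulli success probabilities by a
# binary matrix H (names and prior values are illustrative).
import itertools
import numpy as np

rng = np.random.default_rng(0)
m = 3
cells = np.array(list(itertools.product([0, 1], repeat=m)))   # 2^m joint outcomes
H = cells.T                                                    # shape (m, 2^m)

gamma = np.arange(1.0, 2 ** m + 1)             # asymmetric Dirichlet prior (illustrative)
p = rng.dirichlet(gamma, size=10_000)          # samples of joint cell probabilities
theta = p @ H.T                                # m marginal success probabilities per draw

print(theta.mean(axis=0))                      # prior means of the marginals
print(np.corrcoef(theta, rowvar=False))        # dependence induced by the shared cells
```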
2. Hierarchical Construction and Reduced Parametrization
As the full Dirichlet representation is computationally infeasible for large $m$, hierarchical Bayesian extensions favor reduced parametrizations that encode the structure through first and second moments. Specifically, the model can be described by the total mass (prior concentration) $\nu$ and a matrix $\Theta$ of first- and second-order sufficient statistics, with diagonal entries $\theta_i$ (marginal success probabilities) and off-diagonal entries $\theta_{ij}$ (joint success probabilities):
- Mean: $\mathbb{E}[\vartheta_i] = \theta_i$
- Covariance: $\mathrm{Cov}(\vartheta_i, \vartheta_j) = \dfrac{\theta_{ij} - \theta_i \theta_j}{\nu + 1}$

The posterior under multinomial sampling and a Dirichlet prior, using observed cell counts, has straightforward conjugate updates: with $N$ observations and joint success counts $S_{ij}$, the parameters update as $\nu \mapsto \nu + N$ and $\Theta \mapsto (\nu \Theta + S)/(\nu + N)$.
This approach allows joint shrinkage estimation of the mean vector and covariance structure without the full $2^m$-dimensional parameterization (Westphal, 2019).
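The conjugate moment update can be sketched as follows, under the assumption that the reduced parametrization is carried by the total mass $\nu$ and a moment matrix $\Theta$ with marginal probabilities on the diagonal and joint success probabilities off the diagonal (notation chosen here for illustration):

```python
# Sketch of the reduced-parametrization update: prior (nu, Theta) with
# Theta[i, i] the marginal and Theta[i, j] the joint success probabilities.
# Assumed here: the update follows from Dirichlet conjugacy, with the posterior
# moments a precision-weighted average of prior and empirical moments.
import numpy as np

def update_reduced(nu, Theta, Y):
    """Y: (N, m) binary outcome matrix; returns posterior (nu, Theta)."""
    N = Y.shape[0]
    S = Y.T @ Y                      # S[i, j] = #{observations with Y_i = Y_j = 1}
    Theta_post = (nu * Theta + S) / (nu + N)
    return nu + N, Theta_post

rng = np.random.default_rng(1)
m, nu = 3, 4.0
Theta = np.full((m, m), 0.25) + 0.25 * np.eye(m)   # prior: theta_i = 0.5, theta_ij = 0.25
Y = rng.integers(0, 2, size=(50, m))               # toy binary data
nu_post, Theta_post = update_reduced(nu, Theta, Y)
print(nu_post, np.diag(Theta_post))                # posterior marginal means
```

The posterior moments are a precision-weighted average of prior and empirical moments, which is the shrinkage behaviour noted above.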
3. Copula and Gaussian Approximations
Exact joint posteriors for $\vartheta$ are generally unavailable in closed form. Hierarchical models address this using a Gaussian copula approximation, retaining exact marginal Beta posteriors and capturing dependencies through the posterior correlation matrix $R$:

$$f(\vartheta) \approx c_R\bigl(F_1(\vartheta_1), \ldots, F_m(\vartheta_m)\bigr)\,\prod_{j=1}^{m} f_j(\vartheta_j).$$

Here, $F_j$ and $f_j$ are the Beta CDF and density for the $j$th margin, and $c_R$ is the Gaussian copula density with correlation matrix $R$. For computational efficiency in very high dimensions, a normal approximation may be used, but boundary-respecting copula regions are typically preferred, especially for small sample sizes or parameters near the boundary of the unit cube (Westphal, 2019).
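A sampling sketch of the Gaussian-copula approximation follows; the Beta posterior parameters and the correlation matrix `R` below are illustrative inputs, not estimates from a fitted model.

```python
# Sketch of the Gaussian-copula approximation to the joint posterior: exact
# Beta marginals F_j, dependence captured by a correlation matrix R
# (all numerical values are illustrative).
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
a = np.array([8.0, 5.0, 12.0])          # Beta posterior parameters per margin
b = np.array([4.0, 7.0, 3.0])
R = np.array([[1.0, 0.4, 0.2],
              [0.4, 1.0, 0.3],
              [0.2, 0.3, 1.0]])

# Sample: correlated normals -> uniforms via the normal CDF -> Beta quantiles
z = rng.multivariate_normal(np.zeros(3), R, size=20_000)
u = stats.norm.cdf(z)
theta = stats.beta.ppf(u, a, b)          # joint draws with exact Beta marginals

print(theta.mean(axis=0))                # approximately a / (a + b)
print(np.corrcoef(theta, rowvar=False))  # dependence induced by R
```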
4. Multivariate Credible Regions and Coverage Properties
Hierarchical Bayesian extensions facilitate the construction of multivariate credible regions for simultaneous inference. Approaches include:
- Full model (via Dirichlet sampling): empirical quantiles of $\vartheta = Hp$ over posterior Dirichlet samples
- Copula-based: hyperrectangular regions formed from Beta marginal quantiles, with the common quantile level calibrated through the copula dependence so that joint coverage equals the nominal level (see the sketch below)
- Normal approximation: hyperrectangles based on the posterior mean and covariance under a multivariate normal approximation
Empirical results indicate the copula and full model approaches yield Bayes coverage close to target levels, outperforming normal approximations in small-sample or boundary scenarios (Westphal, 2019).
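The calibration of a hyperrectangular simultaneous credible region can be sketched generically from joint posterior samples; the grid search over a common marginal quantile level below is an illustrative Monte Carlo device, not the specific algorithm of the cited work.

```python
# Sketch: calibrate a hyperrectangular simultaneous credible region from joint
# posterior draws (e.g. Dirichlet or copula samples). The common marginal level
# t is tuned so that empirical joint coverage reaches 1 - alpha.
import numpy as np

def simultaneous_rectangle(samples, alpha=0.05, grid=200):
    """samples: (S, m) joint posterior draws; returns (lower, upper) bounds."""
    m = samples.shape[1]
    best = alpha / m                                  # Bonferroni-style fallback
    for t in np.linspace(alpha / m, alpha, grid):
        lo = np.quantile(samples, t / 2, axis=0)
        hi = np.quantile(samples, 1 - t / 2, axis=0)
        coverage = np.mean(np.all((samples >= lo) & (samples <= hi), axis=1))
        if coverage >= 1 - alpha:
            best = t                                  # keep the largest level that still covers
    lo = np.quantile(samples, best / 2, axis=0)
    hi = np.quantile(samples, 1 - best / 2, axis=0)
    return lo, hi

rng = np.random.default_rng(3)
draws = rng.dirichlet(np.ones(4), size=10_000)[:, :3]   # toy correlated proportions
lower, upper = simultaneous_rectangle(draws)
print(np.vstack([lower, upper]))
```

Because empirical coverage decreases monotonically in the common level $t$, the largest $t$ that still attains the nominal coverage yields the tightest hyperrectangle.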
5. Extensions Beyond Overdispersion: The Generalised Score Distribution
Classical Bayesian hierarchical models built on the beta-binomial component are limited to overdispersed regimes relative to the binomial. The Generalised Score Distribution (GSD) extends this by providing a two-parameter family over the finite support $\{1, \ldots, M\}$, parametrized by its mean and a dispersion parameter.
For fixed mean $\psi$, the dispersion parameter interpolates the variance from the minimal value $V_{\min}(\psi)$ (attained by a two-point distribution on adjacent support points) to the maximal value $V_{\max}(\psi)$ (attained by a two-point distribution on the extremes), covering the entire feasible variance interval $[V_{\min}(\psi), V_{\max}(\psi)]$. In its overdispersed range, the GSD matches a reparametrized beta-binomial; past that point, it smoothly continues into the underdispersed regime, which is unattainable by classical beta-binomial models. Estimation is feasible via method of moments or maximum likelihood (Ćmiel et al., 2022).
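A sketch of the feasible variance interval and a moment-matched dispersion estimate is given below; the linear interpolation between $V_{\min}$ and $V_{\max}$, and the orientation of the dispersion parameter, are assumptions of this illustration rather than the GSD's exact parametrization.

```python
# Sketch of the feasible variance interval on support {1, ..., M} and a
# moment-matched dispersion estimate. The linear interpolation
# Var = rho * V_min + (1 - rho) * V_max is an assumption of this sketch,
# used only to show how a dispersion parameter can sweep the feasible range.
import numpy as np

def variance_bounds(psi, M):
    """Feasible variance interval for a distribution on {1,...,M} with mean psi."""
    v_max = (psi - 1) * (M - psi)                          # two-point mass on {1, M}
    v_min = (np.ceil(psi) - psi) * (psi - np.floor(psi))   # adjacent two-point mass
    return v_min, v_max

def moment_match(scores, M):
    """Method-of-moments estimate (mean, dispersion in [0, 1]) from observed scores."""
    psi_hat, v_hat = np.mean(scores), np.var(scores)
    v_min, v_max = variance_bounds(psi_hat, M)
    rho_hat = (v_max - v_hat) / (v_max - v_min) if v_max > v_min else 1.0
    return psi_hat, float(np.clip(rho_hat, 0.0, 1.0))

scores = np.array([3, 4, 4, 5, 3, 4, 2, 4, 5, 4])          # e.g. 5-point rating data
print(moment_match(scores, M=5))
```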
6. Entropy and Analytic Properties
The entropy of hierarchical Bayesian extensions with beta-binomial marginals and their generalizations can be expressed both as series and as integrals. For the beta-binomial law with parameters $(n, \alpha, \beta)$, the entropy admits a series expansion whose coefficients are “difference-coefficient” sequences related to the Riemann zeta and Hurwitz zeta functions; an equivalent integral representation is available involving the Gaussian hypergeometric function and the Lerch transcendent (Cheraghchi, 2017). These analytic expressions facilitate precise computation of information-theoretic quantities in hierarchical models.
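Where the closed-form expressions are not needed, the entropy can be checked numerically by direct summation over the finite support, as in this sketch (SciPy's `betabinom` supplies the pmf; the closed-form series of the cited work is not reproduced here):

```python
# Numerical check of the beta-binomial entropy by direct summation over its
# finite support (Shannon entropy in nats).
import numpy as np
from scipy import stats

def betabinom_entropy(n, a, b):
    """Entropy of BetaBinomial(n, a, b) by direct summation."""
    k = np.arange(n + 1)
    p = stats.betabinom.pmf(k, n, a, b)
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

print(betabinom_entropy(20, 2.0, 3.0))
```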
7. Bernoulli-Sum Interpretation and Practical Inference
Every GSD, including both the extended beta-binomial and its underdispersed continuation, admits a representation as a sum of dichotomous random variables (potentially dependent), generalizing the de Finetti (Beta-mixing) representation. In the overdispersed regime, the classical beta-binomial structure reappears, while in the underdispersed regime the law transitions to mixtures incorporating two-point and binomial components. Practical inference employs moment matching, maximum likelihood, and boundary-respecting copula regions to handle high dimensionality and support computational tractability (Ćmiel et al., 2022, Westphal, 2019).
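The Beta-mixing (de Finetti) representation in the overdispersed case can be verified by simulation, as sketched below; the underdispersed continuation, which requires dependent summands, is not reproduced here.

```python
# Sketch of the de Finetti (Beta-mixing) representation in the overdispersed
# case: a beta-binomial draw equals a sum of conditionally i.i.d. Bernoulli
# variables given p ~ Beta(alpha, beta).
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
alpha, beta, n, S = 2.0, 3.0, 10, 100_000

p = rng.beta(alpha, beta, size=S)                  # latent success probability
x_mixture = rng.binomial(n, p)                     # sum of n Bernoulli(p) draws
x_direct = stats.betabinom.rvs(n, alpha, beta, size=S, random_state=rng)

# The two sampling routes agree in distribution (compare first two moments)
print(x_mixture.mean(), x_direct.mean())
print(x_mixture.var(), x_direct.var())
```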
Summary Table: Hierarchical Bayesian Extensions for Discrete Data
| Feature | Standard Beta-Binomial | Multivariate/Hierarchical Bayesian Extension | Generalised Score Distribution (GSD) |
|---|---|---|---|
| Dispersion Control | Overdispersion | Overdispersion, multivariate dependence | Over- and underdispersion |
| Parametrization | $(\alpha, \beta)$ or mean/precision form | Dirichlet ($2^m$ parameters) or reduced ($\nu$ plus moment matrix) | Mean and dispersion parameter |
| Computational Feasibility | High | Moderate (low $m$); reduced parametrization and copula for high $m$ | High |
| Entropy, Info-Theoretic Props | Series/integral formula | Copula and Dirichlet-based | Admits Bernoulli-sum representation |
| Simultaneous Inference | Marginal | Multivariate credible regions (copula, normal, full) | General mean-variance coverage |
Hierarchical Bayesian extensions thus provide a rigorous basis for the joint modeling of correlated proportions and support the construction of credible regions, flexible mean-variance interpolation, and tractable computation in high dimensions. Recent developments, including the GSD, address both overdispersed and underdispersed settings, expand the representational range, and offer robust inference tools for practical applications involving finite-support discrete data (Westphal, 2019, Ćmiel et al., 2022, Cheraghchi, 2017).