
Hierarchical Bayesian Extension

Updated 18 November 2025
  • Hierarchical Bayesian extension is a framework that generalizes classical Bayesian models by incorporating multilevel dependency structures to model correlated proportions.
  • It employs reduced parametrization and copula-based approximations to efficiently capture first and second order statistics in high-dimensional, overdispersed or underdispersed settings.
  • The approach enables construction of multivariate credible regions and extends to the Generalised Score Distribution, facilitating practical inference with finite-support discrete data.

A hierarchical Bayesian extension generalizes classical Bayesian models to incorporate hierarchical or multilevel dependency structures in the underlying parameters. In the context of discrete support and binomial-type data, such extensions enable the modeling of correlated proportions, non-exchangeable groupings, and flexible mean-variance relations. Key frameworks include multivariate and hierarchical beta-binomial models, as well as their mean-variance and copula-based parametrizations. Recent work extends these constructions to both the overdispersed and underdispersed regimes and leverages reduced parametrization and copula theory to enable computation in high dimensions.

1. Foundations: Classical and Multivariate Beta-Binomial Frameworks

The starting point for hierarchical Bayesian modeling of proportions is the beta-binomial model. For $n$ binary trials, the probability of success $\vartheta$ is treated as a latent variable with a beta prior:

$$\vartheta \sim \mathrm{Beta}(\alpha, \beta), \qquad X_i \mid \vartheta \sim \mathrm{Bernoulli}(\vartheta)$$

The marginal likelihood and posterior are given by:

$$P(Y = y \mid n, \alpha, \beta) = \binom{n}{y} \frac{B(\alpha + y,\, \beta + n - y)}{B(\alpha, \beta)}$$

$$\vartheta \mid Y = y \sim \mathrm{Beta}(\alpha + y,\, \beta + n - y)$$

This model captures overdispersion relative to the binomial: the variance always exceeds $n\mu(1-\mu)$ unless the precision parameter $\alpha+\beta$ diverges (Westphal, 2019, Ćmiel et al., 2022, Cheraghchi, 2017).
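
The marginal pmf, conjugate update, and overdispersion property above can be sketched numerically; the values of $n$, $\alpha$, $\beta$ below are illustrative, not taken from any of the cited papers:

```python
# Minimal sketch of the beta-binomial marginal and its overdispersion,
# using scipy.stats.  n, a, b are illustrative parameter choices.
from scipy.stats import betabinom

n, a, b = 20, 2.0, 3.0
mu = a / (a + b)                       # E[theta] = alpha / (alpha + beta)

# Marginal P(Y = y | n, alpha, beta): the beta-binomial pmf.
pmf = [betabinom.pmf(y, n, a, b) for y in range(n + 1)]
assert abs(sum(pmf) - 1.0) < 1e-9

# The beta-binomial variance exceeds the binomial variance n*mu*(1-mu)
# for any finite precision a + b (overdispersion).
var_bb = betabinom.var(n, a, b)
var_bin = n * mu * (1 - mu)
print(var_bb, var_bin)
```

The conjugate posterior after observing $y$ successes is simply `Beta(a + y, b + n - y)`, so no numerical integration is ever needed for the marginals.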

The multivariate extension maps the joint distribution of $m$ Bernoulli variables to a marginal space via a linear transformation from a $2^m$-dimensional Dirichlet:

$$\mathbf{p} \sim \mathrm{Dirichlet}(\gamma_1, \dots, \gamma_{2^m}), \qquad \bm{\vartheta} = H \mathbf{p}$$

Here, $H$ encodes the mapping from latent categories to Bernoulli marginals, inducing an $m$-dimensional marginal distribution over $(0,1)^m$ (Westphal, 2019).
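
A small sketch of this construction for $m = 2$, with an illustrative concentration vector $\gamma$; the indicator structure of $H$ (entry $H_{jk} = 1$ iff variable $j$ equals 1 in joint outcome $k$) follows directly from $\vartheta_j$ being the marginal success probability:

```python
# Sketch: Dirichlet over the 2^m joint Bernoulli outcomes, mapped to
# marginal success probabilities by theta = H p.  gamma is illustrative.
import itertools
import numpy as np

m = 2
cells = list(itertools.product([0, 1], repeat=m))   # the 2^m joint outcomes
# H[j, k] = 1 iff variable j equals 1 in joint outcome k.
H = np.array([[cell[j] for cell in cells] for j in range(m)], dtype=float)

rng = np.random.default_rng(0)
gamma = np.array([1.0, 2.0, 3.0, 4.0])
p = rng.dirichlet(gamma, size=5000)                 # draws on the simplex
theta = p @ H.T                                     # induced marginals in (0,1)^m

# Each E[theta_j] equals (H gamma)_j / sum(gamma).
print(theta.mean(axis=0), H @ gamma / gamma.sum())
```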

2. Hierarchical Construction and Reduced Parametrization

As the full Dirichlet representation is computationally infeasible for large $m$, hierarchical Bayesian extensions favor reduced parametrizations that encode the structure through first and second moments. Specifically, the model can be described by the total mass $\nu = \sum_k \gamma_k$ and the matrix $A = H\,\mathrm{diag}(\gamma)\,H^\mathsf{T}$, which encode the first- and second-order sufficient statistics:

  • Mean: $E[\vartheta_j] = \alpha_j / \nu$, where $\alpha = H\gamma$
  • Covariance: $\mathrm{Cov}(\vartheta_j, \vartheta_{j'}) = \dfrac{\nu A_{jj'} - \alpha_j \alpha_{j'}}{\nu^2(\nu+1)}$
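
These moment formulas can be verified against the exact Dirichlet covariance pushed through the linear map $\bm{\vartheta} = H\mathbf{p}$; the values of $\gamma$ below are illustrative:

```python
# Sketch: recover the reduced-parametrization moments from nu and
# A = H diag(gamma) H^T, and check them against the exact Dirichlet
# covariance transformed by theta = H p.
import itertools
import numpy as np

m = 2
cells = list(itertools.product([0, 1], repeat=m))
H = np.array([[c[j] for c in cells] for j in range(m)], dtype=float)
gamma = np.array([1.0, 2.0, 3.0, 4.0])

nu = gamma.sum()
alpha = H @ gamma                          # alpha_j = (H gamma)_j
A = H @ np.diag(gamma) @ H.T

mean = alpha / nu                          # E[theta_j]
cov = (nu * A - np.outer(alpha, alpha)) / (nu**2 * (nu + 1))

# Exact Dirichlet covariance of p, pushed through the linear map:
cov_p = (np.diag(gamma) * nu - np.outer(gamma, gamma)) / (nu**2 * (nu + 1))
print(np.allclose(cov, H @ cov_p @ H.T))   # the two expressions agree
```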

The posterior under multinomial sampling with a Dirichlet prior, given observed cell counts $\mathbf{d}$, has straightforward updates:

$$\nu^* = \nu + n, \qquad A^* = A + H\,\mathrm{diag}(\mathbf{d})\,H^\mathsf{T}$$

This approach allows joint shrinkage estimation of the mean vector and covariance structure without the full $2^m$ parametrization (Westphal, 2019).
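
The conjugate update can be written in a few lines; the prior $\gamma$ and the cell counts $\mathbf{d}$ below are hypothetical, and the final check confirms that updating $(\nu, A)$ matches updating the full Dirichlet $\gamma^* = \gamma + \mathbf{d}$:

```python
# Sketch of the conjugate update for the reduced parametrization:
# observed cell counts d over the 2^m joint outcomes update (nu, A).
import itertools
import numpy as np

m = 2
cells = list(itertools.product([0, 1], repeat=m))
H = np.array([[c[j] for c in cells] for j in range(m)], dtype=float)

gamma = np.ones(2**m)                  # illustrative uniform Dirichlet prior
nu, A = gamma.sum(), H @ np.diag(gamma) @ H.T

d = np.array([10.0, 5.0, 3.0, 2.0])    # hypothetical observed cell counts
n = d.sum()

nu_post = nu + n                       # nu* = nu + n
A_post = A + H @ np.diag(d) @ H.T      # A* = A + H diag(d) H^T

# Equivalent to updating the full Dirichlet, gamma* = gamma + d:
print(np.allclose(A_post, H @ np.diag(gamma + d) @ H.T))
```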

3. Copula and Gaussian Approximations

Exact joint posteriors for $\bm{\vartheta}$ are generally unavailable in closed form. Hierarchical models address this with a copula approximation that retains the exact marginal Beta posteriors and captures dependencies through the posterior correlation matrix $R^*$:

$$\tilde f(\theta_1, \ldots, \theta_m) = c_{R^*}\bigl(F_1(\theta_1), \ldots, F_m(\theta_m)\bigr) \prod_{j=1}^m f_j(\theta_j)$$

Here, $F_j$ and $f_j$ are the Beta CDF and density of the $j$th margin, and $c_{R^*}$ is the Gaussian copula density. For computational efficiency in very high dimensions, a normal approximation $N_m(\mu^*, \Sigma^*)$ may be used instead, but boundary-respecting copula regions are typically preferred, especially for small $n$ or parameters near the boundary of the unit cube (Westphal, 2019).
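
Sampling from this Gaussian-copula approximation is straightforward: draw correlated normals, push them to uniforms, then through the Beta inverse CDFs. The marginal Beta parameters and $R^*$ below are illustrative stand-ins for a fitted posterior:

```python
# Sketch: draws from a Gaussian copula with exact Beta marginals.
# post and R_star are illustrative, not fitted values.
import numpy as np
from scipy.stats import norm, beta

rng = np.random.default_rng(1)
post = [(5.0, 3.0), (8.0, 2.0)]              # Beta(alpha*, beta*) marginals
R_star = np.array([[1.0, 0.4],
                   [0.4, 1.0]])              # posterior correlation matrix

z = rng.multivariate_normal(np.zeros(2), R_star, size=20000)
u = norm.cdf(z)                              # uniform margins, copula dependence
theta = np.column_stack(
    [beta.ppf(u[:, j], a, b) for j, (a, b) in enumerate(post)]
)

# Margins stay exactly Beta; sample means approach alpha/(alpha+beta).
print(theta.mean(axis=0))
```

All draws stay strictly inside $(0,1)^m$, which is the boundary-respecting property the text emphasizes.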

4. Multivariate Credible Regions and Coverage Properties

Hierarchical Bayesian extensions facilitate the construction of multivariate credible regions for simultaneous inference. Approaches include:

  • Full model (via Dirichlet sampling): empirical quantiles of $\bm{\vartheta}^{(r)}$ over Dirichlet posterior samples
  • Copula-based: hyperrectangular regions built from Beta marginal quantiles, with joint coverage set by the threshold $c_\alpha$ solving $\Pr(\max_j |Z_j| \le c_\alpha) = 1-\alpha$ for $Z \sim N_m(0, R^*)$
  • Normal approximation: rectangles based on the mean and covariance of $N_m(\mu^*, \Sigma^*)$

Empirical results indicate that the copula and full-model approaches yield Bayes coverage close to the target level, outperforming the normal approximation in small-sample or near-boundary scenarios (Westphal, 2019).
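
The copula-based rectangle can be sketched by solving the coverage equation for $c_\alpha$ by Monte Carlo and mapping the result through the Beta marginal quantiles; all posterior parameters below are illustrative:

```python
# Sketch: copula-based simultaneous credible rectangle.  Find c_alpha
# with Pr(max_j |Z_j| <= c_alpha) = 1 - alpha by Monte Carlo, then map
# Phi(+/- c_alpha) through the Beta marginal quantile functions.
import numpy as np
from scipy.stats import norm, beta

rng = np.random.default_rng(2)
post = [(5.0, 3.0), (8.0, 2.0)]              # illustrative Beta marginals
R_star = np.array([[1.0, 0.4],
                   [0.4, 1.0]])
alpha_level = 0.05

z = rng.multivariate_normal(np.zeros(2), R_star, size=100000)
max_abs = np.abs(z).max(axis=1)
c_alpha = np.quantile(max_abs, 1 - alpha_level)   # solves the coverage equation

# Hyperrectangle: per-margin Beta quantiles at Phi(-c) and Phi(c).
lo_u, hi_u = norm.cdf(-c_alpha), norm.cdf(c_alpha)
region = [(beta.ppf(lo_u, a, b), beta.ppf(hi_u, a, b)) for a, b in post]
print(c_alpha, region)
```

Because the dependence enters only through $c_\alpha$, the rectangle is wider than the marginal $1-\alpha$ intervals but narrower than a Bonferroni correction would give.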

5. Extensions Beyond Overdispersion: The Generalised Score Distribution

Classical Bayesian hierarchical models using the beta-binomial component are limited to regimes overdispersed relative to the binomial. The Generalised Score Distribution (GSD) extends this by providing a two-parameter family on the finite support $\{0, \ldots, m\}$ with:

$$\mu = E[U] \in [0, m], \qquad \delta \in [0, 1]$$

For fixed $\mu$, $\delta$ interpolates the variance from the minimum (attained by a two-point distribution on the integers adjacent to $\mu$) to the maximum (attained by placing all mass on the extremes $0$ and $m$), covering the entire feasible variance interval $[V_{\min}(\mu), V_{\max}(\mu)]$. For $\delta \leq C(\mu)$, the GSD matches a reparametrized beta-binomial; for $\delta > C(\mu)$, it continues smoothly into the underdispersed regime, which is unattainable by classical beta-binomial models. Estimation is feasible via the method of moments or maximum likelihood (Ćmiel et al., 2022).
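
The endpoints of the feasible variance interval follow from the two extremal two-point laws and can be checked directly; the values of $\mu$ and $m$ below are illustrative:

```python
# Sketch: the feasible variance interval [V_min, V_max] for a mean-mu
# distribution on {0, ..., m}.  V_max comes from the two-point law on
# {0, m}; V_min from mass on the integers adjacent to mu.
import math

def variance_bounds(mu: float, m: int):
    """Return (V_min, V_max) for distributions on {0,...,m} with mean mu."""
    v_max = mu * (m - mu)                  # all mass at the extremes 0 and m
    frac = mu - math.floor(mu)
    v_min = frac * (1 - frac)              # mass on floor(mu) and ceil(mu)
    return v_min, v_max

# Check V_max against the explicit two-point law on {0, m}:
mu, m = 3.3, 5
p_m = mu / m                               # P(U = m); P(U = 0) = 1 - p_m
v_two_point = p_m * (m - mu) ** 2 + (1 - p_m) * mu ** 2
v_min, v_max = variance_bounds(mu, m)
print(v_min, v_max)
```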

6. Entropy and Analytic Properties

The entropy of hierarchical Bayesian extensions with beta-binomial marginals, and of their generalizations, can be expressed both as a series and as an integral. For the beta-binomial law $X \sim \mathrm{BetaBin}(n, \alpha, \beta)$, the entropy is:

$$H(X) = -\ln\bigl(n B(\alpha,\beta)\bigr) + \sum_{j=2}^{n} \binom{n}{j}(-1)^j \left[ \frac{(\alpha)_j}{(\alpha+\beta)_j}\bigl(c(j)-c_\alpha(j)\bigr) + \frac{(\beta)_j}{(\alpha+\beta)_j}\bigl(c(j)-c_\beta(j)\bigr) \right]$$

where $(\cdot)_j$ denotes the Pochhammer symbol and $c_\alpha(j)$, $c_\beta(j)$ are “difference-coefficient” sequences related to the Riemann and Hurwitz $\zeta$ functions. An equivalent integral representation involving the hypergeometric function and the Lerch transcendent is also available (Cheraghchi, 2017). These analytic expressions facilitate precise computation of information-theoretic quantities in hierarchical models.
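
Since the support is finite, any closed-form expression for this entropy can be cross-checked by direct summation of $-\sum_y p(y)\ln p(y)$; the parameters below are illustrative:

```python
# Sketch: direct evaluation of the beta-binomial entropy over its finite
# support, compared with scipy's built-in entropy (both in nats).
import math
from scipy.stats import betabinom

n, a, b = 10, 2.0, 3.0
pmf = [betabinom.pmf(y, n, a, b) for y in range(n + 1)]
H_direct = -sum(p * math.log(p) for p in pmf if p > 0)

print(H_direct, float(betabinom.entropy(n, a, b)))
```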

7. Bernoulli-Sum Interpretation and Practical Inference

Every GSD, covering both the extended beta-binomial regime and its underdispersed continuation, admits a representation as a sum of $m$ dichotomous (potentially dependent) random variables, generalizing the de Finetti (Beta-mixing) representation. For $\delta \leq C(\mu)$, the classical beta-binomial structure reappears, while for $\delta > C(\mu)$, the law transitions to mixtures incorporating underdispersed two-point and binomial components. Practical inference employs moment matching, maximum likelihood, and boundary-respecting copula regions to handle high dimensionality and maintain computational tractability (Ćmiel et al., 2022, Westphal, 2019).
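
Moment matching for the beta-binomial part of the family reduces to two equations. A minimal sketch, assuming the standard $(\mu, \rho)$ parametrization with $\mathrm{Var} = n\mu(1-\mu)\bigl(1 + (n-1)\rho\bigr)$ and $\rho = 1/(\alpha+\beta+1)$ (the helper name `fit_betabin_mom` is hypothetical):

```python
# Sketch: method-of-moments estimation of (alpha, beta) for a
# beta-binomial sample of counts on {0, ..., n}.
import numpy as np
from scipy.stats import betabinom

def fit_betabin_mom(counts: np.ndarray, n: int):
    """Match sample mean and variance to the beta-binomial moments."""
    mu_hat = counts.mean() / n                       # per-trial success rate
    s2 = counts.var()
    # Var = n*mu*(1-mu)*(1 + (n-1)*rho), with rho = 1/(alpha+beta+1).
    rho = (s2 / (n * mu_hat * (1 - mu_hat)) - 1) / (n - 1)
    rho = min(max(rho, 1e-6), 1 - 1e-6)              # keep in overdispersed range
    prec = 1 / rho - 1                               # alpha + beta
    return mu_hat * prec, (1 - mu_hat) * prec

rng = np.random.default_rng(3)
n, a_true, b_true = 20, 2.0, 3.0
counts = betabinom.rvs(n, a_true, b_true, size=50000, random_state=rng)
a_hat, b_hat = fit_betabin_mom(counts, n)
print(a_hat, b_hat)    # close to (2, 3) for large samples
```

For the underdispersed regime ($\delta > C(\mu)$), moment matching proceeds analogously on the GSD's $(\mu, \delta)$ scale rather than through $(\alpha, \beta)$.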


Summary Table: Hierarchical Bayesian Extensions for Discrete Data

| Feature | Standard Beta-Binomial | Multivariate/Hierarchical Bayesian Extension | Generalised Score Distribution (GSD) |
|---|---|---|---|
| Dispersion control | Overdispersion | Overdispersion, multivariate dependence | Over- and underdispersion |
| Parametrization | $(\alpha,\beta)$ or $(\mu,\phi)$ | Dirichlet ($2^m$) or reduced $(\nu, A)$ | $(\mu, \delta)$ |
| Computational feasibility | High | Moderate (low $m$); reduced and copula for high $m$ | High |
| Entropy, info-theoretic properties | Series/integral formulas | Copula- and Dirichlet-based | Admits Bernoulli-sum representation |
| Simultaneous inference | Marginal | Multivariate credible regions (copula, normal, full) | General mean-variance coverage |

Hierarchical Bayesian extensions thus provide a rigorous basis for the joint modeling of correlated proportions and support the construction of credible regions, flexible mean-variance interpolation, and tractable computation in high dimensions. Recent developments, including the GSD, address both overdispersed and underdispersed settings, expand the representational range, and offer robust inference tools for practical applications involving finite-support discrete data (Westphal, 2019, Ćmiel et al., 2022, Cheraghchi, 2017).
