
Parametric Bayesian TOST

Updated 10 December 2025
  • Parametric Bayesian TOST is an equivalence testing method that integrates Bayesian posterior inference with the classical TOST framework to assess practical equivalence.
  • It replaces frequentist p-values with posterior tail probabilities derived from continuous priors, which are uniformly distributed, and hence valid test statistics, under the null hypothesis.
  • The approach involves specifying equivalence bounds, computing posterior probabilities, and balancing type I error control with test power across parametric models like normal and binomial.

Parametric Bayesian Two One-Sided Tests (TOST) integrate Bayesian posterior inference into the classical TOST procedure for equivalence testing, replacing frequentist $p$-values by posterior tail probabilities. This allows conclusions about practical equivalence (rather than simple difference) to be drawn via Bayesian measures of evidence, with posterior probabilities that are valid uniform$(0,1)$ test statistics under the null, supporting both single-hypothesis and multiple-testing contexts. The methodology offers direct control over the tradeoff between type I error, test power, and prior informativeness, and is applicable to a range of parametric models including normal and binomial families (Ochieng, 25 Jul 2025, Meyer et al., 2021).

1. Parametric Model Structure and Equivalence Hypothesis Formulation

The parametric Bayesian TOST operates on data $X = (X_1,\ldots,X_n)^\top$ sampled i.i.d. from a family $\{f(x\mid\theta): \theta \in \Theta \subseteq \mathbb{R}\}$. The null hypothesis for equivalence is split across two margins $\theta_1^* < \theta_2^*$:

  • $H_0$: $\theta \notin (\theta_1^*, \theta_2^*)$
  • $H_1$: $\theta \in (\theta_1^*, \theta_2^*)$

This is further decomposed into two one-sided nulls for TOST:

  • $H_r$: $\theta \le \theta_1^*$ versus $K_r$: $\theta > \theta_1^*$
  • $H_l$: $\theta \ge \theta_2^*$ versus $K_l$: $\theta < \theta_2^*$

A statistic $T(X)$ with the monotone likelihood-ratio property in $\theta$ underpins the construction, supporting the derivation of pivotal quantities (Ochieng, 25 Jul 2025). This setting encompasses scenarios such as comparing two means, comparing two proportions, and testing equivalence of a mean to a reference value (Meyer et al., 2021).

2. Bayesian Posterior Tail Probabilities and Decision Rules

A continuous prior $\pi(\theta)$ is placed on $\Theta$, yielding the posterior $\pi(\theta\mid x)$ via

$$\pi(\theta\mid x) = \frac{f(x\mid\theta)\,\pi(\theta)}{\int_\Theta f(x\mid\theta)\,\pi(\theta)\,d\theta}.$$

The Bayesian analogs of the one-sided $p$-values are the posterior tail probabilities:

$$P_r^B(x) = \int_{-\infty}^{\theta_1^*}\pi(\theta\mid x)\,d\theta, \qquad P_l^B(x) = \int_{\theta_2^*}^{\infty}\pi(\theta\mid x)\,d\theta.$$

The overall Bayesian “p-value” for equivalence is

$$P_b(x) = P_r^B(x) + P_l^B(x) = 1 - \Pr(\theta_1^* < \theta < \theta_2^* \mid x).$$

Equivalence is declared when both $P_r^B(x) \le \alpha$ and $P_l^B(x) \le \alpha$ for a chosen $\alpha$ (often $\alpha = 0.05$), precisely parallel to the frequentist TOST procedure (Ochieng, 25 Jul 2025).

3. Uniformity and Validity of Bayesian TOST Statistics

For monotone likelihood-ratio families and continuous priors, each tail posterior probability is uniformly distributed under its null:

  • $U_r = P_r^B(X)$ is uniform$(0,1)$ when $\theta = \theta_1^*$
  • $U_l = P_l^B(X)$ is uniform$(0,1)$ when $\theta = \theta_2^*$

As a result, $P_r^B$ and $P_l^B$ serve as valid $p$-values for multiple-testing procedures and FDR contexts. This is foundational to their direct use in the TOST structure and enables plug-in to standard $p$-value-based procedures, including the Benjamini–Hochberg algorithm (Ochieng, 25 Jul 2025).
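This uniformity can be checked empirically. The sketch below is an illustration, not code from the paper: it uses the normal-mean model with known $\sigma$ and a deliberately diffuse normal prior (large $\tau$), for which the posterior right-tail probability has the closed form appearing in Section 5, and simulates data at the boundary $\theta = \theta_1^*$:

```python
import math
import random

def phi(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def post_right_tail(xbar, n, sigma, tau, theta1):
    # P_r^B(x) = Pr(theta <= theta1* | x) in the normal-mean model,
    # closed form under a normal prior with variance tau^2.
    c = n * tau / (sigma * math.sqrt(sigma**2 + n * tau**2))
    return phi(-c * (xbar - theta1))

random.seed(1)
n, sigma, tau, theta1 = 20, 1.0, 100.0, 0.0  # diffuse prior: large tau
draws = [post_right_tail(random.gauss(theta1, sigma / math.sqrt(n)),
                         n, sigma, tau, theta1)
         for _ in range(50_000)]

# Under theta = theta1*, U_r should be approximately uniform(0,1):
print(sum(d <= 0.05 for d in draws) / len(draws))  # close to 0.05
print(sum(draws) / len(draws))                     # close to 0.5
```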

4. Algorithmic Workflow and Implementation

The Bayesian TOST involves the following steps:

  1. Specification: Define equivalence bounds $\theta_1^*, \theta_2^*$ and the significance level $\alpha$.
  2. Prior Selection: Choose a prior $\pi(\theta)$ over $\Theta$ (e.g., $N(\theta_0, \tau^2)$ for normal models, a Beta prior for binomial cases).
  3. Posterior Calculation: Compute $\pi(\theta\mid x)$ given observed data $x$.
  4. Tail Probability Evaluation:

$$P_r^B(x) = \int_{-\infty}^{\theta_1^*}\pi(\theta\mid x)\,d\theta, \qquad P_l^B(x) = \int_{\theta_2^*}^{\infty}\pi(\theta\mid x)\,d\theta.$$

  5. Decision Rule: Declare equivalence if $P_r^B(x) \le \alpha$ and $P_l^B(x) \le \alpha$.

The same steps apply in discrete models (e.g., binomial data), replacing integrals with summations as appropriate (Ochieng, 25 Jul 2025). For explicit “two-interval” Bayesian tests (2IT), one computes the posterior probability $P_E = \Pr(\theta \in [\theta_1^*, \theta_2^*] \mid x)$ and applies a high threshold criterion (e.g., $P_E \ge 0.95$), paralleling but not identical to the TOST tail-probability approach (Meyer et al., 2021).
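The workflow above can be sketched concretely for binomial data. The example below is a hypothetical illustration with made-up numbers: a uniform Beta$(1,1)$ prior gives a conjugate Beta$(s+1,\,n-s+1)$ posterior, and the tail probabilities are evaluated by simple trapezoidal integration so the sketch stays dependency-free (in practice a library routine such as scipy.stats.beta.cdf would be used):

```python
import math

def beta_pdf(x, a, b):
    # Beta(a, b) density, normalizing constant via lgamma for stability.
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))

def beta_cdf(x, a, b, steps=20_000):
    # Trapezoidal integration of the density on (0, x); fine for a, b >= 1.
    h = x / steps
    total = 0.5 * (beta_pdf(1e-12, a, b) + beta_pdf(x, a, b))
    for i in range(1, steps):
        total += beta_pdf(i * h, a, b)
    return total * h

# Step 1: equivalence bounds and level (hypothetical values).
theta1, theta2, alpha = 0.45, 0.55, 0.05
# Step 2: uniform Beta(1, 1) prior; Step 3: conjugate posterior Beta(s+1, n-s+1).
n, s = 400, 200
a, b = s + 1, n - s + 1
# Step 4: posterior tail probabilities.
p_r = beta_cdf(theta1, a, b)        # Pr(theta <= theta1* | x)
p_l = 1.0 - beta_cdf(theta2, a, b)  # Pr(theta >= theta2* | x)
# Step 5: declare equivalence if both tails fall below alpha.
equivalent = p_r <= alpha and p_l <= alpha
print(p_r, p_l, equivalent)
```

With 200 successes in 400 trials the posterior is symmetric about 0.5, so the two tails coincide and both fall below $\alpha$ for these margins.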

5. Power, Conservativeness, and Prior Specification

The power function for the Bayesian TOST depends on prior choice:

  • Binomial–Beta models: Beta$(p, q)$ priors with $p, q < 1$ yield less conservative posterior p-values; as $p, q$ increase, conservativeness increases and power drops.
  • Normal models: For a prior $N(\theta_*, \tau^2)$, letting $\tau^2 \to \infty$ recovers the frequentist case ($P_b \to P_f$). Very small $\tau^2$ gives an overly informative, extremely conservative test. Choosing a moderate $\tau^2$ (by empirical Bayes or elicitation) balances the trade-off.

Closed-form expressions for $P_b(x)$ and $P_f(x)$ allow explicit comparison. For instance, in the normal-mean model with known $\sigma$,

$$P_b(x) = 1 - \Phi\!\left(\frac{n\tau(\bar x - \theta_1^*)}{\sigma\sqrt{\sigma^2 + n\tau^2}}\right) + \Phi\!\left(\frac{n\tau(\bar x - \theta_2^*)}{\sigma\sqrt{\sigma^2 + n\tau^2}}\right),$$

$$P_f(x) = 1 - \Phi\!\left(\frac{\sqrt{n}\,(\bar x - \theta_1^*)}{\sigma}\right) + \Phi\!\left(\frac{\sqrt{n}\,(\bar x - \theta_2^*)}{\sigma}\right).$$
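These closed forms are easy to check numerically. The sketch below (illustrative values, not from the paper) codes both expressions and shows $P_b$ approaching $P_f$ as $\tau^2$ grows, while exceeding it (more conservative) when $\tau^2$ is small:

```python
import math

def phi(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_bayes(xbar, n, sigma, tau, th1, th2):
    # P_b(x) = 1 - Phi(c*(xbar - th1)) + Phi(c*(xbar - th2)),
    # with c = n*tau / (sigma * sqrt(sigma^2 + n*tau^2)).
    c = n * tau / (sigma * math.sqrt(sigma**2 + n * tau**2))
    return 1.0 - phi(c * (xbar - th1)) + phi(c * (xbar - th2))

def p_freq(xbar, n, sigma, th1, th2):
    # Frequentist analogue: the tau -> infinity limit, where c -> sqrt(n)/sigma.
    c = math.sqrt(n) / sigma
    return 1.0 - phi(c * (xbar - th1)) + phi(c * (xbar - th2))

xbar, n, sigma, th1, th2 = 0.02, 50, 1.0, -0.2, 0.2  # hypothetical data
pf = p_freq(xbar, n, sigma, th1, th2)
for tau in (0.1, 1.0, 10.0, 1000.0):
    print(tau, p_bayes(xbar, n, sigma, tau, th1, th2), pf)
```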

Power ($\beta_B(\theta)$, $\beta_F(\theta)$) is computed by integrating these under the sampling law. Often, suitably chosen priors achieve greater power near the center of the equivalence region (Ochieng, 25 Jul 2025).
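A Monte Carlo approximation of $\beta_B(\theta)$ illustrates this. The sketch below (hypothetical parameters, not from the paper) simulates $\bar x \sim N(\theta, \sigma^2/n)$, applies the two-tail decision rule from the closed-form expressions above, and records the rejection rate at the center and near the edge of the equivalence region:

```python
import math
import random

def phi(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tost_decision(xbar, n, sigma, tau, th1, th2, alpha=0.05):
    # Declare equivalence when both posterior tails are <= alpha.
    c = n * tau / (sigma * math.sqrt(sigma**2 + n * tau**2))
    p_r = phi(-c * (xbar - th1))  # Pr(theta <= th1* | x)
    p_l = phi(c * (xbar - th2))   # Pr(theta >= th2* | x)
    return p_r <= alpha and p_l <= alpha

def power(theta, n=100, sigma=1.0, tau=5.0, th1=-0.2, th2=0.2, reps=20_000):
    # Rejection rate under the sampling law xbar ~ N(theta, sigma^2 / n).
    se = sigma / math.sqrt(n)
    return sum(tost_decision(random.gauss(theta, se), n, sigma, tau, th1, th2)
               for _ in range(reps)) / reps

random.seed(7)
beta_center = power(0.0)
beta_edge = power(0.15)
print(beta_center, beta_edge)  # power is higher at the center of the region
```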

6. Correlation Structure and Multiple Testing

In the normal model, it is shown that $\operatorname{Cov}\{P_b(X), P_f(X)\} = 0$, and hence $\rho\{P_b, P_f\} = 0$ (Proposition 8 of Ochieng, 25 Jul 2025). This lack of correlation supports distinct inferential roles for the Bayesian and frequentist procedures and underpins valid FDR procedures.

Simulations for both single-hypothesis and multiple-testing settings (up to $k = 1000$ hypotheses) demonstrate that

  • Type I error control is near the nominal level,
  • Power increases with sample size, relaxed equivalence margins, and prior variance,
  • Bayesian posterior-p-value-based FDR control is competitive with standard $p$-value FDR control as the prior variance grows.

Under dependence or in FDR settings, collections of $P_b$ values can be submitted directly to algorithms such as Benjamini–Hochberg, as they retain the uniform null distribution (Ochieng, 25 Jul 2025).
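Since the posterior tail probabilities retain a uniform null distribution, they can be plugged into a standard Benjamini–Hochberg step-up procedure without modification. A minimal sketch with illustrative, made-up p-values:

```python
def benjamini_hochberg(pvals, q=0.05):
    """Step-up BH procedure: return indices of rejected hypotheses."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0  # largest rank i (1-based) with p_(i) <= i * q / m
    for i, idx in enumerate(order, start=1):
        if pvals[idx] <= i * q / m:
            k = i
    return sorted(order[:k])

# Hypothetical posterior equivalence p-values P_b from 8 tests:
pb = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
print(benjamini_hochberg(pb))  # rejects the two smallest: [0, 1]
```

In practice one would reach for a tested implementation (e.g., statsmodels' multipletests with the BH method) rather than this hand-rolled version.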

7. Extensions, Comparison, and Practical Considerations

The parametric Bayesian TOST generalizes to the Bayesian two-interval test (2IT) framework, which replaces p-values with posterior probabilities of interval hypotheses and applies to superiority, non-inferiority, and equivalence testing (Meyer et al., 2021). The Bayesian TOST also supports direct sample-size determination via expected posterior probabilities and permits sequential analysis with optional stopping without type I error adjustment, by appeal to the likelihood principle.

Differences between Bayesian TOST and fully posterior-interval tests are detailed in their respective treatments: the TOST approach employs tail probabilities as pivotal quantities, matching the frequentist conceptual structure; the 2IT computes posterior mass in the equivalence interval versus its complement and adopts threshold-based decision rules that may directly represent the probability of equivalence.

Across approaches, by tuning the informativeness of the prior, practitioners can interpolate between fully uninformative, classical TOST-like behavior and potentially more powerful, informative analyses when robust prior knowledge is available. The methodology encompasses a wide range of standard parametric models and is computationally straightforward in settings with conjugate priors.

References:

  • "A Comparison of the Bayesian Posterior Probability and the Frequentist $p$-Value in Testing Equivalence Hypotheses" (Ochieng, 25 Jul 2025).
  • "Bayesian two-interval test" (Meyer et al., 2021).