Nonparametric Bayesian TOST

Updated 10 December 2025
  • Nonparametric Bayesian TOST is a method that extends equivalence testing by incorporating Bayesian nonparametric models to assess negligible differences.
  • It leverages flexible priors such as Dirichlet process mixtures and MCMC sampling to allow robust inference without fixed parametric assumptions.
  • The PROTEST framework operationalizes this approach, ensuring consistency and improved control over type I error in equivalence testing.

Nonparametric Bayesian TOST (Two One-Sided Tests) extends the established methodology of equivalence testing from the parametric to the fully nonparametric Bayesian regime, enabling statistical inference on hypotheses about negligible differences without fixed distributional assumptions. The PROTEST framework operationalizes this extension, providing an accessible, MCMC-based nonparametric approach that parallels the logic of classical TOST procedures by assessing posterior mass within a tolerance region around the null value (Lassance et al., 8 Mar 2024).

1. Conceptual Foundation: Enlarged Null and TOST Analogue

Classical TOST procedures test equivalence via two one-sided tests corresponding to whether a parameter $\theta$ lies outside a given interval around a reference value $\theta_0$, with width determined by a practical tolerance $\varepsilon$. Formally, the enlarged or pragmatic null hypothesis is defined as

$$H_0^e = \{\theta: |\theta - \theta_0| \le \varepsilon\}$$

as opposed to the point null $H_0: \theta = \theta_0$. The TOST logic requires rejection of both $H_{01}: \theta \le \theta_0 - \varepsilon$ and $H_{02}: \theta \ge \theta_0 + \varepsilon$, each at level $\alpha$; equivalently, it declares equivalence if the $1-2\alpha$ confidence interval falls entirely within $[\theta_0 - \varepsilon, \theta_0 + \varepsilon]$.

In a Bayesian formulation, the CI-inclusion criterion is replaced by the posterior probability that $\theta$ lies within the tolerance interval $|\theta - \theta_0| \le \varepsilon$: one computes $P(|\theta - \theta_0| \le \varepsilon \mid x)$ and declares equivalence if this posterior mass is at least $1-\alpha$. This aligns the Bayesian decision rule directly with the TOST logic (Lassance et al., 8 Mar 2024).
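As a minimal illustration of this decision rule, the sketch below assumes a normal model with known standard deviation and a flat prior on $\theta$, so that the posterior is $N(\bar{x}, \sigma^2/n)$ and the posterior mass has a closed form. The model, the prior, and the function names are assumptions made for the example; they are not part of PROTEST.

```python
from statistics import NormalDist


def posterior_equivalence_mass(sample_mean, sigma, n, theta0, eps):
    """P(|theta - theta0| <= eps | x) under the assumed N(sample_mean, sigma^2 / n) posterior."""
    posterior = NormalDist(mu=sample_mean, sigma=sigma / n ** 0.5)
    return posterior.cdf(theta0 + eps) - posterior.cdf(theta0 - eps)


def declare_equivalence(mass, alpha):
    """Bayesian TOST-style rule: equivalence when the posterior mass is at least 1 - alpha."""
    return mass >= 1.0 - alpha
```

For instance, `posterior_equivalence_mass(0.0, 1.0, 100, 0.0, 0.3)` evaluates the mass within three posterior standard deviations of the reference value, which is enough to declare equivalence at $\alpha = 0.05$.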

2. Nonparametric Bayesian Model Structure

The nonparametric Bayesian approach instantiates this expanded equivalence logic without parametric restrictions by modeling data distributions through flexible priors such as Dirichlet process mixtures, Pólya tree priors, or Gaussian process priors. In the two-sample setting, suppose $(X_i)_{i=1}^n \sim F_X$ and $(Y_j)_{j=1}^m \sim F_Y$ are two samples:

  • Likelihood: $X_i \mid F_X \overset{iid}{\sim} F_X$, $Y_j \mid F_Y \overset{iid}{\sim} F_Y$.
  • Nonparametric Priors: Common examples include:
    • Independent Dirichlet process mixtures:

    $$G_X \sim DP(\alpha_0, G_0), \quad F_X(x) = \int K(x \mid \phi)\, dG_X(\phi)$$

    with analogous specification for $F_Y$.
    • Dependent Dirichlet processes for paired samples, or Gaussian process (GP) priors over densities.

  • Posterior Sampling: Posterior draws $(F_X^{(i)}, F_Y^{(i)})$ are obtained via standard DP mixture MCMC algorithms (e.g., Chinese restaurant process, Pólya urn, or truncated stick-breaking Gibbs samplers).
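To make the stick-breaking representation concrete, the following sketch draws one truncated realization of $G \sim DP(\alpha_0, G_0)$ from the prior. It is only the prior-side building block: a posterior sampler would additionally update the weights and atoms given the data. The function names, the truncation level, and the choice $G_0 = N(0, 1)$ are assumptions for illustration.

```python
import random


def stick_breaking_weights(alpha0, truncation, rng):
    """Weights of a truncated stick-breaking representation of DP(alpha0, G0)."""
    weights = []
    remaining = 1.0  # length of the stick not yet assigned
    for _ in range(truncation - 1):
        fraction = rng.betavariate(1.0, alpha0)  # V_k ~ Beta(1, alpha0)
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    weights.append(remaining)  # last atom absorbs the leftover mass
    return weights


def draw_truncated_dp(alpha0, base_sampler, truncation, rng):
    """One approximate draw G = sum_k w_k * delta_{phi_k}, with phi_k ~ G0."""
    weights = stick_breaking_weights(alpha0, truncation, rng)
    atoms = [base_sampler(rng) for _ in range(truncation)]
    return atoms, weights


rng = random.Random(0)
atoms, weights = draw_truncated_dp(
    alpha0=2.0,
    base_sampler=lambda r: r.gauss(0.0, 1.0),  # G0 = N(0, 1), chosen for illustration
    truncation=50,
    rng=rng,
)
```

Smaller values of `alpha0` concentrate the mass on a few atoms, while larger values spread it across many, which is why the truncation level should grow with the concentration parameter.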

3. Posterior Probability of Equivalence

Equivalence for distributions is operationalized via a distance function $d^*(F_X, F_Y)$. Common choices include:

  • Kolmogorov–Smirnov-style: $d^*(F_X, F_Y) = \sup_x |F_X(x) - F_Y(x)|$.

  • Classifier-based:

$$d^*_C(F_X, F_Y) = 0.5\left\{\Pr_{Z\sim F_X}\left[f_X(Z)/f_Y(Z) > 1\right] + \Pr_{Z\sim F_Y}\left[f_Y(Z)/f_X(Z) > 1\right]\right\} - 0.5$$
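The Kolmogorov–Smirnov-style distance is straightforward to evaluate when each posterior draw is a discrete distribution, given as atoms with weights. The sketch below assumes that representation (for example, a truncated stick-breaking draw or a weighted empirical distribution); the function name and argument layout are hypothetical.

```python
def ks_distance(atoms_x, weights_x, atoms_y, weights_y):
    """sup_x |F_X(x) - F_Y(x)| for two discrete distributions given as (atoms, weights)."""
    # Each event carries the mass it adds to F_X and to F_Y at its location.
    events = [(a, w, 0.0) for a, w in zip(atoms_x, weights_x)]
    events += [(a, 0.0, w) for a, w in zip(atoms_y, weights_y)]
    events.sort(key=lambda e: e[0])

    cdf_x = cdf_y = 0.0
    largest_gap = 0.0
    i = 0
    while i < len(events):
        location = events[i][0]
        # Absorb every atom tied at this location before comparing the CDFs.
        while i < len(events) and events[i][0] == location:
            cdf_x += events[i][1]
            cdf_y += events[i][2]
            i += 1
        largest_gap = max(largest_gap, abs(cdf_x - cdf_y))
    return largest_gap
```

Because both CDFs are step functions, the supremum is attained at one of the atoms, so a single pass over the sorted atoms suffices.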

The enlarged null is then

$$H_0^e = \{(F_X, F_Y): d^*(F_X, F_Y) \le \varepsilon\}$$

For each posterior draw, let $\Delta_i := d^*(F_X^{(i)}, F_Y^{(i)})$. The estimated posterior mass in the enlarged null is

$$\hat{p} = \frac{1}{N} \sum_{i=1}^N \mathbf{1}\{\Delta_i \le \varepsilon\} \approx \Pr(d^*(F_X, F_Y) \le \varepsilon \mid \text{data})$$

This operationalizes the Bayesian equivalence assessment fully nonparametrically (Lassance et al., 8 Mar 2024).
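The Monte Carlo estimate of the posterior mass is simply the proportion of draws whose distance is at most the tolerance. The sketch below also reports a binomial standard error as a rough accuracy check; this error formula assumes approximately independent draws, so with autocorrelated MCMC output it is optimistic. The function name is hypothetical.

```python
import math


def estimate_posterior_mass(deltas, eps):
    """Return (p_hat, mc_se): proportion of draws with delta <= eps and its binomial standard error."""
    n_draws = len(deltas)
    if n_draws == 0:
        raise ValueError("at least one posterior draw is required")
    p_hat = sum(1 for delta in deltas if delta <= eps) / n_draws
    # Binomial standard error; assumes roughly independent draws (an assumption, see lead-in).
    mc_se = math.sqrt(p_hat * (1.0 - p_hat) / n_draws)
    return p_hat, mc_se
```

When the estimated mass lies within a few standard errors of the decision threshold, running the sampler longer is a reasonable precaution before reporting a decision.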

4. Decision Criterion and Consistency Properties

The equivalence decision follows directly:

  • Select a level $\alpha$.

  • Declare equivalence if $\hat{p} \ge 1-\alpha$ (i.e., high posterior mass falls within the equivalence region).

  • Otherwise, withhold equivalence.

Conversely, the enlarged null is rejected if $\hat{p} \le \alpha$. The equivalence rule mirrors the TOST demand for high confidence that the parameter lies within the tolerance region, here expressed through the posterior.

PROTEST yields consistency: if the true distributions differ by less than $\varepsilon$, then as $n, m \to \infty$ the posterior mass inside $\{d^* \le \varepsilon\}$ converges to one, so the procedure declares equivalence. If the difference exceeds $\varepsilon$, the posterior mass inside the $\varepsilon$-ball vanishes and equivalence is not declared, due to the Bernstein–von Mises phenomenon. In simulation, the classical PTtest (Holmes & Walker, 2015) with Pólya tree priors often over-rejects at large $n$, even when true differences are negligible, while PROTEST's criterion is stable (Lassance et al., 8 Mar 2024).

5. Selection of Tolerance ε

Determining $\varepsilon$ is central. PROTEST outlines two main strategies:

  • Direct elicitation:

    • Theory or measurement-error bound: $\varepsilon$ set equal to known measurement error $\delta$.
    • Prior-mass calibration: Select a small $\delta$ (e.g., $\delta = \alpha$), and pick $\varepsilon$ such that the prior probability of $d^* \le \varepsilon$ is $\delta$.
    • Reference study: Choose $\varepsilon$ as the smallest value that would have declared equivalence on a key reference dataset at level $\alpha$.
  • Sensitivity or bounding:
    • Analyze results across candidate tolerances from multiple experts.
    • Report posterior mass $\hat{p}_k$ for each $\varepsilon_k$, and illustrate the $\hat{p}(\varepsilon)$ boundary to contextualize robustness with respect to the choice of tolerance.
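Several of these strategies reduce to simple operations on Monte Carlo draws of the distance. The sketch below is one way to express them, assuming draws of $d^*$ are already available (from the prior for calibration, or from the posterior on a reference dataset); the function names are hypothetical and the quantile convention is a choice made for the example.

```python
def posterior_mass(distances, eps):
    """Proportion of draws with distance at most eps."""
    return sum(1 for d in distances if d <= eps) / len(distances)


def smallest_eps_with_mass(distances, target):
    """Smallest eps among the draws such that the empirical mass of {d <= eps} is at least target."""
    ordered = sorted(distances)
    n_draws = len(ordered)
    for i, candidate in enumerate(ordered):
        if (i + 1) / n_draws >= target:
            return candidate
    raise ValueError("no draws supplied, or target exceeds 1")


def sensitivity_curve(distances, candidate_eps):
    """Pairs (eps_k, p_hat_k) for each candidate tolerance."""
    return [(eps, posterior_mass(distances, eps)) for eps in candidate_eps]
```

Under these assumptions, prior-mass calibration corresponds to `smallest_eps_with_mass(prior_draws, delta)`, the reference-study rule to `smallest_eps_with_mass(reference_posterior_draws, 1 - alpha)`, and the sensitivity analysis to `sensitivity_curve(posterior_draws, expert_tolerances)`.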

6. Implementation Workflow

An explicit workflow for the two-sample PROTEST test is as follows:

| Step | Description | Operational Detail |
|------|-------------|--------------------|
| 1 | Run MCMC to generate $\{F_X^{(i)}, F_Y^{(i)}\}$ | Use DP mixture or other NP prior |
| 2 | Compute $\Delta_i = d^*(F_X^{(i)}, F_Y^{(i)})$ | Choice of $d^*$ as specified |
| 3 | Posterior mass $\hat{p} = (1/N) \sum_{i=1}^N \mathbf{1}\{\Delta_i \le \varepsilon\}$ | Empirical proportion |
| 4 | Declare equivalence if $\hat{p} \ge 1-\alpha$ | Output: $\hat{p}$, equivalence result |

In practice, standard DP mixture samplers are employed, as implemented in tools such as the R package protest (GitHub: rflassance/protest) (Lassance et al., 8 Mar 2024).
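The four steps can be sketched end to end in a few lines. To stay self-contained, the example below replaces the DP mixture MCMC sampler with the Bayesian bootstrap, which places Dirichlet(1, ..., 1) weights on the observed points and corresponds to the limit of a Dirichlet process posterior as the concentration parameter tends to zero. This substitution, the Kolmogorov–Smirnov distance, and all function names are assumptions for illustration; the code does not reproduce the protest package.

```python
import random


def dirichlet_weights(n, rng):
    """Dirichlet(1, ..., 1) weights, built from normalized Exp(1) variables."""
    gammas = [rng.gammavariate(1.0, 1.0) for _ in range(n)]
    total = sum(gammas)
    return [g / total for g in gammas]


def ks_distance(atoms_x, weights_x, atoms_y, weights_y):
    """sup_x |F_X(x) - F_Y(x)| for two weighted discrete distributions."""
    events = [(a, w, 0.0) for a, w in zip(atoms_x, weights_x)]
    events += [(a, 0.0, w) for a, w in zip(atoms_y, weights_y)]
    events.sort(key=lambda e: e[0])
    cdf_x = cdf_y = largest_gap = 0.0
    i = 0
    while i < len(events):
        location = events[i][0]
        while i < len(events) and events[i][0] == location:  # handle ties together
            cdf_x += events[i][1]
            cdf_y += events[i][2]
            i += 1
        largest_gap = max(largest_gap, abs(cdf_x - cdf_y))
    return largest_gap


def protest_two_sample(x, y, eps, alpha, n_draws=1000, seed=0):
    """Steps 1-4 with a Bayesian bootstrap posterior (a stand-in for DP mixture MCMC)."""
    rng = random.Random(seed)
    deltas = []
    for _ in range(n_draws):
        weights_x = dirichlet_weights(len(x), rng)  # step 1: posterior draw of F_X
        weights_y = dirichlet_weights(len(y), rng)  # step 1: posterior draw of F_Y
        deltas.append(ks_distance(x, weights_x, y, weights_y))  # step 2
    p_hat = sum(1 for d in deltas if d <= eps) / n_draws  # step 3
    return p_hat, p_hat >= 1.0 - alpha  # step 4
```

A call such as `protest_two_sample(x, y, eps=0.3, alpha=0.05)` returns the estimated posterior mass together with the equivalence decision. With a DP mixture prior, only the step 1 lines would change.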

7. Comparison with Existing Nonparametric Approaches

Holmes & Walker's PTtest computes a tail-area metric using a Pólya tree prior but bases decisions on a classical $\varepsilon$-ball; this method tends to reject equivalence as sample size grows, regardless of practical difference. In contrast, PROTEST employs a posterior-mass-in-interval criterion, remaining insensitive to overfitting of the posterior and better aligned with pragmatic thresholds. Empirical studies, including a normal vs. $t_{30}$ simulated example, illustrate that PTtest frequently over-rejects at high sample size, while PROTEST's behavior more closely matches practical equivalence constructs (Lassance et al., 8 Mar 2024).

A plausible implication is that PROTEST constitutes an automated and coherent Bayesian analogue of classical TOST in the nonparametric regime, with practical advantages in interpretability and robustness.


For a comprehensive presentation and additional illustrations, see "PROTEST: Nonparametric Testing of Hypotheses Enhanced by Experts' Utility Judgements" (Lassance et al., 8 Mar 2024).
