Nonparametric Bayesian TOST

Updated 10 December 2025

Nonparametric Bayesian TOST is a method that extends equivalence testing by incorporating Bayesian nonparametric models to assess negligible differences.
It leverages flexible priors such as Dirichlet process mixtures and MCMC sampling to allow robust inference without fixed parametric assumptions.
The PROTEST framework operationalizes this approach, ensuring consistency and improved control over type I error in equivalence testing.

Nonparametric Bayesian TOST (Two One-Sided Tests) extends the established methodology of equivalence testing from the parametric to the fully nonparametric Bayesian regime, enabling statistical inference on hypotheses about negligible differences without fixed distributional assumptions. The PROTEST framework operationalizes this extension, providing an accessible, MCMC-based nonparametric approach that parallels the logic of classical TOST procedures by assessing posterior mass within a tolerance region around the null value (Lassance et al., 2024).

1. Conceptual Foundation: Enlarged Null and TOST Analogue

Classical TOST procedures test equivalence via two one-sided tests of whether a parameter $\theta$ lies outside a given interval around a reference value $\theta_0$, whose half-width is a practical tolerance $\varepsilon$. Formally, the enlarged or pragmatic null hypothesis is defined as

$H_0^e = \{\theta: |\theta - \theta_0| \le \varepsilon\}$

as opposed to the point null $H_0: \theta = \theta_0$. The TOST logic requires rejection of both $H_{01}: \theta \le \theta_0 - \varepsilon$ and $H_{02}: \theta \ge \theta_0 + \varepsilon$, each at level $\alpha$; equivalently, it declares equivalence if the $1-2\alpha$ confidence interval falls entirely within $[\theta_0 - \varepsilon, \theta_0 + \varepsilon]$.
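The correspondence between the two one-sided tests and the interval check can be sketched in Python. This is a minimal large-sample illustration that assumes a normal (z) approximation for a sample mean rather than the t distribution; the function name `tost_z` and the sample values are hypothetical. Two one-sided tests at level $\alpha$ correspond to a two-sided interval with coverage $1-2\alpha$.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def tost_z(x, theta0, eps, alpha=0.05):
    """Large-sample (z-based) TOST for a mean.

    Returns (both_rejected, ci_inside_band, ci).
    """
    est = mean(x)
    se = stdev(x) / sqrt(len(x))
    z = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value

    # H01: theta <= theta0 - eps, rejected when the estimate is far above the lower bound
    reject_lower = (est - (theta0 - eps)) / se > z
    # H02: theta >= theta0 + eps, rejected when the estimate is far below the upper bound
    reject_upper = ((theta0 + eps) - est) / se > z

    # The same decision expressed through the (1 - 2*alpha) confidence interval
    ci = (est - z * se, est + z * se)
    ci_inside_band = theta0 - eps < ci[0] and ci[1] < theta0 + eps
    return reject_lower and reject_upper, ci_inside_band, ci


# Hypothetical measurements centred near zero
sample = [0.1, -0.2, 0.05, 0.0, -0.1, 0.15, -0.05, 0.02]
print(tost_z(sample, theta0=0.0, eps=0.5))
```

With only eight observations the z approximation is crude; a t-based critical value would be the usual choice at this sample size.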

In a Bayesian formulation, the CI-in-interval criterion is replaced with the posterior probability that $\theta$ lies within the tolerance region, $P(|\theta - \theta_0| \le \varepsilon \mid x)$; equivalence is declared if this posterior mass exceeds $1-\alpha$. This aligns the Bayesian decision rule directly with the TOST logic (Lassance et al., 2024).
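Before moving to the nonparametric case, the posterior-mass rule can be illustrated in a parametric setting where the posterior is available in closed form. The sketch below assumes a normal likelihood with known standard deviation and a conjugate normal prior on $\theta$; these modelling choices, the function name `posterior_mass_in_band`, and the data are assumptions made for illustration only.

```python
from math import sqrt
from statistics import NormalDist, mean


def posterior_mass_in_band(x, theta0, eps, sigma=1.0, prior_mean=0.0, prior_sd=10.0):
    """P(|theta - theta0| <= eps | x) for x_i ~ N(theta, sigma^2), theta ~ N(prior_mean, prior_sd^2)."""
    n = len(x)
    # Conjugate update: precisions add, means are precision-weighted
    precision = 1 / prior_sd**2 + n / sigma**2
    post_mean = (prior_mean / prior_sd**2 + n * mean(x) / sigma**2) / precision
    posterior = NormalDist(post_mean, sqrt(1 / precision))
    return posterior.cdf(theta0 + eps) - posterior.cdf(theta0 - eps)


# Hypothetical measurements with an assumed known noise level of 0.1
sample = [0.1, -0.2, 0.05, 0.0, -0.1, 0.15, -0.05, 0.02]
alpha = 0.05
mass = posterior_mass_in_band(sample, theta0=0.0, eps=0.1, sigma=0.1)
print(mass, mass >= 1 - alpha)
```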

2. Nonparametric Bayesian Model Structure

The nonparametric Bayesian approach instantiates this expanded equivalence logic without parametric restrictions by modeling data distributions through flexible priors such as Dirichlet process mixtures, Pólya tree priors, or Gaussian process priors. In the two-sample setting, suppose $(X_i)_{i=1}^n \sim F_X$ and $(Y_j)_{j=1}^m \sim F_Y$ are two samples:

Likelihood: $X_i \mid F_X \overset{iid}{\sim} F_X$, $Y_j \mid F_Y \overset{iid}{\sim} F_Y$.
Nonparametric Priors: Common examples include:
- Independent Dirichlet process mixtures:
$G_X \sim DP(\alpha_0, G_0), \quad F_X(x) = \int K(x \mid \phi)\, dG_X(\phi)$

with an analogous specification for $F_Y$.
- Dependent Dirichlet processes for paired samples, or Gaussian process (GP) priors over densities.
Posterior Sampling: Posterior draws $(F_X^{(i)}, F_Y^{(i)})$, $i = 1, \dots, N$, are obtained via standard DP mixture MCMC algorithms (e.g., Chinese restaurant process, Pólya urn, or truncated stick-breaking Gibbs samplers).
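To make the notion of a posterior draw $F_X^{(i)}$ concrete, the sketch below uses a simpler model than the DP mixture above: a plain Dirichlet process prior placed directly on $F_X$. In that conjugate special case the posterior is again a Dirichlet process, $DP(\alpha_0 + n, (\alpha_0 G_0 + \sum_i \delta_{x_i})/(\alpha_0 + n))$, so a draw can be generated by truncated stick-breaking without MCMC. The base measure $G_0 = N(0, 1)$, the truncation level, and the function name `draw_dp_posterior` are assumptions for illustration; this is not the DP mixture sampler itself.

```python
import random


def draw_dp_posterior(data, alpha0=1.0, base_sampler=None, truncation=1000, rng=random):
    """One truncated stick-breaking draw (atoms, weights) from the DP posterior given data."""
    if base_sampler is None:
        base_sampler = lambda: rng.gauss(0.0, 1.0)  # assumed base measure G0 = N(0, 1)
    conc = alpha0 + len(data)  # posterior concentration parameter
    atoms, weights, remaining = [], [], 1.0
    for _ in range(truncation):
        v = rng.betavariate(1.0, conc)  # stick-breaking proportion
        weights.append(remaining * v)
        remaining *= 1.0 - v
        # Posterior base measure: G0 with probability alpha0 / (alpha0 + n),
        # otherwise one of the observed data points chosen uniformly
        if rng.random() < alpha0 / conc:
            atoms.append(base_sampler())
        else:
            atoms.append(rng.choice(data))
    total = sum(weights)
    weights = [w / total for w in weights]  # renormalise to absorb the truncated tail
    return atoms, weights


# Hypothetical data; each call yields one random discrete distribution F_X^{(i)}
observations = [0.1, -0.2, 0.05, 0.0, -0.1, 0.15, -0.05, 0.02]
atoms, weights = draw_dp_posterior(observations, rng=random.Random(0))
print(len(atoms), sum(weights))
```

The truncation level should grow with $\alpha_0 + n$, since the expected leftover stick mass after $K$ breaks is $((\alpha_0 + n)/(\alpha_0 + n + 1))^K$.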

3. Posterior Probability of Equivalence

Equivalence for distributions is operationalized via a distance function $d^*(F_X, F_Y)$ . Common choices include:

Kolmogorov–Smirnov-style: $d^*(F_X, F_Y) = \sup_x |F_X(x) - F_Y(x)|$.
Classifier-based:

$d^*_C(F_X, F_Y) = 0.5\{\Pr_{Z\sim F_X}[f_X(Z)/f_Y(Z) > 1] + \Pr_{Z\sim F_Y}[f_Y(Z)/f_X(Z) > 1]\} - 0.5$
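For intuition about the scale of the classifier-based distance, it can be estimated by Monte Carlo when both densities are known. The sketch below assumes two Gaussian distributions represented by `statistics.NormalDist`; the function name `classifier_distance` and the chosen parameters are illustrative. For two Gaussians with a common standard deviation $\sigma$, the exact value is $\Phi(|\mu_X - \mu_Y|/(2\sigma)) - 0.5$, which provides a check on the estimate.

```python
import random
from statistics import NormalDist


def classifier_distance(fx, fy, n_draws=20000, rng=random):
    """Monte Carlo estimate of d*_C for two NormalDist objects with known densities."""
    draws_x = [rng.gauss(fx.mean, fx.stdev) for _ in range(n_draws)]
    draws_y = [rng.gauss(fy.mean, fy.stdev) for _ in range(n_draws)]
    # Pr_{Z ~ F_X}[f_X(Z) / f_Y(Z) > 1]: the density-ratio rule labels an X draw correctly
    px = sum(fx.pdf(z) > fy.pdf(z) for z in draws_x) / n_draws
    # Pr_{Z ~ F_Y}[f_Y(Z) / f_X(Z) > 1]: the same rule labels a Y draw correctly
    py = sum(fy.pdf(z) > fx.pdf(z) for z in draws_y) / n_draws
    return 0.5 * (px + py) - 0.5


# Unit-variance Gaussians one standard deviation apart (illustrative values)
estimate = classifier_distance(NormalDist(0, 1), NormalDist(1, 1), rng=random.Random(0))
exact = NormalDist().cdf(0.5) - 0.5
print(estimate, exact)
```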

The enlarged null is then

$H_0^e = \{(F_X, F_Y): d^*(F_X, F_Y) \le \varepsilon\}$

For each posterior draw $i = 1, \dots, N$, set $\Delta_i := d^*(F_X^{(i)}, F_Y^{(i)})$; the estimated posterior mass in the enlarged null is

$\hat{p} = \frac{1}{N} \sum_{i=1}^N \mathbf{1}\{\Delta_i \le \varepsilon\} \approx \Pr(d^*(F_X, F_Y)\le\varepsilon\,|\,\text{data})$

This operationalizes the Bayesian equivalence assessment fully nonparametrically (Lassance et al., 2024).
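The estimator $\hat{p}$ is a simple proportion over the posterior draws. A minimal sketch, with a hypothetical function name `posterior_mass` and made-up distance values, is given below. The Monte Carlo standard error it reports assumes roughly independent draws; autocorrelated MCMC output would make the true error larger.

```python
from math import sqrt


def posterior_mass(deltas, eps):
    """Proportion of posterior distance draws at or below eps, with a naive Monte Carlo SE."""
    n = len(deltas)
    p_hat = sum(d <= eps for d in deltas) / n
    mc_se = sqrt(p_hat * (1 - p_hat) / n)  # assumes approximately independent draws
    return p_hat, mc_se


# Hypothetical distances Delta_i computed from ten posterior draws
deltas = [0.02, 0.05, 0.08, 0.11, 0.03, 0.04, 0.09, 0.06, 0.07, 0.01]
p_hat, mc_se = posterior_mass(deltas, eps=0.08)
print(p_hat, mc_se)
```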

4. Decision Criterion and Consistency Properties

The equivalence decision follows directly:

Select a level $\alpha$ .
Declare equivalence if $\hat{p} \ge 1-\alpha$ (i.e., high posterior mass falls within the equivalence region).
Otherwise, withhold equivalence.

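Conversely, the enlarged null itself is rejected if $\hat{p} \le \alpha$. This is the mirror-image decision rather than a restatement of the equivalence rule: for $\alpha < 0.5$, intermediate values $\alpha < \hat{p} < 1-\alpha$ support neither conclusion. The equivalence rule mirrors TOST's demand for high confidence that the parameter lies within the tolerance region, with posterior probability playing the role of confidence.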
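The two thresholds in this section can be combined into one small function. The sketch below is an illustration rather than the package's interface: the function name `protest_decision` and the three outcome labels are hypothetical, and the restriction $\alpha < 0.5$ is an added assumption that keeps the two thresholds from overlapping.

```python
def protest_decision(p_hat, alpha=0.05):
    """Map the posterior mass in the enlarged null to a decision label."""
    if not 0 < alpha < 0.5:
        raise ValueError("alpha must lie in (0, 0.5) so the two thresholds do not overlap")
    if p_hat >= 1 - alpha:
        return "declare equivalence"   # high posterior mass inside the tolerance region
    if p_hat <= alpha:
        return "reject enlarged null"  # almost no posterior mass inside the region
    return "withhold equivalence"      # neither threshold is met


for value in (0.99, 0.50, 0.01):
    print(value, protest_decision(value))
```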

PROTEST is consistent: if the true distributions satisfy $d^*(F_X, F_Y) < \varepsilon$, then as $n, m \to \infty$ the posterior mass inside $\{d^* \le \varepsilon\}$ converges to one and the procedure declares equivalence. If the true distance exceeds $\varepsilon$, the posterior mass inside the $\varepsilon$-ball vanishes and equivalence is not declared. Both statements follow from the posterior concentrating around the true distributions, a Bernstein–von Mises-type phenomenon. In simulations, the PTtest (Holmes & Walker, 2015), which uses Pólya tree priors, often over-rejects at large $n$ even when the true difference is negligible, whereas PROTEST's criterion remains stable (Lassance et al., 2024).

5. Selection of Tolerance ε

Determining $\varepsilon$ is central. PROTEST outlines two main strategies:

Direct elicitation:
- Theory or measurement-error bound: Set $\varepsilon$ equal to a known measurement error $\delta$.
- Prior-mass calibration: Select a small prior probability $\delta$ (e.g., $\delta = \alpha$; unrelated to the measurement error above), and pick $\varepsilon$ such that the prior probability of $d^* \le \varepsilon$ is $\delta$.
- Reference study: Choose $\varepsilon$ as the smallest value that would have declared equivalence on a key reference dataset at level $\alpha$.
Sensitivity or bounding:
- Analyze results across candidate tolerances $\varepsilon_1, \dots, \varepsilon_K$ elicited from multiple experts.
- Report the posterior mass $\hat{p}_k$ for each $\varepsilon_k$, and plot the curve $\varepsilon \mapsto \hat{p}(\varepsilon)$ to show how robust the conclusion is to the choice of tolerance.
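Several of these strategies reduce to order statistics of sampled distances. The sketch below illustrates three of them under stated assumptions: `deltas` stands for posterior draws of $d^*$ on a reference dataset, `prior_deltas` for draws of $d^*$ simulated from the prior, and all function names are hypothetical. The smallest $\varepsilon$ with $\hat{p}(\varepsilon) \ge 1-\alpha$ is the $\lceil (1-\alpha) N \rceil$-th smallest draw.

```python
from math import ceil


def mass_curve(deltas, eps_grid):
    """Posterior mass p_hat(eps) for each candidate tolerance (sensitivity analysis)."""
    n = len(deltas)
    return [(eps, sum(d <= eps for d in deltas) / n) for eps in eps_grid]


def smallest_equivalence_eps(deltas, alpha=0.05):
    """Reference-study rule: smallest eps for which p_hat(eps) >= 1 - alpha."""
    ordered = sorted(deltas)
    k = ceil((1 - alpha) * len(ordered) - 1e-12)  # small guard against float round-off
    return ordered[k - 1]


def prior_mass_eps(prior_deltas, delta=0.05):
    """Prior-mass calibration: eps with about a fraction delta of prior draws at or below it."""
    ordered = sorted(prior_deltas)
    k = max(1, ceil(delta * len(ordered) - 1e-12))
    return ordered[k - 1]


# Hypothetical distance draws
deltas = [0.02, 0.05, 0.08, 0.11, 0.03, 0.04, 0.09, 0.06, 0.07, 0.01]
print(mass_curve(deltas, [0.05, 0.08, 0.20]))
print(smallest_equivalence_eps(deltas, alpha=0.2))
print(prior_mass_eps(deltas, delta=0.2))
```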

6. Implementation Workflow

An explicit workflow for the two-sample PROTEST test is as follows:

| Step | Description | Operational Detail |
| --- | --- | --- |
| 1 | Run MCMC to generate $\{F_X^{(i)}, F_Y^{(i)}\}_{i=1}^N$ | Use a DP mixture or other nonparametric prior |
| 2 | Compute $\Delta_i = d^*(F_X^{(i)}, F_Y^{(i)})$ | Choice of $d^*$ as specified |
| 3 | Posterior mass $\hat{p} = \frac{1}{N} \sum_{i=1}^N \mathbf{1}\{\Delta_i \le \varepsilon\}$ | Empirical proportion |
| 4 | Declare equivalence if $\hat{p} \ge 1-\alpha$ | Output: $\hat{p}$, equivalence result |

In practice, standard DP mixture samplers are employed, as implemented in tools such as the R package protest (GitHub: rflassance/protest) (Lassance et al., 2024).
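The four steps of the workflow can be sketched end to end in Python. To stay short and dependency-free, this illustration replaces the DP mixture sampler with the Bayesian bootstrap (Dirichlet(1, ..., 1) weights on the observed points, the limit of a plain DP posterior as $\alpha_0 \to 0$) and uses the Kolmogorov–Smirnov distance. It does not reproduce the interface of the R package; all function names, the seed, and the simulated data are assumptions.

```python
import random


def bayes_bootstrap_weights(n, rng):
    """Dirichlet(1, ..., 1) weights, generated by normalising Gamma(1, 1) variates."""
    g = [rng.gammavariate(1.0, 1.0) for _ in range(n)]
    total = sum(g)
    return [v / total for v in g]


def ks_distance(x, wx, y, wy):
    """sup |F_X - F_Y| for two weighted empirical CDFs."""
    events = sorted([(v, w, 0.0) for v, w in zip(x, wx)] +
                    [(v, 0.0, w) for v, w in zip(y, wy)])
    fx = fy = best = 0.0
    i = 0
    while i < len(events):
        point = events[i][0]
        # Apply every jump located at the same point before comparing the CDFs
        while i < len(events) and events[i][0] == point:
            fx += events[i][1]
            fy += events[i][2]
            i += 1
        best = max(best, abs(fx - fy))
    return best


def protest_two_sample(x, y, eps, alpha=0.05, n_draws=2000, seed=0):
    """Steps 1-4: draw (F_X, F_Y), compute Delta_i, estimate p_hat, decide."""
    rng = random.Random(seed)
    deltas = [ks_distance(x, bayes_bootstrap_weights(len(x), rng),
                          y, bayes_bootstrap_weights(len(y), rng))
              for _ in range(n_draws)]
    p_hat = sum(d <= eps for d in deltas) / n_draws
    return p_hat, p_hat >= 1 - alpha


# Simulated data: two samples from the same distribution, then a shifted copy
data_rng = random.Random(1)
x = [data_rng.gauss(0, 1) for _ in range(200)]
y = [data_rng.gauss(0, 1) for _ in range(200)]
print(protest_two_sample(x, y, eps=0.3, n_draws=300))
print(protest_two_sample(x, [v + 3 for v in y], eps=0.3, n_draws=300))
```

Because the Bayesian bootstrap only reweights observed points, it cannot smooth or extrapolate the way a DP mixture does; it is used here purely to keep the example self-contained.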

7. Comparison with Existing Nonparametric Approaches

Holmes & Walker's PTtest computes a tail-area metric using a Pólya tree prior but bases decisions on a classical $\varepsilon$-ball; as a result, it tends to reject equivalence as the sample size grows, even when the difference is practically negligible. In contrast, PROTEST employs a posterior-mass-in-interval criterion, so its decision does not become more sensitive to negligible differences as the posterior sharpens, and it is better aligned with pragmatic thresholds. Empirical studies, including a simulated comparison of a normal distribution with a $t_{30}$ distribution, illustrate that PTtest frequently over-rejects at large sample sizes, while PROTEST's behavior more closely matches practical notions of equivalence (Lassance et al., 2024).

A plausible implication is that PROTEST constitutes an automated and coherent Bayesian analogue of classical TOST in the nonparametric regime, with practical advantages in interpretability and robustness.

For a comprehensive presentation and additional illustrations, see "PROTEST: Nonparametric Testing of Hypotheses Enhanced by Experts' Utility Judgements" (Lassance et al., 2024).


References

Lassance et al. (2024). PROTEST: Nonparametric Testing of Hypotheses Enhanced by Experts' Utility Judgements.
