Sequential Bayes Factor Designs

Updated 13 January 2026

Sequential Bayes Factor Designs are statistical frameworks that use Bayes factors for adaptive decision-making in sequential studies, allowing early stopping for efficacy or futility.
They employ analytical and numerical methods, including closed-form beta functions and MVN integration, to calibrate stopping boundaries while controlling error rates.
These designs reduce computational demands by eliminating intensive Monte Carlo simulations and are versatile for various endpoints and multi-stage trials.

Sequential Bayes Factor Designs are statistical frameworks that incorporate Bayes factors as the primary evidence metric for decision-making throughout the conduct of clinical trials, animal experiments, and other types of sequential studies. These designs facilitate early stopping for efficacy or futility and exploit closed-form or efficient numerical approaches to avoid computationally intensive Monte Carlo simulations. Structured to maintain operating characteristics such as Type I error and power, Sequential Bayes Factor Designs can be specialized (e.g., two-stage, group sequential) and generalized to various endpoints and multi-stage settings (Kelter et al., 28 Nov 2025, Pawel et al., 6 Jan 2026).

1. Bayes Factors and Sequential Hypothesis Testing

The Bayes factor (BF) is the ratio of marginal likelihoods for two competing hypotheses, $H_0$ and $H_1$:

$\mathrm{BF}_{01} = \frac{p(D \mid H_0)}{p(D \mid H_1)}$

BFs update prior odds into posterior odds after observing data $D$. In sequential settings, BF thresholds are applied at pre-planned interim analyses to make stop/go decisions. For z-tests with normally distributed outcomes, BFs can be expressed as functions of the summary z-statistic:

For point-null vs. point-alternative:

$\mathrm{BF}_{01} = \exp\left(\frac{\mu^2}{2\sigma^2} - \frac{z\mu}{\sigma}\right)$
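As a worked step, the expression above follows from a ratio of two normal densities. The setup here is an assumption made for illustration, since the symbols are not defined in the text: an effect estimate $\hat\theta \mid \theta \sim \mathrm{N}(\theta, \sigma^2)$ with standard error $\sigma$, hypotheses $H_0\colon \theta = 0$ and $H_1\colon \theta = \mu$, and $z = \hat\theta/\sigma$.

$\mathrm{BF}_{01} = \frac{\mathrm{N}(\hat\theta \mid 0, \sigma^2)}{\mathrm{N}(\hat\theta \mid \mu, \sigma^2)} = \exp\left(-\frac{\hat\theta^2}{2\sigma^2} + \frac{(\hat\theta - \mu)^2}{2\sigma^2}\right) = \exp\left(\frac{\mu^2}{2\sigma^2} - \frac{\hat\theta\mu}{\sigma^2}\right) = \exp\left(\frac{\mu^2}{2\sigma^2} - \frac{z\mu}{\sigma}\right)$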

Efficient computation and mapping of BF thresholds to z-thresholds allow the specification of stopping rules without simulation (Pawel et al., 6 Jan 2026).
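The mapping from a BF threshold to a z-threshold can be sketched by inverting the point-null vs. point-alternative formula above. This is a minimal illustration, not the implementation of any package: the function names `bf01_point` and `z_boundary` and the numerical values of `mu` and `sigma` are hypothetical, and $\mu \neq 0$ is assumed.

```python
from math import exp, log


def bf01_point(z, mu, sigma):
    """BF01 for a point null vs. a point alternative mu, as a function of z."""
    return exp(mu**2 / (2 * sigma**2) - z * mu / sigma)


def z_boundary(k, mu, sigma):
    """z-value at which BF01 equals the threshold k (requires mu != 0).

    Solves mu^2 / (2 sigma^2) - z mu / sigma = log(k) for z.
    """
    return mu / (2 * sigma) - (sigma / mu) * log(k)


# Hypothetical inputs: alternative mu = 1.0, standard error sigma = 0.5.
mu, sigma = 1.0, 0.5
z_efficacy = z_boundary(1 / 10, mu, sigma)  # BF01 <= 1/10 when z >= z_efficacy (mu > 0)
z_futility = z_boundary(10, mu, sigma)      # BF01 >= 10 when z <= z_futility (mu > 0)
print(z_efficacy, z_futility)
```

For $\mu > 0$ the BF is decreasing in $z$, so the efficacy boundary lies above the futility boundary and the region between them corresponds to continuing the study.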

2. Two-Stage and Group Sequential Design Schemes

Sequential Bayes Factor Designs may be instantiated as:

Two-stage designs: A single interim analysis leads to one of three outcomes: stopping early for futility, stopping early with evidence for efficacy, or continuing to the final analysis. The design uses trinomial tree branching, with three paths following the interim: "efficacy," "indecisive," and "futility." Only the "indecisive" branch proceeds to the second stage. Design calibration and error correction are performed by summing prior-predictive probabilities across branches (Kelter et al., 28 Nov 2025).
Group-sequential designs: Multiple interim analyses are conducted at cumulative sample sizes $n_1 < \dots < n_K$. Stopping boundaries are derived by mapping BF thresholds to critical values of the cumulative z-statistics. Operating characteristics, including the probability of stopping and the expected sample size, can be calculated via multivariate normal integration over the joint distribution of $(Z_1, \dots, Z_K)$ (Pawel et al., 6 Jan 2026).
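For the smallest group-sequential case, $K = 2$, the joint distribution of $(Z_1, Z_2)$ can be integrated with one-dimensional quadrature, because $Z_2$ given $Z_1$ is again normal. The sketch below is illustrative only and rests on assumptions not stated in the text: unit-variance observations with mean `delta`, z-boundaries `a1 < b1` at the interim and `c2` at the final look (taken as already mapped from BF thresholds), and Simpson's rule in place of a dedicated MVN routine. The function name `two_look_probs` is hypothetical.

```python
from math import erfc, exp, pi, sqrt


def norm_pdf(x):
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)


def norm_cdf(x):
    return 0.5 * erfc(-x / sqrt(2.0))


def two_look_probs(delta, n1, n2, a1, b1, c2, steps=2000):
    """Stopping probabilities for two looks at cumulative sizes n1 < n2.

    Look 1: stop for futility if Z1 <= a1, for efficacy if Z1 >= b1.
    Look 2: claim efficacy if Z2 >= c2. `steps` must be even (Simpson's rule).
    """
    m1 = delta * sqrt(n1)                 # mean of Z1
    p_fut1 = norm_cdf(a1 - m1)
    p_eff1 = 1.0 - norm_cdf(b1 - m1)
    d = n2 - n1                           # observations added in stage 2

    def integrand(z1):
        # density of Z1 times P(Z2 >= c2 | Z1 = z1)
        crit = (c2 * sqrt(n2) - z1 * sqrt(n1)) / sqrt(d) - delta * sqrt(d)
        return norm_pdf(z1 - m1) * (1.0 - norm_cdf(crit))

    h = (b1 - a1) / steps
    total = integrand(a1) + integrand(b1)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(a1 + i * h)
    p_eff2 = total * h / 3.0              # continue at look 1, efficacy at look 2

    p_continue = 1.0 - p_fut1 - p_eff1
    return {
        "efficacy_look1": p_eff1,
        "futility_look1": p_fut1,
        "efficacy_look2": p_eff2,
        "expected_n": n1 * (p_fut1 + p_eff1) + n2 * p_continue,
    }


# Hypothetical design: looks at n = 50 and 100, boundaries chosen for illustration.
print(two_look_probs(delta=0.3, n1=50, n2=100, a1=0.0, b1=2.5, c2=2.0))
```

With more looks, the same conditioning argument applies recursively, which is where dedicated multivariate normal integration routines become the practical choice.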

3. Calibration of Operating Characteristics

Calibration involves selecting the sample sizes and BF thresholds such that prespecified constraints on Type I error and power are met. For two-stage trials:

Type I error under $H_0$: $\hat\alpha(n_1, n_2) = P_{H_0}\left[\mathrm{BF}_{01}^{(n_2)} < k \text{ and no futility stop}\right] \leq \alpha$
Power under $H_1$: $1-\hat\beta(n_1, n_2) = P_{H_1}\left[\mathrm{BF}_{01}^{(n_2)} < k \text{ and no futility stop}\right] \geq 1-\beta$

Adjustment for interim stopping is performed by subtracting the probability mass of pruned paths from the unconditional probabilities (see Theorem 1 in Kelter et al., 28 Nov 2025). In group-sequential designs, stopping probabilities and expected sample size are computed by integrating over the multivariate normal density defined by the information schedule.

A design search is implemented by grid search and root-finding over $(n_1, n_2)$ ranges and threshold parameters. These calculations use closed-form beta functions for binomial endpoints and MVN integration for normally distributed or t-test outcomes (Kelter et al., 28 Nov 2025, Pawel et al., 6 Jan 2026).
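The calibration loop can be sketched for a binary endpoint as an exhaustive grid search over $(n_1, n_2)$. This is a simplified illustration under stated assumptions, not the published algorithm: point priors `p0` and `p1` replace the beta priors (so the marginal likelihoods are plain binomial terms), the BF thresholds `k` and `k_f` are held fixed, only futility stopping is allowed at the interim, $n_2$ is treated as the cumulative sample size, and the design minimising $E[N \mid H_0]$ is selected. All function names and numerical inputs are hypothetical.

```python
from math import comb, log


def log_bf01(x, n, p0, p1):
    """log BF01 for point null p0 vs. point alternative p1 (x responders out of n)."""
    return x * log(p0 / p1) + (n - x) * log((1 - p0) / (1 - p1))


def pmf_table(n, p):
    """Binomial(n, p) probabilities for x = 0..n."""
    return [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]


def two_stage(p, n1, n2, p0, p1, k, k_f):
    """Return (P[BF01 at n2 < k and no futility stop], P[futility stop at n1])
    when the true response rate is p."""
    stage1 = pmf_table(n1, p)
    stage2 = pmf_table(n2 - n1, p)
    success = [log_bf01(x, n2, p0, p1) < log(k) for x in range(n2 + 1)]
    p_success = p_futile = 0.0
    for x1, w1 in enumerate(stage1):
        if log_bf01(x1, n1, p0, p1) > log(k_f):
            p_futile += w1  # pruned branch: the trial stops for futility
            continue
        p_success += w1 * sum(w2 for x2, w2 in enumerate(stage2) if success[x1 + x2])
    return p_success, p_futile


def search(p0, p1, k, k_f, alpha, beta, max_n):
    """Grid search minimising E[N | H0] subject to the error-rate constraints."""
    best = None
    for n2 in range(2, max_n + 1):
        for n1 in range(1, n2):
            type1, futile0 = two_stage(p0, n1, n2, p0, p1, k, k_f)
            if type1 > alpha:
                continue
            power, _ = two_stage(p1, n1, n2, p0, p1, k, k_f)
            if power < 1 - beta:
                continue
            en0 = n1 * futile0 + n2 * (1 - futile0)
            if best is None or en0 < best["en0"]:
                best = {"n1": n1, "n2": n2, "alpha": type1, "power": power, "en0": en0}
    return best


# Hypothetical inputs chosen only to exercise the search.
design = search(p0=0.2, p1=0.4, k=1 / 10, k_f=3.0, alpha=0.1, beta=0.2, max_n=60)
print(design)
```

Skipping the futility branch inside `two_stage` is the "pruning" step: those paths contribute nothing to the rejection probability, which is equivalent to subtracting their mass from the unconditional single-stage probability.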

4. Analytical and Computational Formulations

All necessary binomial–beta integrals for the two-stage design reduce to sums of beta functions or regularized incomplete beta functions. For group-sequential designs, BF boundaries are mapped to critical z-values, and stopping regions are characterized in z-space.

Expected sample size under hypothesis $H_i$:

$E[N \mid H_i] = n_1\,P_{H_i}[\text{stop at stage 1}] + n_2\,P_{H_i}[\text{continue}]$

Power and error rates in group-sequential designs are obtained by summing MVN-integrated probabilities over all stopping sets $S_k^1$ (for $H_1$) and $S_k^0$ (for $H_0$).

Implementation does not require Monte Carlo sampling; all design characteristics can be obtained with standard root-finding and sum/integral evaluations, feasible for $n_2 \leq 200$ in two-stage cases and up to dozens of interims with MVN integration (Kelter et al., 28 Nov 2025, Pawel et al., 6 Jan 2026).

Design Aspect	Two-Stage (Trinomial Tree)	Group Sequential (MVN)
Stopping boundaries	BF thresholds at interim/final	z-boundaries mapped from BFs
Error/power calibration	Closed-form sums and pruning	Multivariate normal integration
Computing expectations	Beta/incomplete beta functions	MVN integration routines
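To illustrate how a binomial–beta integral collapses to beta functions, the sketch below computes a Bayes factor for a binomial endpoint. The hypothesis pair is an assumption made for illustration and may differ from the one used in the cited design: a point null $p = p_0$ against a $\mathrm{Beta}(a, b)$ prior on $p$ under $H_1$. The name `bf01_beta_binomial` is hypothetical.

```python
from math import exp, lgamma, log


def log_beta(a, b):
    """log of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def bf01_beta_binomial(x, n, p0, a=1.0, b=1.0):
    """BF01 for H0: p = p0 vs. H1: p ~ Beta(a, b), given x responders out of n.

    The binomial coefficient cancels in the ratio, leaving
    p0^x (1 - p0)^(n - x) divided by B(a + x, b + n - x) / B(a, b).
    """
    log_m0 = x * log(p0) + (n - x) * log(1 - p0)
    log_m1 = log_beta(a + x, b + n - x) - log_beta(a, b)
    return exp(log_m0 - log_m1)


# Hypothetical data: 3 responders out of 10, null response rate 0.3, uniform prior.
print(bf01_beta_binomial(3, 10, 0.3))
```

Working on the log scale with `lgamma` avoids overflow in the beta functions at larger sample sizes.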

5. Generalizations and Special Cases

Simon’s optimal two-stage design is recovered as a special case when flat point-priors are used for binary endpoints and specific BF thresholds are chosen (e.g., $k = 1/3$ , $k_f = 3$ ) (Kelter et al., 28 Nov 2025).
Trinomial branching and MVN integration approaches generalize to settings beyond binary endpoints, including Poisson counts, normal means, and time-to-event data via piecewise exponential models. The formalism is endpoint-agnostic, with analytic formulas replaced by corresponding likelihood/prior mixtures.
Designs may be extended to $K$ stages, with interim analyses handled via $(K+1)$-nomial tree pruning and cumulative computation of efficacy/futility probabilities. For higher $K$, recursive or numerical summation is required (Kelter et al., 28 Nov 2025).

6. Practical Implementation and Recommendations

Implementation notes emphasize that the required sums and integrals are inexpensive for typical design sizes, that root-finding for threshold calibration is fast, and that monotonicity checks should be used to ensure valid operating characteristics across possible interim boundary choices. The R package bfpwr provides functions for setting up and evaluating sequential BF designs, including printing stage-wise probabilities, $E[N]$, $\operatorname{Var}(N)$, and characteristic curves (Pawel et al., 6 Jan 2026).

Guidelines for practitioners include:

BF thresholds $(k_0 = 10, k_1 = 1/10)$ represent "strong" evidence, with alternatives using less or more stringent criteria as warranted.
Two or three interim looks capture most efficiency benefits.
Design priors should reflect genuine uncertainty for realistic average behavior.
Interpretation: a final $\mathrm{BF}_{01} < 1$ supports $H_1$, while $\mathrm{BF}_{01} > 1$ supports $H_0$; magnitudes quantify evidential strength (Pawel et al., 6 Jan 2026).

7. Contextual Applications and Illustrative Examples

Applications span clinical trials (binary endpoints, group sequential), animal experiments (sequential mean comparisons), and psychological studies (Bayesian t-tests with Cauchy priors). Specific examples indicate that simulation-free calibration via analytical or MVN approaches agrees with traditional Monte Carlo results while being computationally more efficient:

Clinical trial with log-odds-ratio $\theta = \ln 3 \approx 1.10$: three looks $(n_k = 25, 50, 75)$ with thresholds $(k_1 = 1/10, k_0 = 10)$ yield $E[N] \approx 60$ and $\Pr(\text{efficacy}) \approx 82\%$ under the alternative (Pawel et al., 6 Jan 2026).
Animal experiment: efficient early stopping saves resources, with trial trajectories handled under replication priors.
Bayesian t-test: design with JZS prior and $61$ looks up to $n = 100$ per arm; $E[N] \approx 69$, $\Pr(\text{efficacy}) \approx 70\%$.

A plausible implication is that such designs enable rapid, simulation-free exploration, optimization, and reporting in domains requiring principled sequential evidence quantification. The generality and analytic tractability suggest broad applicability to experimental and clinical research requiring early decisiveness and controlled operating characteristics (Kelter et al., 28 Nov 2025, Pawel et al., 6 Jan 2026).

References (2)

The Bayesian optimal two-stage design for clinical phase II trials based on Bayes factors (2025)

Bayes Factor Group Sequential Designs (2026)
