Weighted Gap-Intersection Procedure

Updated 17 November 2025

The procedure integrates prior weights via a weighted log-likelihood ratio, adjusting evidence thresholds to privilege hypotheses based on external information.
It uses gap and intersection rules to define stopping times, ensuring rigorous family-wise error control while meeting signal count constraints.
The method achieves first-order asymptotic optimality and shows robust performance in high-dimensional and random-weight settings, outperforming unweighted approaches.

The Weighted Gap-Intersection Procedure is a sequential multiple testing algorithm designed to incorporate prior weights into each hypothesis stream, offering both strong control of the family-wise error rate (FWE) and first-order asymptotic optimality in expected stopping time. The approach formalizes the use of a weighted log-likelihood ratio (WLLR), generalizes classical sequential testing boundaries to exploit both order and magnitude gaps, and achieves robust performance even in high-dimensional and random-weight regimes. This procedure allows efficient hypothesis selection when only broad signal-count bounds are known, extending previous gap and intersection methodologies. It stands out in information-theoretic efficiency, explicit error control, and practical scalability.

1. Weighted Log-Likelihood Ratio (WLLR)

For each hypothesis index $j=1,\dots,J$ , the observed data stream $X_i^j$ follows either the null law $P_0^j$ or alternative $P_1^j$ . The standard log-likelihood ratio is

$\lambda^j(n) = \log \frac{dP_{1,n}^j}{dP_{0,n}^j}$

where $P_{i,n}^j$ is the restriction of $P_i^j$ to data up to time $n$ .

To encode prior knowledge or importance, positive weights $W_1,\dots,W_J$ are assigned a priori. The weighted log-likelihood ratio modifies the evidence process as

$\lambda_W^j(n) = \lambda^j(n) + \ln(W_j).$

This additive "head-start" shifts the boundaries for each stream, allowing the procedure to privilege hypotheses according to external information.

2. Formal Stopping and Decision Rules

The true signal set $A \subseteq \{1,\dots,J\}$ is only known to satisfy $|A| \in [l, u]$ for integers $0 \le l \le u \le J$ . At time $n$ , WLLRs are ordered

$\lambda_W^{(1)}(n) \ge \lambda_W^{(2)}(n) \ge \cdots \ge \lambda_W^{(J)}(n)$

with $\lambda_W^{(0)}(n)=+\infty$ , $\lambda_W^{(J+1)}(n)=-\infty$ .

Define the number of positive WLLRs: $p_W(n) = \#\{ j : \lambda_W^j(n) > 0 \}$ Let $a,b,c,d > 0$ be fixed thresholds chosen for FWE control.

The composite stopping time is

$T_{W,GI} = \min\{ \tau_{1,W},\ \tau_{2,W},\ \tau_{3,W} \}$

with three boundary definitions:

Intersection Rule:

$\tau_{2,W} = \inf\Big\{ n \ge 1 : l \le p_W(n) \le u,\; \lambda_W^j(n) \notin (-a, b)\ \forall j \Big\}$

Lower-Boundary Gap Rule ( $|A|=l$ ):

$\tau_{1,W} = \inf\Big\{ n \ge 1 : \lambda_W^{(l+1)}(n) \le -a,\; \lambda_W^{(l)}(n) - \lambda_W^{(l+1)}(n) \ge c \Big\}$

Upper-Boundary Gap Rule ( $|A|=u$ ):

$\tau_{3,W} = \inf\Big\{ n \ge 1 : \lambda_W^{(u)}(n) \ge b,\; \lambda_W^{(u)}(n) - \lambda_W^{(u+1)}(n) \ge d \Big\}$

At $T_{W,GI}$ , the decision set is

$D_{W,GI} = \{ j : \lambda_W^j(T_{W,GI}) > 0 \}\quad \text{truncated to } |D_{W,GI}| \in [l,u].$

This selection can involve adding indices with highest WLLR or removing lowest to meet signal count bounds.

3. Implementation Pseudocode

High-Level Outline:

Initialize n ← 1
Repeat:
    For each j: λ_W^j(n) ← λ_W^j(n-1) + ln f_1^j(X_n^j) / f_0^j(X_n^j)
    Sort λ_W^j(n) descending; calculate p_W(n)
    If stopping condition τ_{1,W} or τ_{2,W} or τ_{3,W} holds:
        Set T_{W,GI} ← n
        Break
    Else:
        n ← n + 1
After exit:
    D ← {j : λ_W^j(T_{W,GI}) > 0}
    If |D| < l: add top (l - |D|) indices
    If |D| > u: remove indices with smallest WLLRs to truncate to u

This implementation avoids explicit enumeration of hypotheses at each step and remains efficient for moderate to large

J

4. Family-Wise Error Rate (FWE) Control

Proposition: To achieve

$\sup_{A:|A| \in [l,u]} P_A(D_{W,GI} \setminus A \neq \emptyset) \le \alpha,\quad \sup_{A} P_A(A \setminus D_{W,GI} \neq \emptyset) \le \beta$

it is sufficient to set thresholds as

$\begin{aligned} b &\ge |\ln(\alpha/2)| + \ln\Bigl(\max_{A:|A| \in [l,u]} \sum_{j \in A^c} W_j \Bigr) \ a &\ge |\ln(\beta/2)| + \ln\Bigl(\max_{A:|A| \in [l,u]} \sum_{k \in A} W_k^{-1} \Bigr) \ c &\ge |\ln(\alpha/2)| + \ln\,\mathcal C_W(l,J) \ d &\ge |\ln(\beta/2)| + \ln\,\mathcal C_W(u,J) \end{aligned}$

with

$\mathcal C_W(m,J) = \max_{A:|A|=m} \Bigl(\sum_{j \in A^c} W_j\Bigr) \Bigl(\sum_{k \in A} W_k^{-1}\Bigr)$

This approach utilizes exponential tail bounds derived via Wald’s change-of-measure and union bounding over possible false inclusions or exclusions. The resulting error control extends to all signal-count compatible alternatives.

5. Asymptotic Optimality Theory

Define per-hypothesis information rates: $I_1^j = \mathbb{E}_1^j[\lambda^j(1)],\quad I_0^j = -\mathbb{E}_0^j[\lambda^j(1)]$ For $A \subseteq \{1,\dots,J\}$ ,

$\eta_1^A = \min_{k \in A} I_1^k,\quad \eta_0^A = \min_{j \in A^c} I_0^j$

The Song–Fellouris lower bound (Bose et al., 10 Nov 2025) on expected stopping time for any procedure in $\Delta_{\alpha,\beta,l,u}$ is

$L_A(\alpha,\beta;l,u) = \begin{cases} \max\left\{ \frac{|\ln\beta|}{\eta_0^A}, \frac{|\ln\alpha|}{\eta_1^A+\eta_0^A} \right\}, & |A|=l \ \max\left\{ \frac{|\ln\beta|}{\eta_0^A}, \frac{|\ln\alpha|}{\eta_1^A} \right\}, & l < |A| < u \ \max\left\{ \frac{|\ln\alpha|}{\eta_1^A}, \frac{|\ln\beta|}{\eta_0^A+\eta_1^A} \right\}, & |A|=u \end{cases}$

The Weighted Gap–Intersection procedure achieves first-order optimality: $\lim_{\alpha,\beta \to 0} \frac{\mathbb{E}_A[T_{W,GI}]}{L_A(\alpha,\beta;l,u)} = 1.$ The proof utilizes reduction to a collection of independent random walks crossing boundaries and matches the lower bound up to vanishing second-order terms.

6. High-Dimensional and Random-Weights Analysis

(a) Fixed weights, large $J$ : The following scaling conditions

$J = o(\kappa^{1/4}),\quad \max_j \ln W_j = o(\kappa),\quad -\min_j \ln W_j = o(\kappa),\quad \kappa = |\ln(\alpha \wedge \beta)| \to \infty$

ensure first-order optimality. If all $W_j$ are uniformly bounded, only $J = o(\kappa^{1/4})$ is required.

(b) Random weights: For weights $W \sim$ known law (drawn once before sampling), adaptive thresholding (e.g., $c(W) = |\ln \alpha| + \ln \mathcal C_W(l,J)$ ) guarantees conditional FWE control. Unconditional expected sample size remains optimal if

$J = o(\kappa^{1/4}),\quad \mathbb{E}[\ln \max_j W_j] = o(\kappa),\quad \mathbb{E}[-\ln \min_j W_j] = o(\kappa)$

Admissible laws include bounded, binary, log-normal, and Pareto weights provided extreme statistics grow only poly-logarithmically in $J$ .

7. Simulation Study and Empirical Behavior

Simulations conducted under the Gaussian means model $X_i^j \sim N(\mu_j, 1)$ with $\mu_j \in \{0, 0.15\}$ target error rates $\alpha = \beta = 0.05$ and signal counts to hypotheses ratios typical for high-dimensional inference ( $J = 200, 300, 400$ , $m/J = 0.1$ ).

Four weight scenarios were tested:

Unweighted: $W_j \equiv 1$
Informative: true signals receive $W = r \gg 1$ with probability favoring correct identification
Misinformative: weights anti-correlated to true signals
Noisy: weights independent, mean-1 with controlled variance

Empirical effective sample size (ESS) is lowest when weights are informative, outperforming the unweighted baseline. When weights are noisy or misleading, ESS exceeds baseline, consistent with the theoretical second-order term. This suggests the practical robustness of the procedure, but also cautions against poorly chosen weights.

Summary Table: Procedural Features

Property	Weighted Gap-Intersection	Classic Sequential (Unweighted)
Prior weights incorporated	Yes	No
Signal count bounds allowed	$l \le \|A\| \le u$	Often fixed or unconstrained
Error control (FWE)	Explicit (via thresholds)	Varies; often only Type I
High-dimensional scalability	$J=o(\kappa^{1/4})$	Typically limited
ESS matches lower bound	Yes (first-order)	Yes for gap/intersection variants

A plausible implication is that weighted procedures are especially advantageous when reliable external information about hypothesis relevance is available. In summary, the Weighted Gap-Intersection procedure generalizes multiple sequential testing to exploit prior-weighted evidence, maintains rigorous error guarantees, and achieves information-theoretic optimality in broad operational regimes (Bose et al., 10 Nov 2025).

PDF Markdown Chat (Pro)

References (1)

Weighted Asymptotically Optimal Sequential Testing (2025)

Follow Topic

Get notified by email when new papers are published related to Weighted Gap-Intersection Procedure.