
Weighted Gap-Intersection Procedure

Updated 17 November 2025
  • The procedure integrates prior weights via a weighted log-likelihood ratio, adjusting evidence thresholds to privilege hypotheses based on external information.
  • It uses gap and intersection rules to define stopping times, ensuring rigorous family-wise error control while meeting signal count constraints.
  • The method achieves first-order asymptotic optimality and shows robust performance in high-dimensional and random-weight settings, outperforming unweighted approaches.

The Weighted Gap-Intersection Procedure is a sequential multiple testing algorithm designed to incorporate prior weights into each hypothesis stream, offering both strong control of the family-wise error rate (FWE) and first-order asymptotic optimality in expected stopping time. The approach formalizes the use of a weighted log-likelihood ratio (WLLR), generalizes classical sequential testing boundaries to exploit both order and magnitude gaps, and achieves robust performance even in high-dimensional and random-weight regimes. This procedure allows efficient hypothesis selection when only broad signal-count bounds are known, extending previous gap and intersection methodologies. It stands out in information-theoretic efficiency, explicit error control, and practical scalability.

1. Weighted Log-Likelihood Ratio (WLLR)

For each hypothesis index j = 1, \dots, J, the observed data stream X_i^j follows either the null law P_0^j or the alternative P_1^j. The standard log-likelihood ratio is

\lambda^j(n) = \log \frac{dP_{1,n}^j}{dP_{0,n}^j}

where P_{i,n}^j is the restriction of P_i^j to data up to time n.

To encode prior knowledge or importance, positive weights W_1, \dots, W_J are assigned a priori. The weighted log-likelihood ratio modifies the evidence process as

\lambda_W^j(n) = \lambda^j(n) + \ln(W_j).

This additive "head-start" shifts the boundaries for each stream, allowing the procedure to privilege hypotheses according to external information.
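As a concrete illustration, consider a single stream testing N(0,1) against N(\mu,1): the per-sample LLR increment is \mu x - \mu^2/2, and the prior weight enters exactly once as the additive offset \ln W_j. This Gaussian model and the numbers below are illustrative choices, not part of the procedure's definition:

```python
import math

def wllr_path(xs, mu, weight):
    """Weighted LLR path for one Gaussian stream: H0 = N(0,1) vs H1 = N(mu,1).

    The prior weight enters exactly once, as the additive head-start
    ln(weight); each observation x then contributes mu*x - mu**2/2.
    """
    llr = math.log(weight)
    path = []
    for x in xs:
        llr += mu * x - mu ** 2 / 2
        path.append(llr)
    return path

# A stream with prior weight 4 runs ln(4) ≈ 1.386 above an unweighted
# stream seeing the same data, at every time n.
path_w = wllr_path([0.5, 1.0], mu=1.0, weight=4.0)
path_u = wllr_path([0.5, 1.0], mu=1.0, weight=1.0)
```

Because the offset is constant in n, a favorable weight only shifts where the boundary crossing happens; it never changes the shape of the evidence path.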

2. Formal Stopping and Decision Rules

The true signal set A \subseteq \{1,\dots,J\} is only known to satisfy |A| \in [l, u] for integers 0 \le l \le u \le J. At time n, the WLLRs are ordered as

\lambda_W^{(1)}(n) \ge \lambda_W^{(2)}(n) \ge \cdots \ge \lambda_W^{(J)}(n)

with the conventions \lambda_W^{(0)}(n) = +\infty and \lambda_W^{(J+1)}(n) = -\infty.

Define the number of positive WLLRs, p_W(n) = \#\{ j : \lambda_W^j(n) > 0 \}, and let a, b, c, d > 0 be fixed thresholds chosen for FWE control.

The composite stopping time is

T_{W,GI} = \min\{ \tau_{1,W},\ \tau_{2,W},\ \tau_{3,W} \}

with three boundary definitions:

  • Intersection Rule:

\tau_{2,W} = \inf\Big\{ n \ge 1 : l \le p_W(n) \le u,\; \lambda_W^j(n) \notin (-a, b)\ \forall j \Big\}

  • Lower-Boundary Gap Rule (|A| = l):

\tau_{1,W} = \inf\Big\{ n \ge 1 : \lambda_W^{(l+1)}(n) \le -a,\; \lambda_W^{(l)}(n) - \lambda_W^{(l+1)}(n) \ge c \Big\}

  • Upper-Boundary Gap Rule (|A| = u):

\tau_{3,W} = \inf\Big\{ n \ge 1 : \lambda_W^{(u)}(n) \ge b,\; \lambda_W^{(u)}(n) - \lambda_W^{(u+1)}(n) \ge d \Big\}

At TW,GIT_{W,GI}, the decision set is

D_{W,GI} = \{ j : \lambda_W^j(T_{W,GI}) > 0 \}, \quad \text{truncated so that } |D_{W,GI}| \in [l, u].

If this set is smaller than l, the excluded indices with the highest WLLRs are added; if it is larger than u, the included indices with the lowest WLLRs are removed.

3. Implementation Pseudocode

High-Level Outline:

Initialize λ_W^j(0) ← ln(W_j) for each j; n ← 1
Repeat:
    For each j: λ_W^j(n) ← λ_W^j(n-1) + ln( f_1^j(X_n^j) / f_0^j(X_n^j) )
    Sort the λ_W^j(n) in descending order; compute p_W(n)
    If stopping condition τ_{1,W}, τ_{2,W}, or τ_{3,W} holds:
        Set T_{W,GI} ← n
        Break
    Else:
        n ← n + 1
After exit:
    D ← {j : λ_W^j(T_{W,GI}) > 0}
    If |D| < l: add the (l − |D|) excluded indices with highest WLLRs
    If |D| > u: remove the indices with smallest WLLRs until |D| = u
This implementation avoids explicit enumeration of hypotheses at each step and remains efficient for moderate to large J.
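The outline above can be turned into a self-contained Python sketch. The per-sample LLR increments come from the caller's data model, the thresholds are illustrative inputs, and 1 ≤ l ≤ u < J is assumed so the gap rules are well-defined without the ±∞ conventions:

```python
import itertools

def run_wgi(streams, log_weights, l, u, a, b, c, d, max_n=10_000):
    """One run of the Weighted Gap-Intersection procedure.

    streams[j] yields per-sample LLR increments for stream j (the data
    model is the caller's choice); log_weights[j] = ln(W_j) is the
    head-start. Assumes 1 <= l <= u < J. Returns (T_{W,GI}, D_{W,GI}).
    """
    J = len(streams)
    wllr = list(log_weights)                 # start each stream at ln(W_j)
    for n in range(1, max_n + 1):
        for j in range(J):
            wllr[j] += next(streams[j])
        order = sorted(wllr, reverse=True)   # order[k-1] = lambda_W^{(k)}
        p = sum(1 for v in wllr if v > 0)

        # Intersection rule: l <= p_W(n) <= u and no WLLR inside (-a, b).
        tau2 = l <= p <= u and all(v <= -a or v >= b for v in wllr)
        # Lower-boundary gap rule (|A| = l).
        tau1 = order[l] <= -a and order[l - 1] - order[l] >= c
        # Upper-boundary gap rule (|A| = u).
        tau3 = order[u - 1] >= b and order[u - 1] - order[u] >= d

        if tau1 or tau2 or tau3:
            ranked = sorted(range(J), key=lambda j: wllr[j], reverse=True)
            D = {j for j in range(J) if wllr[j] > 0}
            if len(D) < l:                   # add highest-WLLR excluded indices
                for j in ranked:
                    if len(D) >= l:
                        break
                    D.add(j)
            elif len(D) > u:                 # keep only the u highest WLLRs
                D = set(ranked[:u])
            return n, D
    raise RuntimeError("did not stop within max_n samples")

# Two streams with deterministic unit increments: stream 0 drifts up,
# stream 1 down; both leave (-a, b) at n = 2, triggering the stop.
up, down = itertools.repeat(1.0), itertools.repeat(-1.0)
T, D = run_wgi([up, down], [0.0, 0.0], l=1, u=1, a=2.0, b=2.0, c=1.0, d=1.0)
```

The sort at each step makes the per-step cost O(J log J), which is what keeps the loop practical for the J in the hundreds used in the simulations below.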

4. Family-Wise Error Rate (FWE) Control

Proposition: To achieve

\sup_{A:|A| \in [l,u]} P_A(D_{W,GI} \setminus A \neq \emptyset) \le \alpha, \quad \sup_{A:|A| \in [l,u]} P_A(A \setminus D_{W,GI} \neq \emptyset) \le \beta

it is sufficient to set thresholds as

\begin{aligned} b &\ge |\ln(\alpha/2)| + \ln\Bigl(\max_{A:|A| \in [l,u]} \sum_{j \in A^c} W_j \Bigr) \\ a &\ge |\ln(\beta/2)| + \ln\Bigl(\max_{A:|A| \in [l,u]} \sum_{k \in A} W_k^{-1} \Bigr) \\ c &\ge |\ln(\alpha/2)| + \ln \mathcal C_W(l,J) \\ d &\ge |\ln(\beta/2)| + \ln \mathcal C_W(u,J) \end{aligned}

with

\mathcal C_W(m,J) = \max_{A:|A|=m} \Bigl(\sum_{j \in A^c} W_j\Bigr) \Bigl(\sum_{k \in A} W_k^{-1}\Bigr)

This approach uses exponential tail bounds derived via Wald's change-of-measure argument and a union bound over possible false inclusions and exclusions. The resulting error control extends to all signal-count-compatible alternatives.
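The sufficient thresholds can be computed directly from the weights. The sketch below assumes (plausibly, though the source does not state it) that each maximum over A is attained by letting A collect the smallest weights, since that simultaneously keeps \sum_{j \in A^c} W_j large and makes \sum_{k \in A} W_k^{-1} large; it also assumes 1 ≤ l ≤ u < J so no sum is empty:

```python
import math

def wgi_thresholds(weights, l, u, alpha, beta):
    """Sufficient thresholds (a, b, c, d) for FWE control at (alpha, beta).

    Assumes each maximum over A is attained by letting A be the m
    smallest weights, and that 1 <= l <= u < len(weights).
    """
    w = sorted(weights)                      # ascending

    def comp_sum(m):                         # max_|A|=m of sum_{j in A^c} W_j
        return sum(w[m:])

    def inv_sum(m):                          # max_|A|=m of sum_{k in A} 1/W_k
        return sum(1.0 / x for x in w[:m])

    def c_w(m):                              # C_W(m, J)
        return comp_sum(m) * inv_sum(m)

    b = abs(math.log(alpha / 2)) + math.log(max(comp_sum(m) for m in range(l, u + 1)))
    a = abs(math.log(beta / 2)) + math.log(max(inv_sum(m) for m in range(l, u + 1)))
    c = abs(math.log(alpha / 2)) + math.log(c_w(l))
    d = abs(math.log(beta / 2)) + math.log(c_w(u))
    return a, b, c, d

# With unit weights the sums reduce to simple counts:
a, b, c, d = wgi_thresholds([1.0] * 10, l=2, u=3, alpha=0.05, beta=0.05)
```

For unit weights and J = 10, l = 2, u = 3, the maxima reduce to counts: comp_sum peaks at 10 − 2 = 8, inv_sum at 3, and C_W(2,10) = 8 · 2 = 16, so the formulas recover the unweighted thresholds plus |ln(α/2)| or |ln(β/2)|.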

5. Asymptotic Optimality Theory

Define per-hypothesis information rates I_1^j = \mathbb{E}_1^j[\lambda^j(1)], \quad I_0^j = -\mathbb{E}_0^j[\lambda^j(1)]. For A \subseteq \{1,\dots,J\},

\eta_1^A = \min_{k \in A} I_1^k, \quad \eta_0^A = \min_{j \in A^c} I_0^j

The Song–Fellouris lower bound (Bose et al., 10 Nov 2025) on the expected stopping time of any procedure in \Delta_{\alpha,\beta,l,u} is

L_A(\alpha,\beta;l,u) = \begin{cases} \max\left\{ \frac{|\ln\beta|}{\eta_0^A}, \frac{|\ln\alpha|}{\eta_1^A+\eta_0^A} \right\}, & |A|=l \\ \max\left\{ \frac{|\ln\beta|}{\eta_0^A}, \frac{|\ln\alpha|}{\eta_1^A} \right\}, & l < |A| < u \\ \max\left\{ \frac{|\ln\alpha|}{\eta_1^A}, \frac{|\ln\beta|}{\eta_0^A+\eta_1^A} \right\}, & |A|=u \end{cases}

The Weighted Gap-Intersection procedure achieves first-order optimality: \lim_{\alpha,\beta \to 0} \frac{\mathbb{E}_A[T_{W,GI}]}{L_A(\alpha,\beta;l,u)} = 1. The proof reduces the problem to a collection of independent random walks crossing boundaries and matches the lower bound up to vanishing second-order terms.
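To make the bound concrete, a small numeric evaluation under hypothetical rates (the η values below come from the Gaussian model of Section 7, I = \mu^2/2 for \mu = 0.15; they are not figures reported in the source):

```python
import math

def lower_bound(alpha, beta, eta1, eta0, size_case):
    """Song-Fellouris lower bound L_A(alpha, beta; l, u) for a set A.

    size_case is 'l' (|A| = l), 'mid' (l < |A| < u), or 'u' (|A| = u);
    eta1, eta0 are the worst-case information rates over A and A^c.
    """
    la, lb = abs(math.log(alpha)), abs(math.log(beta))
    if size_case == 'l':
        return max(lb / eta0, la / (eta1 + eta0))
    if size_case == 'u':
        return max(la / eta1, lb / (eta0 + eta1))
    return max(lb / eta0, la / eta1)

# Gaussian rate mu^2/2 for mu = 0.15, symmetric errors alpha = beta = 0.05.
eta = 0.15 ** 2 / 2            # 0.01125 nats per sample
L_mid = lower_bound(0.05, 0.05, eta, eta, 'mid')
```

With such a small per-sample rate, L_mid = |ln 0.05| / 0.01125, on the order of a few hundred samples, which is why the simulations in Section 7 operate at these horizon lengths.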

6. High-Dimensional and Random-Weights Analysis

(a) Fixed weights, large J: The following scaling conditions

J = o(\kappa^{1/4}), \quad \max_j \ln W_j = o(\kappa), \quad -\min_j \ln W_j = o(\kappa), \quad \kappa = |\ln(\alpha \wedge \beta)| \to \infty

ensure first-order optimality. If all W_j are uniformly bounded, only J = o(\kappa^{1/4}) is required.

(b) Random weights: For weights W drawn once from a known law before sampling begins, adaptive thresholding (e.g., c(W) = |\ln \alpha| + \ln \mathcal C_W(l,J)) guarantees conditional FWE control. The unconditional expected sample size remains optimal if

J = o(\kappa^{1/4}), \quad \mathbb{E}[\ln \max_j W_j] = o(\kappa), \quad \mathbb{E}[-\ln \min_j W_j] = o(\kappa)

Admissible laws include bounded, binary, log-normal, and Pareto weights, provided the extreme statistics grow only poly-logarithmically in J.
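A sketch of the adaptive-threshold step for one realized draw of log-normal weights (the weight law, seed, and sizes are hypothetical; C_W is evaluated under the assumption that the maximum over |A| = m is attained by the m smallest weights):

```python
import math
import random

def c_w(weights, m):
    """C_W(m, J), assuming the maximum over |A| = m is attained by
    letting A collect the m smallest weights (maximizes both factors)."""
    w = sorted(weights)
    return sum(w[m:]) * sum(1.0 / x for x in w[:m])

# Weights are drawn once, before any data are sampled; the threshold
# then adapts to the realized draw, giving conditional FWE control.
rng = random.Random(0)
J, l, alpha = 200, 10, 0.05
W = [rng.lognormvariate(0.0, 1.0) for _ in range(J)]
c_adaptive = abs(math.log(alpha)) + math.log(c_w(W, l))
```

Because the log-normal maximum over J draws grows like exp(O(\sqrt{\ln J})), the ln C_W term stays poly-logarithmic in J, which is exactly the admissibility condition above.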

7. Simulation Study and Empirical Behavior

Simulations were conducted under the Gaussian means model X_i^j \sim N(\mu_j, 1) with \mu_j \in \{0, 0.15\}, target error rates \alpha = \beta = 0.05, and signal-count-to-hypothesis ratios typical of high-dimensional inference (J = 200, 300, 400; m/J = 0.1).

Four weight scenarios were tested:

  • Unweighted: W_j \equiv 1
  • Informative: true signals receive W = r \gg 1 with probability favoring correct identification
  • Misinformative: weights anti-correlated with the true signals
  • Noisy: weights independent of the signals, with mean 1 and controlled variance

Empirical expected sample size (ESS) is lowest when weights are informative, outperforming the unweighted baseline. When weights are noisy or misleading, ESS exceeds the baseline, consistent with the theoretical second-order term. This suggests practical robustness of the procedure, but also cautions against poorly chosen weights.
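The mechanism behind the ESS gain can be seen in a single-stream Monte Carlo: by Wald's approximation, a head-start of ln(r) saves roughly ln(r)/I_1 samples before the boundary is reached. The boundary, drift, and weight below are illustrative choices, not the paper's settings:

```python
import math
import random

def crossing_time(mu, b, weight, rng):
    """Samples until the WLLR of one N(mu,1)-vs-N(0,1) stream, observed
    under H1, first reaches b, starting from the head-start ln(weight)."""
    llr, n = math.log(weight), 0
    while llr < b:
        n += 1
        llr += mu * rng.gauss(mu, 1.0) - mu ** 2 / 2
    return n

rng = random.Random(1)
b, mu, r, reps = 5.0, 0.5, 8.0, 200
t_unw = sum(crossing_time(mu, b, 1.0, rng) for _ in range(reps)) / reps
t_inf = sum(crossing_time(mu, b, r, rng) for _ in range(reps)) / reps
# Wald's approximation predicts a saving of about ln(r)/I_1 samples,
# with I_1 = mu^2/2 = 0.125 here, i.e. roughly 17 samples.
```

The same logic explains the misinformative case: a weight r < 1 on a true signal acts as a negative head-start and lengthens the crossing time by about |ln r|/I_1.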

Summary Table: Procedural Features

| Property | Weighted Gap-Intersection | Classic Sequential (Unweighted) |
| --- | --- | --- |
| Prior weights incorporated | Yes | No |
| Signal count bounds allowed | l \le |A| \le u | Often fixed or unconstrained |
| Error control (FWE) | Explicit (via thresholds) | Varies; often only Type I |
| High-dimensional scalability | J = o(\kappa^{1/4}) | Typically limited |
| ESS matches lower bound | Yes (first-order) | Yes, for gap/intersection variants |

A plausible implication is that weighted procedures are especially advantageous when reliable external information about hypothesis relevance is available. In summary, the Weighted Gap-Intersection procedure generalizes multiple sequential testing to exploit prior-weighted evidence, maintains rigorous error guarantees, and achieves information-theoretic optimality in broad operational regimes (Bose et al., 10 Nov 2025).
