False Discovery Rate & Power Analysis
- In large-scale multiple testing, false discovery rate (FDR) control and power analysis form a framework that balances limiting false positives against maximizing the power to detect true signals.
- The FDP-SD procedure uses a sequential stepdown approach and binomial bounds to ensure that the realized false discovery proportion remains below a set threshold with high confidence.
- Empirical studies in applications such as mass spectrometry and high-dimensional regression show that FDP-SD offers improved reliability with only a modest power loss compared to traditional FDR methods.
False Discovery Rate (FDR) quantifies the expected proportion of false positives among rejected hypotheses and is a central concept in large-scale multiple testing. Power analysis, in this context, evaluates the probability of correctly identifying true signals under multiplicity correction. The interplay between FDR control and power has shaped the design, interpretation, and methodological evolution of multiple testing in fields ranging from genomics to econometrics. Rigorous FDR control is essential, but it does not guarantee that the realized false discovery proportion (FDP) in an experiment will not substantially exceed the nominal level. Recent developments address this discrepancy by introducing procedures, such as FDP stepdown (FDP-SD), that bound the FDP with high confidence while preserving or improving statistical power in “competition-based” testing frameworks.
1. Competition-Based Methods and Limitations of Standard FDR Control
In competition-based multiple testing, such as target-decoy competition (TDC) in mass spectrometry and knockoff filters in high-dimensional regression, discoveries are made by direct comparison of each hypothesis's "target" statistic (from observed data) to a "decoy" or knockoff counterpart (from a null or synthetic construction). For feature i, let Z_i be the target score and Y_i the decoy or knockoff score; the label L_i encodes which wins: L_i = 1 if Z_i > Y_i (a target win), L_i = -1 if Z_i < Y_i (a decoy win). Procedurally, target wins are interpreted as putative discoveries.
The critical property leveraged is that under the true null, target and decoy scores are exchangeable, so each has a 0.5 probability of “winning.” Established methods such as knockoff+ and standard TDC utilize this property to estimate and control the expected false discovery rate—by, for instance, using the empirical ratio of decoy wins to target wins among selected features.
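This decoy-based estimate can be sketched in a few lines. The function name and the ±1 label encoding are illustrative choices; the (decoy wins + 1)/(target wins) ratio follows the knockoff+/TDC convention described above:

```python
# Sketch of a TDC/knockoff+-style FDR estimate (names and data hypothetical).
# Each hypothesis is labeled +1 (target win) or -1 (decoy win), sorted by
# decreasing winning score; under the null, each label is a fair coin flip.

def tdc_fdr_estimates(labels):
    """Estimated FDR at each cutoff k: (decoy wins + 1) / max(target wins, 1)."""
    decoys = targets = 0
    estimates = []
    for lab in labels:
        if lab == 1:
            targets += 1
        else:
            decoys += 1
        estimates.append((decoys + 1) / max(targets, 1))
    return estimates

labels = [1, 1, 1, -1, 1, 1, -1, 1]   # hypothetical sorted win/loss labels
print(tdc_fdr_estimates(labels))
```

Selecting the largest cutoff whose estimate falls below the nominal level recovers the usual expectation-based control.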
However, standard FDR control (i.e., ensuring E[FDP] ≤ α) does not preclude a large observed FDP in any single realization, particularly in finite-sample or sparse-signal regimes. Even when the FDR is strictly controlled at level α, the tail probability P(FDP > α) can be significant, rendering interpretations precarious in specific datasets.
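A toy numeric example (numbers hypothetical) makes the gap concrete: a selection rule can control the FDR at 5% in expectation while still producing a badly contaminated discovery list one run in ten.

```python
# Hypothetical FDP outcomes across 10 repetitions of the same experiment:
# nine clean runs, one run where half the discoveries are false.
fdp_values = [0.0] * 9 + [0.5]

fdr = sum(fdp_values) / len(fdp_values)            # FDR = expected FDP
tail = sum(f > 0.05 for f in fdp_values) / len(fdp_values)

print(fdr)   # 0.05 -> FDR is controlled at the 5% level
print(tail)  # 0.1  -> yet FDP exceeds 5% in 10% of the runs
```

This is exactly the discrepancy that high-probability FDP bounds are designed to close.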
2. The FDP Stepdown (FDP-SD) Procedure
FDP-SD is a sequential, stepdown procedure designed for competition-based multiple testing that allows the practitioner to bound the realized FDP below a level α with high confidence, i.e., to attain P(FDP ≤ α) ≥ 1 − γ for user-selected α and γ.
For m hypotheses, the procedure operates as follows:
- Sort the hypotheses in decreasing order of the winning score W_i = max(Z_i, Y_i), the larger of feature i's target and decoy scores, recording at each rank whether the target or the decoy won.
- For each cutoff position k, let D_k be the cumulative number of decoy wins among the top k, and T_k = k − D_k the cumulative number of target wins.
- Precompute thresholds d_k as the maximal integer d such that F(d; d + ⌊α(k − d)⌋ + 1, 1/2) ≤ γ, where F(·; n, p) is the Binomial(n, p) cumulative distribution function and p = 1/2 reflects the null coin flip. Intuitively, d_k is the largest decoy count among the top k that remains consistent, at confidence level 1 − γ, with at most an α fraction of false target wins.
- Define the rejection threshold K as the largest index k for which D_k ≤ d_k (with K = 0 if no index satisfies its threshold).
The output is the list of target wins among the top K hypotheses.
Theoretical guarantees ensure that, with probability at least 1 − γ, the empirical FDP among the selected hypotheses does not exceed α, i.e., P(FDP ≤ α) ≥ 1 − γ.
This high-probability control is achieved via a sharp binomial modeling of the null competition and a sequential stepdown to the maximal set of discoveries consistent with the probabilistic bound.
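The steps above can be sketched end to end. Everything below is a reconstruction for illustration: the threshold formula is one self-consistent instantiation of a binomial tail bound at level α with confidence 1 − γ, not necessarily the paper's exact constants, and `fdp_sd_sketch` and its ±1 label encoding are hypothetical names.

```python
import math

def binom_cdf(d, n):
    """Exact Binomial(n, 1/2) CDF: P(X <= d)."""
    if d < 0:
        return 0.0
    return sum(math.comb(n, j) for j in range(min(d, n) + 1)) / 2 ** n

def fdp_sd_sketch(labels, alpha, gamma):
    """Stepdown sketch (reconstruction): labels are +1 (target win) or -1
    (decoy win), sorted by decreasing winning score.  Returns the number of
    reported target wins."""
    m = len(labels)
    # Precompute d_k: the most decoy wins tolerable among the top k while the
    # binomial tail bound on false target wins stays below gamma.  The bound
    # is monotone in d, so we can stop at the first failure.
    d_max = []
    for k in range(1, m + 1):
        d = -1
        while d + 1 <= k:
            b = math.floor(alpha * (k - (d + 1))) + 1  # disallowed false targets
            if binom_cdf(d + 1, d + 1 + b) <= gamma:
                d += 1
            else:
                break
        d_max.append(d)
    # The rejection threshold K is the largest k whose decoy count D_k
    # respects its precomputed threshold d_k.
    decoys = targets = K_targets = 0
    for k, lab in enumerate(labels, start=1):
        decoys += (lab == -1)
        targets += (lab == 1)
        if decoys <= d_max[k - 1]:
            K_targets = targets
    return K_targets
```

Note that at small k the thresholds are vacuous (d_k = −1): with α = 0.1 and γ = 0.05 no discoveries are possible until a run of roughly forty target wins has accumulated, since a shorter clean run could plausibly arise from nulls alone. All binomial CDF values depend only on (m, α, γ) and can be cached across datasets.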
3. Comparisons with Simultaneous Confidence Band Framework
An alternative approach to FDP control in the literature is the general simultaneous confidence band framework of Katsevich and Ramdas, which can be specialized to the competition setting. This procedure computes, for each possible cutoff k, a band guaranteeing simultaneously over all k that the FDP among the top k discoveries does not exceed the band's bound with high probability. In the competition context, this specializes to an FDP-KRB rule.
The core distinction is that FDP-SD uses precomputed, fixed per-cutoff bounds and a stepdown sequence, while the Katsevich–Ramdas band adaptively determines FDP levels based on prediction intervals for the number of false discoveries occurring before a given number of decoy wins. In simulation and real-data experiments, FDP-SD consistently achieves higher statistical power, that is, includes more true discoveries, than FDP-KRB at the same confidence level 1 − γ. For example, in simulated spectrum identification studies, FDP-SD yields approximately 7% more correct discoveries across a wide regime of parameter settings.
4. Simulation Studies and Empirical Data Performance
Empirical evidence from simulation studies and real-world applications demonstrates the effectiveness of FDP-SD:
- In mass spectrometry spectrum identification, FDP-SD yields higher power than confidence-band approaches at the same FDP confidence level, and only a minor power loss compared with standard, expectation-based FDR-controlling TDC.
- In high-dimensional regression variable selection with model-X knockoff statistics, FDP-SD consistently selects more true features than band-based alternatives under the same probability of exceeding a prescribed FDP level.
- In real-data GWAS and peptide identification benchmarks, repeated runs indicate that FDP-SD not only delivers more discoveries for controlled error but also maintains the probability of excess FDP below the user threshold (e.g., exceeding 5% only about 4% of the time at a target of 5%).
- The power loss relative to expectation-based FDR procedures is generally small (median loss ~6-7%), while the interpretability and reliability of the selected list are substantially improved.
5. Conceptual Foundations: Knockoffs, TDC, and Assumptions
Both knockoff and TDC methodologies are built upon an exchangeability principle: for null hypotheses, the competition between original and synthetic quantities (statistical scores) is a fair coin flip, ensuring that, under the null, target and decoy victories are equally likely and independent of non-null structures.
FDP-SD exploits this symmetry. Under the assumption that, conditional on the scores and the non-null labels, the null features' competition outcomes are i.i.d. Bernoulli(1/2), binomial tail upper bounds guarantee high-probability control. The observed decoy-win count in any set of discoveries acts as a proxy for the unobservable null target wins; overrunning the precomputed binomial threshold signals that an excess FDP may have occurred.
This conceptual symmetry is a central requirement for the validity of competition-based FDP control, and the theoretical properties hinge on this stochastic property of the test statistics.
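The coin-flip symmetry is easy to verify empirically. The following Monte Carlo check (synthetic Gaussian scores, an assumption made purely for illustration) confirms that a null target wins its competition about half the time:

```python
import random

# Under the null, target and decoy scores are drawn from the same
# distribution, so by exchangeability the target wins with probability 1/2.
random.seed(0)
trials = 100_000
target_wins = 0
for _ in range(trials):
    z_target = random.gauss(0.0, 1.0)  # null target score
    z_decoy = random.gauss(0.0, 1.0)   # decoy score from the same null law
    target_wins += z_target > z_decoy

print(target_wins / trials)  # close to 0.5, as exchangeability predicts
```

Any systematic departure from 1/2 here (e.g., decoys drawn from a different distribution than nulls) would invalidate the binomial bounds underlying the procedure.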
6. Practical Implications for FDP and Power Control
FDP-SD delivers a control paradigm distinct from expectation-based FDR methods: the guarantee applies to the realized FDP in a single experiment rather than its mean across repetitions. This high-probability control is highly desirable in applications where the cost of a high FDP in a single analysis (e.g., peptide identifications in mass spectrometry, feature selection in genomics) can be substantial.
The procedure’s stepdown construction is computationally tractable: all binomial CDF bounds can be precomputed, and the cumulative decoy win sequence is readily updated across ranks.
Empirical results indicate that the slight power penalty relative to FDR-based TDC or knockoff+ procedures is more than compensated by the reliability and interpretability of the resulting discovery set, particularly in high-stakes or non-repeatable experiment scenarios.
7. Summary and Broader Context
FDP-SD establishes a rigorous method for controlling the realized FDP in competition-based multiple testing, ensuring P(FDP ≤ α) ≥ 1 − γ under explicit assumptions. Its design draws from and extends core ideas of knockoff and target-decoy competition: formulating the null outcome as an exchangeable coin flip and using binomial probabilistic bounds for high-probability control. Compared with simultaneous confidence-band methods (Luo et al., 2020), FDP-SD typically achieves higher statistical power in realistic data settings. Its adoption enables practitioners to make stronger, more reliable claims about the false discovery content of their selected sets, especially in regimes where control of the realized error is more informative than expected error rates.
These developments reflect a broader evolution in statistical methodology, shifting from controlling expected error rates to ensuring reliable results in specific (finite) samples, which is of particular importance in high-dimensional, high-throughput scientific application domains.