Expected Discovery Rate in Testing
- EDR is defined as the expected proportion of true discoveries among nonnull hypotheses, offering a clear measure of statistical power in multiple testing.
- Plug-in and weighted procedures adjust rejection thresholds—using kernel estimates and informative weights—to enhance true discovery rates while maintaining FDR control.
- Online algorithms (e.g., LORD, LOND) dynamically adapt significance levels in sequential testing, while frameworks such as the e-Closure principle provide simultaneous, post hoc valid error control; both aim to balance error control with maximized EDR.
Expected Discovery Rate (EDR) is a central concept in the theory and practice of multiple hypothesis testing, representing the expected proportion or number of true discoveries (correct rejections) made by a testing procedure. It provides a quantitative assessment of the statistical power achieved under error control constraints such as the False Discovery Rate (FDR). The technical literature addresses EDR both directly—optimizing power under FDR control—and indirectly—by analyzing the performance (yield of true discoveries) across a variety of adaptive, weighted, and online algorithms.
1. Formal Definition and Relationship to FDR
In the context of multiple hypothesis testing, suppose that among $m$ tested hypotheses, $V$ is the number of false rejections (type I errors) and $R$ is the total number of rejections. The False Discovery Rate is defined as

$$\mathrm{FDR} = \mathbb{E}\left[\frac{V}{R \vee 1}\right],$$

where the maximum with 1 in the denominator ensures well-definedness in the event of zero rejections.
The Expected Discovery Rate (EDR, an editor's term) is designed to quantify statistical power in this context. In its common formulation,

$$\mathrm{EDR} = \mathbb{E}\left[\frac{S}{m_1 \vee 1}\right],$$

where $S$ is the number of true rejections (type II errors avoided) and $m_1$ is the number of true alternatives (nonnulls). Maximizing EDR under controlled FDR is a principal objective in the design of multiple testing procedures (Neuvial, 2010, Basu et al., 2015, Nie et al., 2023).
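For concreteness, the following minimal Monte Carlo sketch estimates both quantities empirically under an assumed one-sided Gaussian mixture model, using the standard BH procedure as the testing rule; the model, effect size, and function names are illustrative choices rather than anything taken from the cited papers.

```python
import numpy as np
from scipy.stats import norm

def bh_rejections(pvals, alpha):
    """Benjamini-Hochberg step-up: return a boolean mask of rejected hypotheses."""
    m = len(pvals)
    order = np.argsort(pvals)
    passed = pvals[order] <= alpha * np.arange(1, m + 1) / m
    k = passed.nonzero()[0].max() + 1 if passed.any() else 0
    rejected = np.zeros(m, dtype=bool)
    rejected[order[:k]] = True
    return rejected

def simulate_fdr_edr(m=1000, pi1=0.1, mu=3.0, alpha=0.1, reps=200, seed=0):
    """Monte Carlo estimates of FDR = E[V / (R v 1)] and EDR = E[S / (m1 v 1)]."""
    rng = np.random.default_rng(seed)
    fdp, tdp = [], []
    for _ in range(reps):
        nonnull = rng.random(m) < pi1                  # theta_i = 1 marks a true alternative
        z = rng.normal(loc=mu * nonnull, scale=1.0)    # one-sided Gaussian test statistics
        pvals = norm.sf(z)                             # p_i = P(Z > z_i) under the null
        rej = bh_rejections(pvals, alpha)
        V = np.sum(rej & ~nonnull)                     # false rejections
        S = np.sum(rej & nonnull)                      # true rejections
        fdp.append(V / max(rej.sum(), 1))
        tdp.append(S / max(nonnull.sum(), 1))
    return np.mean(fdp), np.mean(tdp)                  # (estimated FDR, estimated EDR)

print(simulate_fdr_edr())
```

Averaging the false discovery proportion and the true discovery proportion over repetitions gives direct empirical analogues of the FDR and EDR defined above.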
2. Plug-In Procedures and Asymptotic EDR Enhancement
Plug-in approaches estimate the proportion of true null hypotheses and accordingly inflate the rejection threshold, yielding procedures with tighter FDR control and asymptotically higher EDR:
- For independent hypotheses, the Benjamini-Hochberg (BH) procedure applied at target level $\alpha$ actually controls the FDR at $\pi_0\alpha$, with $\pi_0$ the true null proportion. The oracle procedure applies BH at level $\alpha/\pi_0$, controlling FDR at exactly $\alpha$. Since $\pi_0$ is unknown, plug-in methods estimate it, often using kernel-based estimators of the $p$-value density near 1 (Neuvial, 2010); a minimal sketch of the plug-in step follows this list.
- The plug-in threshold yields more powerful asymptotics (higher EDR) because a larger fraction of nonnulls is rejected, especially as the number of hypotheses $m$ grows, and this holds over a wider range of FDR levels.
- The trade-off is a slower convergence rate for the realized FDP: plug-in procedures converge at nonparametric rates that depend on the regularity of the $p$-value density at 1, in contrast to the classical BH procedure, which enjoys the parametric $1/\sqrt{m}$ rate.
- In models with well-behaved alternative $p$-value distributions (e.g., two-sided Gaussian, Laplace, Student's $t$), the plug-in procedure's EDR approaches that of the oracle BH at level $\alpha/\pi_0$ as $m$ increases, often far outperforming the standard BH method, but at the price of greater variability for small $m$.
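As a minimal sketch of the plug-in idea, the snippet below uses a simple Storey-type histogram estimate of $\pi_0$ from the $p$-value mass near 1 as a stand-in for the kernel estimator discussed above; the tuning parameter `lam` and all function names are assumptions made for illustration.

```python
import numpy as np

def estimate_pi0(pvals, lam=0.5):
    """Estimate the null proportion pi0 from the p-value mass above lam.
    (A simple Storey-type stand-in for the kernel estimator discussed above.)"""
    pvals = np.asarray(pvals, dtype=float)
    return min(1.0, np.mean(pvals > lam) / (1.0 - lam))

def plugin_bh(pvals, alpha=0.1, lam=0.5):
    """BH step-up applied at the inflated level alpha / pi0_hat."""
    pvals = np.asarray(pvals, dtype=float)
    pi0_hat = estimate_pi0(pvals, lam)
    level = alpha / pi0_hat                     # inflated target level
    m = len(pvals)
    order = np.argsort(pvals)
    passed = pvals[order] <= level * np.arange(1, m + 1) / m
    k = passed.nonzero()[0].max() + 1 if passed.any() else 0
    rejected = np.zeros(m, dtype=bool)
    rejected[order[:k]] = True
    return rejected, pi0_hat
```

When the estimate of $\pi_0$ is well below 1, the inflated level $\alpha/\hat{\pi}_0$ rejects a larger fraction of nonnulls than plain BH, which is the source of the asymptotic EDR gain; the extra variability of $\hat{\pi}_0$ for small $m$ is the price noted above.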
3. Weighted Multiple Testing and Decision-Theoretic Optimization
Weighted FDR (wFDR) procedures extend classical FDR control by imposing distinct importance weights ($b_i$ for true positives, $a_i$ for false discoveries) reflecting external biological or scientific information (Basu et al., 2015). The wFDR criterion is

$$\mathrm{wFDR} = \frac{\mathbb{E}\left[\sum_{i=1}^{m} a_i (1-\theta_i)\,\delta_i\right]}{\mathbb{E}\left[\sum_{i=1}^{m} a_i\,\delta_i\right]},$$

with $\delta_i \in \{0,1\}$ indicating rejection of hypothesis $i$, and $\theta_i \in \{0,1\}$ its status (null vs. nonnull).
The corresponding power metric is the Expected True Positives (ETP):

$$\mathrm{ETP} = \mathbb{E}\left[\sum_{i=1}^{m} b_i\,\theta_i\,\delta_i\right].$$
Oracle and data-driven procedures are constructed to maximize ETP subject to $\mathrm{wFDR} \le \alpha$. In practice, this framework allows prioritization of hypotheses (e.g., up-weighting SNPs with relevant prior evidence in GWAS) to achieve increased EDR in informative groups while maintaining stringent error control.
- The oracle procedure sorts hypotheses by a value-to-capacity ratio (VCR) and fills the rejection set according to the available wFDR "budget," maximizing true discoveries (see the sketch after this list).
- The data-driven analog uses estimated local false discovery rates (Lfdr) and a ranking statistic to find a threshold maximizing ETP while approximately controlling wFDR.
- Asymptotic analysis shows that the data-driven procedure achieves nearly the same ETP and wFDR as the oracle as $m \to \infty$.
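A schematic sketch of a VCR-style greedy rule is given below; the knapsack-type framing (gains $b_i(1-\mathrm{Lfdr}_i)$, costs $a_i(\mathrm{Lfdr}_i-\alpha)$) follows the description above, but the exact ranking statistic and calibration in Basu et al. (2015) differ in details, so this is an illustration rather than a reproduction of their procedure.

```python
import numpy as np

def vcr_greedy_rejections(lfdr, a, b, alpha):
    """Schematic value-to-capacity greedy selection for weighted FDR control.

    lfdr : (oracle or estimated) local false discovery rates P(theta_i = 0 | data)
    a, b : weights on false discoveries and true positives, respectively
    Each rejection contributes expected gain b_i * (1 - lfdr_i) to ETP and
    expected cost a_i * (lfdr_i - alpha) against the wFDR "budget".
    """
    lfdr, a, b = map(np.asarray, (lfdr, a, b))
    gain = b * (1.0 - lfdr)
    cost = a * (lfdr - alpha)

    reject = cost <= 0                         # "free" hypotheses create budget slack
    budget = -cost[reject].sum()

    costly = np.where(cost > 0)[0]
    order = costly[np.argsort(-gain[costly] / cost[costly])]   # rank by VCR
    for i in order:                            # spend the budget greedily
        if cost[i] <= budget:
            reject[i] = True
            budget -= cost[i]
        else:
            break
    return reject
```

The greedy step mirrors the intuition in the text: hypotheses are admitted in order of discovery value per unit of wFDR capacity consumed, until the budget is exhausted.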
4. Online Testing and Discovery Rate Dynamics
Online FDR-controlling algorithms such as LOND (Levels based On Number of Discoveries), LORD (Levels based On Recent Discovery), and generalized alpha-investing rules adaptively allocate significance thresholds as hypotheses are tested sequentially (Javanmard et al., 2015, Javanmard et al., 2016, Robertson et al., 2018).
- LOND and LORD increase EDR by boosting thresholds after successful discoveries, with LORD restarting at higher levels following each rejection. Theoretical guarantees establish nearly linear growth of the number of discoveries (and thus of EDR) as the length of the testing sequence grows, under mild assumptions on the $p$-value stream (a minimal LOND sketch follows this list).
- These methods perform comparably to offline BH in terms of EDR but are applicable when hypotheses arrive over time and full batch information is unavailable.
- Adjustments for dependent $p$-values (e.g., harmonic-sum scaling in LOND) preserve FDR control but may dampen power, reducing EDR compared to the idealized independent case.
- Empirical studies (in microarray, GWAS, and clinical trial contexts) confirm that the online procedures maintain FDR control and deliver high EDR, especially when a bounded upper limit on the number of hypotheses is used (i.e., for finite streams).
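A minimal sketch of the LOND update, the simplest of these rules, is shown below; the particular $\beta$ sequence and the form of the dependence correction are illustrative assumptions rather than the exact choices of Javanmard et al.

```python
import numpy as np

def lond(pvals, alpha=0.1, dependent=False):
    """LOND: test p-values online at levels that grow with the discovery count.

    beta is a fixed nonnegative sequence with sum <= alpha; the 1/(t(t+1))
    shape used here is one common illustrative choice.
    """
    pvals = np.asarray(pvals, dtype=float)
    n = len(pvals)
    t = np.arange(1, n + 1)
    beta = alpha / (t * (t + 1.0))             # sums to alpha * n/(n+1) <= alpha
    if dependent:
        beta = beta / np.cumsum(1.0 / t)       # harmonic-sum scaling for dependence (schematic)

    rejections = np.zeros(n, dtype=bool)
    discoveries = 0
    for i, p in enumerate(pvals):
        level = beta[i] * (discoveries + 1)    # boost the level after each discovery
        if p <= level:
            rejections[i] = True
            discoveries += 1
    return rejections
```

The boost factor (`discoveries + 1`) is what drives the growth of the discovery count in favorable regimes; the harmonic scaling applied under dependence correspondingly dampens power, as noted above.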
5. General Principles and the e-Closure Framework
The e-Closure Principle offers a unifying framework for multiple testing by “closing” over all subsets of hypotheses and simultaneously enforcing expected loss conditions, including both FDR and EDR control (Xu et al., 2 Sep 2025).
- Given an e-collection $\{e_S\}$ indexed by subsets $S$ of hypotheses and a loss function $L$, the closed procedure selects a rejection set only if the associated e-value condition holds simultaneously for every subset $S$.
- For FDR control, the loss recovers the classical false discovery proportion, while for EDR control, choosing a loss that attaches weight to the number of true discoveries generalizes the guarantee to the "expected discovery rate" or other metrics of interest.
- The framework allows post hoc flexibility: after the data are seen, one can select the error metric or nominal level to be controlled, as all candidate rejection sets returned by the closed procedure meet the requisite expected loss bounds.
- The closure principle enables uniform improvements to standard methods (e.g., e-Benjamini–Hochberg, e-Benjamini–Yekutieli) and supports simultaneous error control across many candidate sets, directly translating to more robust and potentially higher EDR via wider valid choices for rejection sets, as illustrated by the base e-BH sketch after this list.
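For reference, here is a minimal sketch of the base (unimproved) e-Benjamini–Hochberg rule, which the closure framework refines; the function name is an assumption, and the closure-based improvements of Xu et al. are not reproduced here.

```python
import numpy as np

def e_bh(e_values, alpha=0.1):
    """Base e-BH: reject the k* hypotheses with the largest e-values, where
    k* = max{k : k * e_[k] >= m / alpha} and e_[k] is the k-th largest e-value."""
    e = np.asarray(e_values, dtype=float)
    m = len(e)
    order = np.argsort(-e)                  # indices sorted by decreasing e-value
    k = np.arange(1, m + 1)
    ok = k * e[order] >= m / alpha
    k_star = k[ok].max() if ok.any() else 0
    rejected = np.zeros(m, dtype=bool)
    rejected[order[:k_star]] = True
    return rejected
```

Because the rule is stated directly in terms of e-values, the same rejection set can be certified against different nominal levels after the data are seen, which is the post hoc flexibility emphasized above.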
6. Asymptotic Limits, Trade-Offs, and Optimality
Recent advances characterize the fundamental limits of simultaneous FDR and EDR (or FNR) control in large-scale settings (Nie et al., 2023):
- The optimal trade-off between FDR and EDR cannot generally be achieved by separable decision rules; instead, compound or coordinated rules—sometimes yielding two-point representations via Carathéodory’s theorem—are required even in simple models such as the Gaussian mixture.
- For any feasible randomized strategy under an FDR constraint $\alpha$, the minimal expected "loss" (or maximal EDR) is achieved by a two-point random variable. This sufficiency result signifies that multiple testing procedures can be reduced, without loss, to decision rules that threshold at only two points (see the schematic sketch after this list).
- In settings where the false discovery proportion (FDP) must be controlled with high probability (rather than expectation), the optimal trade-off for EDR coincides with that for marginal FDR (mFDR).
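As a schematic illustration of the two-point idea in its simplest form, the sketch below randomizes between two lfdr-threshold rules whose estimated FDRs bracket the target $\alpha$; this is only an analogy to, not a reproduction of, the compound-optimal construction in Nie et al. (2023).

```python
import numpy as np

def two_point_rule(lfdr, alpha=0.1, rng=None):
    """Randomize between two deterministic lfdr-threshold rules so that the
    mixture exhausts the nominal level (schematic two-point strategy)."""
    rng = rng or np.random.default_rng()
    lfdr = np.asarray(lfdr, dtype=float)
    m = len(lfdr)
    order = np.argsort(lfdr)
    running_fdr = np.cumsum(lfdr[order]) / np.arange(1, m + 1)   # est. FDR if top-k are rejected

    k_lo = int(np.sum(running_fdr <= alpha))   # largest k whose estimated FDR is <= alpha
    if k_lo == m:
        k_star = m                             # the level is never exceeded: reject everything
    elif k_lo == 0:
        k_star = 0                             # even one rejection overshoots: reject nothing
    else:
        lo, hi = running_fdr[k_lo - 1], running_fdr[k_lo]
        p = (alpha - lo) / (hi - lo)           # mixing weight so the blend hits alpha exactly
        k_star = k_lo + 1 if rng.random() < p else k_lo
    rejected = np.zeros(m, dtype=bool)
    rejected[order[:k_star]] = True
    return rejected
```

The point mirrored here is that the attainable FDR-EDR frontier is traced out by mixtures of at most two deterministic rules rather than by a single separable threshold.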
7. Applications, Implications, and Limitations
The interplay between FDR and EDR is foundational for reproducibility, interpretability, and scientific yield, particularly in genomics, medical studies, and high-dimensional data analysis.
- By design, FDR control keeps the average rate of false discoveries low while, unlike stricter family-wise error criteria, allowing far more rejections, which indirectly raises EDR by permitting more true discoveries at less conservative thresholds.
- Adaptive, weighted, and online procedures enhance the opportunity for true discoveries (EDR) but introduce trade-offs in terms of statistical variability, convergence rate, and computational overhead.
- Plug-in, closure-based, and compound oracle procedures, supported by rigorous asymptotic theory, guarantee that as the scale of testing grows, statistical procedures can approach optimal EDR under stringent error control.
Despite advances, finite-sample fluctuations, dependence among $p$-values, and issues around the estimation or choice of error rates remain practical limitations affecting EDR. Methodological choices (kernel estimation bandwidth, weighting schemes, significance level allocation) must balance power against the risk of excess false discoveries.
Summary Table: EDR Optimization Across Approaches
| Procedure Type | EDR Optimization Mechanism | Trade-off / Limitation |
|---|---|---|
| Plug-in Kernel Estimator | Oracle-level FDR via inflated threshold | Nonparametric convergence rate |
| Weighted FDR | ETP maximization via informative weights | Weight specification, estimation |
| Online Testing (LORD/LOND) | Adaptive thresholds through discovery feedback | Dependency adjustment, finite-sample power |
| e-Closure Principle | Uniform control, post hoc flexibility | Computational complexity |
| Compound Oracle | Two-point strategy, optimal trade-off | Model assumptions, implementability |
These results collectively establish the theoretical and practical landscape for optimizing the Expected Discovery Rate in large-scale multiple testing, balancing it against strict control of the False Discovery Rate for reliable scientific discovery.