Benjamini-Hochberg FDR Procedure

Updated 15 January 2026

Benjamini-Hochberg FDR is a statistical method that controls the expected proportion of false discoveries in multiple hypothesis testing by using a step-up procedure on sorted p-values.
It is widely applied in genomics, neuroimaging, and high-dimensional inference, and its adaptive, weighted, and discrete extensions enhance its power and applicability.
The procedure offers theoretical guarantees under independence and positive dependence, while modifications like the BY correction adjust for arbitrary dependency structures.

The Benjamini-Hochberg (BH) false discovery rate (FDR) procedure is a foundational approach for controlling the expected proportion of false rejections (Type I errors) among the hypotheses declared significant in large-scale multiple testing problems. Designed to be less stringent and thus more powerful than family-wise error rate (FWER) corrections, the BH procedure and its extensions have become standard in fields that routinely test hundreds to millions of hypotheses, such as genomics, neuroimaging, and high-dimensional inference.

1. Formal Definition of False Discovery Rate and the BH Step-Up Procedure

Given $m$ hypotheses $H_1, \dots, H_m$ with corresponding $p$-values $p_1, \dots, p_m$, the focus is on controlling the FDR, $\mathrm{FDR} = \mathbb{E}\left[\frac{V}{R \vee 1}\right]$, where $R$ is the total number of rejections, $V$ is the number of false rejections (type I errors among the discoveries), and $R \vee 1 = \max(R, 1)$ makes the ratio zero when nothing is rejected. Let $m_0$ denote the number of true null hypotheses (unknown in practice).

The original (step-up) BH procedure at nominal FDR level $\alpha \in (0,1)$ operates as follows (Acharya, 2014, Benditkis et al., 2015, Wang, 2022):

Sort the observed $p$-values in increasing order: $p_{(1)} \leq \cdots \leq p_{(m)}$.
Define critical values $t_i = \frac{i}{m} \alpha$ for $i=1, \dots, m$.
Let $k = \max\{i : p_{(i)} \leq t_i\}$. If no such $i$ exists, set $k=0$.
If $k \geq 1$, reject all hypotheses with $p_i \leq p_{(k)}$; if $k=0$, reject none.

Under independence or positive dependence of the $p$-values (Section 2), this procedure controls the expected proportion of false discoveries among the rejected hypotheses at level $\alpha$.
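The four steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `benjamini_hochberg` and the example $p$-values are assumptions made for the sketch, not part of any cited implementation.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Hypothetical helper name; implements the step-up rule described above.
    """
    m = len(pvalues)
    if m == 0:
        return []
    # Step 1: indices that sort the p-values in increasing order.
    order = sorted(range(m), key=lambda j: pvalues[j])
    # Steps 2-3: largest rank i (1-based) with p_(i) <= (i / m) * alpha.
    k = 0
    for rank, j in enumerate(order, start=1):
        if pvalues[j] <= rank / m * alpha:
            k = rank
    # Step 4: reject the k smallest p-values (none when k == 0).
    reject = [False] * m
    for j in order[:k]:
        reject[j] = True
    return reject


# Made-up p-values, chosen to show the step-up behaviour.
pvals = [0.035, 0.5, 0.01, 0.03]
print(benjamini_hochberg(pvals, alpha=0.05))  # [True, False, True, True]
```

In this example the second-smallest $p$-value (0.03) exceeds its own critical value $t_2 = 0.025$, yet it is rejected because the third-smallest (0.035) falls below $t_3 = 0.0375$. This is what distinguishes a step-up rule from simply comparing each $p_{(i)}$ with $t_i$.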

2. Theoretical Guarantees for FDR Control

Under the assumption that the null $p$-values are independent and Uniform(0,1), and independent of the alternative $p$-values, the BH procedure satisfies (Acharya, 2014, Benditkis et al., 2015, Wang, 2022): $\mathrm{FDR} \leq \frac{m_0}{m} \alpha \leq \alpha$. A key proof approach uses optional stopping and backward martingale properties of the true null count process. For each possible total number of rejections $R$, the probability that a true null $p$-value passes its threshold is at most $\alpha R/m$. Summing over all $m_0$ nulls by linearity of expectation, and integrating over the stopping rule, yields the FDR bound.
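A small Monte Carlo experiment can make the bound $\frac{m_0}{m}\alpha$ concrete. The sketch below is illustrative only: the settings ($m = 50$, $m_0 = 40$, $\alpha = 0.1$), the seed, and the way non-null $p$-values are generated (a uniform draw raised to the 8th power) are arbitrary assumptions, and the helper names are hypothetical.

```python
import random


def bh_rejections(pvalues, alpha):
    """Indices rejected by the BH step-up procedure."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda j: pvalues[j])
    k = 0
    for rank, j in enumerate(order, start=1):
        if pvalues[j] <= rank / m * alpha:
            k = rank
    return order[:k]


def estimate_fdr(m=50, m0=40, alpha=0.1, reps=2000, seed=0):
    """Average of V / max(R, 1) over simulated independent experiments."""
    rng = random.Random(seed)
    total_fdp = 0.0
    for _ in range(reps):
        # The first m0 hypotheses are true nulls with Uniform(0,1) p-values.
        pvalues = [rng.random() for _ in range(m0)]
        # The rest are non-nulls; u ** 8 is an arbitrary way to push them
        # towards zero (an assumption of this sketch).
        pvalues += [rng.random() ** 8 for _ in range(m - m0)]
        rejected = bh_rejections(pvalues, alpha)
        v = sum(1 for j in rejected if j < m0)  # false rejections V
        total_fdp += v / max(len(rejected), 1)  # V / (R v 1)
    return total_fdp / reps


fdr_hat = estimate_fdr()
print(f"estimated FDR = {fdr_hat:.3f}; bound (m0/m)*alpha = {40 / 50 * 0.1:.3f}")
```

With independent uniform nulls, the estimate should land close to $\frac{m_0}{m}\alpha = 0.08$ up to simulation noise, and below the nominal $\alpha = 0.1$.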

This guarantee extends to certain dependency structures among $p$-values. In particular, positive regression dependence on a subset (PRDS) is sufficient for the same FDR bound (Wang, 2022): $\mathrm{FDR} \leq \frac{m_0}{m} \alpha$. However, under arbitrary dependence, a further correction is needed. The Benjamini-Yekutieli (BY) procedure replaces $\alpha$ with $\alpha/\sum_{i=1}^m 1/i$ in the BH critical values and ensures FDR control at the nominal level for all dependency structures.
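The BY adjustment amounts to running the same step-up rule at the reduced level $\alpha / \sum_{i=1}^m 1/i$. A minimal sketch, with a hypothetical function name and made-up $p$-values:

```python
def benjamini_yekutieli(pvalues, alpha=0.05):
    """BH step-up rule run at level alpha / (1 + 1/2 + ... + 1/m)."""
    m = len(pvalues)
    if m == 0:
        return []
    harmonic = sum(1.0 / i for i in range(1, m + 1))
    alpha_by = alpha / harmonic  # reduced level used in the critical values
    order = sorted(range(m), key=lambda j: pvalues[j])
    k = 0
    for rank, j in enumerate(order, start=1):
        if pvalues[j] <= rank / m * alpha_by:
            k = rank
    reject = [False] * m
    for j in order[:k]:
        reject[j] = True
    return reject


# For m = 4 the harmonic sum is about 2.083, so alpha = 0.05 becomes 0.024.
print(benjamini_yekutieli([0.035, 0.5, 0.01, 0.03], alpha=0.05))
# [False, False, False, False]
print(benjamini_yekutieli([0.001, 0.5, 0.01, 0.03], alpha=0.05))
# [True, False, True, False]
```

The first call rejects nothing, whereas plain BH at $\alpha = 0.05$ would reject three of the same four hypotheses. That loss of power is the price of validity under arbitrary dependence.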

3. Enhancements: Adaptive, Weighted, Structural, and Discrete BH Extensions

Adaptive Procedures

If the true null proportion $\pi_0 = m_0/m$ is less than 1, the BH procedure is conservative, since its FDR is at most $\pi_0 \alpha < \alpha$. Storey's adaptive BH (Gao, 2023) estimates $\pi_0$ from the data using, for instance,

$\hat{\pi}_0(\lambda) = \frac{1 + \#\{p_j \geq \lambda\}}{m (1 - \lambda)}$

Adaptive BH replaces $\alpha$ with $\alpha/\hat{\pi}_0$, providing FDR control at the nominal level and substantial power gains, especially in settings with many non-nulls or conservative null distributions. Recent work introduces stopping-time selection of $\lambda$ and martingale-based proofs of exact FDR control (Gao, 2023).
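The estimator $\hat{\pi}_0(\lambda)$ and the resulting adaptive rule can be sketched as follows. This is an illustration under stated assumptions: $\lambda = 0.5$ is a fixed, arbitrary choice (not the stopping-time selection mentioned above), the estimate is not capped at 1, and the function names and $p$-values are made up for the example.

```python
def bh(pvalues, level):
    """Plain BH step-up rule; returns rejection flags."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda j: pvalues[j])
    k = 0
    for rank, j in enumerate(order, start=1):
        if pvalues[j] <= rank / m * level:
            k = rank
    reject = [False] * m
    for j in order[:k]:
        reject[j] = True
    return reject


def storey_pi0(pvalues, lam=0.5):
    """(1 + #{p_j >= lam}) / (m * (1 - lam)), as in the formula above."""
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between 0 and 1")
    count = sum(1 for p in pvalues if p >= lam)
    return (1 + count) / (len(pvalues) * (1 - lam))


def adaptive_bh(pvalues, alpha=0.05, lam=0.5):
    """BH run at level alpha / pi0_hat (pi0_hat left uncapped in this sketch)."""
    return bh(pvalues, alpha / storey_pi0(pvalues, lam))


pvals = [0.001, 0.004, 0.012, 0.021, 0.026, 0.032, 0.04, 0.3, 0.6, 0.9]
print(storey_pi0(pvals))                    # 0.6
print(sum(bh(pvals, 0.05)))                 # 3 rejections with plain BH
print(sum(adaptive_bh(pvals, alpha=0.05)))  # 7 rejections with adaptive BH
```

Here two of ten $p$-values exceed $\lambda$, giving $\hat{\pi}_0 = 3/5$ and an effective level of $0.05/0.6 \approx 0.083$, which is why the adaptive rule rejects more hypotheses than plain BH on the same data.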

Weighted and Structural Extensions

When hypotheses are partitioned into groups or structured in space, power can be increased via group- or location-adaptive weights. The structure-adaptive BH algorithm (SABHA) (Li et al., 2016) selects adaptive weights $q_i$ reflecting prior information or estimated signal enrichment under explicit structural constraints (grouped, hierarchical, spatial smoothness).

In the generalized weighted BH framework (Nandi et al., 2021), each hypothesis has a weight $W_i$ and is rejected if $P_i \leq (W_i i/N)\alpha$, provided the sum-of-inverses calibration condition is satisfied to maintain FDR control.
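To give a feel for how weights change the outcome, the sketch below uses the simplest classical form of $p$-value weighting: nonnegative weights are rescaled to average 1, each $p$-value is divided by its weight, and plain BH is applied to the result. This is deliberately not the generalized framework of Nandi et al. (2021) nor SABHA; the function name, weights, and $p$-values are assumptions made for illustration.

```python
def weighted_bh(pvalues, weights, alpha=0.05):
    """BH applied to p_i / w_i, with weights rescaled to have mean 1."""
    m = len(pvalues)
    if len(weights) != m:
        raise ValueError("need exactly one weight per p-value")
    total = sum(weights)
    if m == 0 or total <= 0:
        return [False] * m
    scaled = [w * m / total for w in weights]  # weights now average to 1
    # A zero weight means the hypothesis can never be rejected.
    q = [p / w if w > 0 else float("inf") for p, w in zip(pvalues, scaled)]
    order = sorted(range(m), key=lambda j: q[j])
    k = 0
    for rank, j in enumerate(order, start=1):
        if q[j] <= rank / m * alpha:
            k = rank
    reject = [False] * m
    for j in order[:k]:
        reject[j] = True
    return reject


pvals = [0.02, 0.5, 0.5, 0.5]
print(weighted_bh(pvals, [1, 1, 1, 1]))  # [False, False, False, False]
print(weighted_bh(pvals, [3, 1, 1, 1]))  # [True, False, False, False]
```

With equal weights the rule reduces to plain BH and rejects nothing. Up-weighting the first hypothesis (rescaled weight 2) halves its effective $p$-value to 0.01, which clears the first critical value 0.0125. The extra power for that hypothesis is paid for by down-weighting the others.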

Discrete and Mid p-value Modifications

The standard BH procedure assumes continuous $p$-values; for discrete $p$-values (common in exact or permutation tests), BH can become overly conservative (Döhler et al., 2017). Discrete-adaptive BH (DBH) recalibrates rejection thresholds using known null CDFs $F_i(\cdot)$ to exploit attainable $p$-value supports and often achieves substantial power gains, while preserving FDR control under independence.

For mid $p$-values, which are sub-uniform, applying the BH procedure can either be conservative or exceed the FDR bound depending on the largest atom; various sharp bounds and adaptive modifications exist for rigorous control in this setting (Chen et al., 2019).

4. Extensions: Dependence, Multivariate, and Functional Testing

Dependent $p$-values

When $p$-values are dependent (e.g., under spatial, temporal, or other forms of correlation), the classical procedure may become anti-conservative unless the dependence is of a positive type such as PRDS. The Benjamini-Yekutieli correction (Wang, 2022) and recent conditional calibration methods (Fithian et al., 2020) provide dependence-adjusted thresholds ensuring FDR control under various forms of positive or arbitrary dependence.

The dependence-adjusted BH (dBH) procedure calibrates a threshold for each hypothesis accounting for dependence structure, and dominates BY in power, particularly under positive regression dependence (PRDS), with theoretical and empirical validations (Fithian et al., 2020).

Multivariate and Functional Testing

For hypotheses with multivariate test statistics $Z_i$ (e.g., vectors of $z$-values), the generalized BH framework builds nested rejection regions $\{\mathcal{R}_t\}$ that monotonically increase with $t$, each calibrated to the null distribution, and employs step-down or step-up algorithms to identify optimal power regions subject to FDR control (Alishahi et al., 2016). This approach extends naturally to local false discovery rate (Lfdr) thresholding for oracle-optimality.

For infinite (functional) hypothesis families, a continuous BH procedure adapts to the Lebesgue measure of the rejection set (the "volume" of discoveries), yielding rigorous control over the functional FDR (fFDR) in domains such as spatial or temporal environmental data (Olsen et al., 2019).

5. Algorithmic, Differential Privacy, and Applicability in High-Dimensional Settings

Algorithmic Aspects

BH and its step-up/step-down variants have simple, computationally efficient implementations, typically involving sorting and thresholding. Weighted, adaptive, and discrete variants may require precomputing groupwise null proportions, null CDFs, or solving structured optimization problems, but remain feasible at current scales (hundreds of thousands of tests) (Acharya, 2014, Nandi et al., 2021, Li et al., 2016).

Differential Privacy

When $p$-values are computed over sensitive data, FDR control must be preserved under privacy constraints. Differentially private versions of BH incorporate noise via Laplace perturbations to $p$-values, combined with private top-$k$ selection and suitably shrunk critical thresholds, yielding rigorous $(\epsilon,\delta)$-privacy and near-optimal power (Dwork et al., 2015, Dwork et al., 2018). Backward submartingale arguments ensure that FDR increases by only a small multiplicative factor.

Power Considerations

The average power (expected true positive fraction) and tail power (e.g., $\lambda$ -power) under the BH procedure are explicitly characterized by law of large numbers and central limit theorems for the rejected hypothesis set as $m\to\infty$ (Izmirlian, 2018). These inform optimal design and sample size planning in genomics and other high-dimensional applications.

6. Practical Implications and Recommendations

The BH FDR procedure provides a mathematically rigorous, scalable alternative to FWER methods (e.g., Bonferroni), with guaranteed control under independence and PRDS, and a wide array of extensions to the realistic scenarios of modern data (Acharya, 2014, Wang, 2022).
Whenever null $p$-values are suspected to be non-uniform or conservative, adaptive or discrete modifications offer clear empirical and theoretical benefits (Döhler et al., 2017, Gao, 2023).
For dependent or structured hypothesis families, one should use BY correction, dBH, or SABHA, according to the hypothesized signal structure and known dependencies (Fithian et al., 2020, Li et al., 2016).
In privacy-sensitive settings, differentially private BH variants allow release of FDR-controlled discoveries with vanishing loss in accuracy as data size increases (Dwork et al., 2015, Dwork et al., 2018).
For infinite or functional hypothesis spaces, use the continuous measure-based analogs of BH, which retain thresholding geometry and FDR control (Olsen et al., 2019).

Summary Table of Core Benjamini-Hochberg FDR Procedures and Key Extensions

Procedure	FDR Guarantee Condition	Extension/Concept
BH (step-up)	Independence, PRDS	Classical baseline (Acharya, 2014, Benditkis et al., 2015)
BY correction	Arbitrary dependence	$\alpha / \sum_i (1/i)$
Storey's adaptive BH	Data-estimated $\hat{\pi}_0$	Boosts power when many hypotheses are non-null
Weighted/Structured BH	Sum-of-inverses calibration	Grouped, spatial, hierarchical
Discrete-adaptive BH	Null CDFs known, independence	For finite support $p$-values
dBH (conditional calib.)	Known dependence structure	Uniformly dominates BY
Differentially private	$(\epsilon,\delta)$-privacy	Minor penalty in FDR/power
Generalized/MV/functional	Nested region, measure geometry	Multivariate, infinite hypotheses

7. References

Y. Benjamini and Y. Hochberg, "Controlling the false discovery rate: A practical and powerful approach to multiple testing," J. Roy. Statist. Soc. B 57(1): 289–300, 1995 (Acharya, 2014, Benditkis et al., 2015).
Y. Benjamini and D. Yekutieli, "The control of the false discovery rate in multiple testing under dependency," Ann. Statist. 29(4): 1165–1188, 2001 (Wang, 2022).
Armstrong, T., "False Discovery Rate Adjustments for Average Significance Level Controlling Tests" (Armstrong, 2022).
Blanchard, P., Neuvial, P., Roquain, E., "Improving the Benjamini-Hochberg Procedure for Discrete Tests" (Döhler et al., 2017).
Chen, X., Sarkar, S. K., "On Benjamini-Hochberg procedure applied to mid p-values" (Chen et al., 2019).
Li, A., Barber, R. F., "Multiple testing with the structure adaptive Benjamini-Hochberg algorithm" (Li et al., 2016).
Nandi, S. S., Sarkar, S. K., "Controlling the False Discovery Rate in Complex Multi-Way Classified Hypotheses" (Nandi et al., 2021).
Dwork, C., Su, W., Zhang, L., "Private False Discovery Rate Control" (Dwork et al., 2015).
Dwork, C., Su, W., Zhang, L., "Differentially Private False Discovery Rate Control" (Dwork et al., 2018).