Adaptive FDP Estimators

Updated 17 December 2025

Adaptive FDP estimators are statistical procedures that estimate or bound the false discovery proportion (FDP) using data-driven choices, often with finite-sample guarantees.
They employ adaptive techniques, including linear step-up procedures, competition-based bounds, and high-dimensional regression adjustments, to improve power over classical methods.
Empirical evaluations and theoretical guarantees demonstrate that these estimators offer tight simultaneous confidence envelopes and scalability, even under differential privacy constraints.

An adaptive FDP estimator is a statistical procedure or formula that estimates or bounds the false discovery proportion (FDP), defined as the ratio $V/R$ of the number $V$ of false discoveries (true nulls rejected) to the total number $R$ of rejections, in a way that leverages data-driven or parameter-adaptive choices, often with finite-sample and/or distributional guarantees. The recent literature examines adaptive FDP estimation under classical multiple testing (via p-values), competition-based FDR control (e.g., knockoffs), high-dimensional regression, and differential privacy, with several lines of inquiry on estimator consistency, simultaneous confidence envelopes, post-hoc flexibility, and efficiency.

1. Fundamental Definitions and Context

Let $H_1,\dots,H_m$ denote $m$ null hypotheses tested simultaneously. Given a selection rule that rejects $R$ hypotheses (among which $V$ are falsely rejected true nulls), the false discovery proportion is

$\text{FDP} = \frac{V}{R} \quad (0/0 := 0),$

and the false discovery rate is $\text{FDR} = \mathbb{E}[\text{FDP}]$. Adaptive FDP estimation is distinguished by procedures that depend on data-driven estimators of nuisance parameters, typically the proportion of true nulls, or involve post hoc bounds informed by the observed data structure (Ditzhaus et al., 2018; Ebadi et al., 2023; Hemerik et al., 2022; Jeng et al., 2018).
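As a minimal sketch of the definition, the following computes $V$, $R$, and the FDP for a toy rejection set, using the convention $0/0 := 0$. The function name and the toy data are illustrative assumptions; in practice the true null status is unknown, which is why the FDP has to be estimated or bounded.

```python
def false_discovery_proportion(is_true_null, rejected):
    """Return V / R, with the convention 0/0 := 0."""
    if len(is_true_null) != len(rejected):
        raise ValueError("inputs must have the same length")
    # R: total number of rejections
    r = sum(1 for rej in rejected if rej)
    # V: rejections that are in fact true nulls
    v = sum(1 for null, rej in zip(is_true_null, rejected) if null and rej)
    return v / r if r > 0 else 0.0


# Toy example (hypothetical data): 5 hypotheses, 3 rejections, 1 false discovery.
truth = [True, True, False, False, False]
reject = [True, False, True, True, False]
fdp = false_discovery_proportion(truth, reject)
```

The FDR is the expectation of this quantity over repeated data sets, so a single realized FDP can lie well above or below it.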

2. Adaptive FDP Estimators in Classical Multiple Testing

Adaptive linear step-up procedures, which generalize the Benjamini-Hochberg (BH) procedure, use an estimator $\widehat m_0$ of the unknown number $m_0$ of true nulls. Adapting the critical values to $\widehat m_0$ makes the test more powerful:

$\widehat\alpha_{i:m} = \min \left\{ \frac{i}{\widehat m_0} \alpha, \; \lambda \right\}, \quad i = 1,\dots, m,$

with $\lambda \in [\alpha,1)$ a tuning constant. Estimating $m_0$ (e.g., via convex combinations of generalized Storey-type estimators) and restricting to p-values below $\lambda$ yields exact formulas for all moments of the FDP in terms of $V_m(\lambda)$, the number of true nulls with p-values at most $\lambda$:

$\text{FDR}_m = \mathbb{E}\left[ \frac{\alpha}{\lambda} \frac{V_m(\lambda)}{\widehat m_0} \right],$

$\text{Var}\left( \frac{V_m}{R_m} \right) = \frac{\alpha^2}{\lambda^2}\left[ \frac{\lambda}{\alpha} \mathbb{E}\left[\frac{V_m(\lambda)}{\widehat m_0} \mathbb{E}[R_m^{(1,\lambda)}{}^{-1} \mid \mathcal{F}_\lambda]\right] + \text{Var}\left(\frac{V_m(\lambda)}{\widehat m_0}\right) - \mathbb{E}\left(\frac{V_m(\lambda)}{\widehat m_0^2}\right) \right]$

and similarly for higher moments (Ditzhaus et al., 2018). Estimator stability and abundance of rejections are necessary and sufficient for consistency ($\text{FDP}_m - \text{FDR}_m \to 0$ in probability).
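The adaptive critical values $\min\{ i\alpha/\widehat m_0, \lambda \}$ can be illustrated with a short sketch. It assumes the common Storey-type estimator $\widehat m_0 = (\#\{p_i > \lambda\} + 1)/(1-\lambda)$, which is one possible choice and not necessarily the estimator analyzed in the cited work; the function names and p-values are hypothetical.

```python
def storey_m0(pvalues, lam):
    """Storey-type estimate of the number of true nulls (assumed form)."""
    return (sum(1 for p in pvalues if p > lam) + 1) / (1.0 - lam)


def adaptive_step_up(pvalues, alpha, lam):
    """Step-up test with critical values min(i * alpha / m0_hat, lam).

    Returns the sorted indices of rejected hypotheses and m0_hat.
    """
    m = len(pvalues)
    m0_hat = storey_m0(pvalues, lam)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0  # largest rank i whose ordered p-value is below its critical value
    for i, idx in enumerate(order, start=1):
        crit = min(i * alpha / m0_hat, lam)
        if pvalues[idx] <= crit:
            k = i
    return sorted(order[:k]), m0_hat


# Hypothetical p-values: only two exceed lam, so m0_hat = 3 / 0.5 = 6 < m = 10.
pvals = [0.001, 0.004, 0.012, 0.018, 0.03, 0.04, 0.2, 0.45, 0.6, 0.9]
rejected, m0_hat = adaptive_step_up(pvals, alpha=0.05, lam=0.5)
```

On this toy input the plain BH critical values $i\alpha/m$ would stop after four rejections, while the adaptive values admit six, which illustrates where the power gain comes from when $\widehat m_0 < m$.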

A large class of adaptive estimators involves bin-wise partitions $(\lambda_{i-1}, \lambda_i]$ and weights $\beta_i$,

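$\widetilde m_0 = \sum_{i=1}^k \beta_i\,\widetilde m_0(\lambda_{i-1}, \lambda_i), \qquad \widehat m_0 = \max \left\{ \widetilde m_0, \frac{\alpha}{\lambda} R_m(\lambda) \right\},$

where $R_m(\lambda)$ is the number of p-values at most $\lambda$. Taking the maximum ensures $\frac{i}{\widehat m_0}\alpha \leq \lambda$ for all $i \leq R_m(\lambda)$, which is what keeps the quantity inside the FDR formula, $\frac{\alpha}{\lambda} \frac{V_m(\lambda)}{\widehat m_0}$, bounded by one. These estimators are shown to control FDR at finite $m$, with strictly improved asymptotic FDR and power over BH when $\kappa_0 < 1$, with $\kappa_0$ read here as the limiting proportion of true nulls (Ditzhaus et al., 2018).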

3. Data-Adaptive FDP Bounds and Simultaneous Envelopes

Recent developments provide median-unbiased and simultaneous envelopes for the FDP, valid over entire rejection paths and allowing post hoc choice of target FDP (denoted $\gamma$). Under symmetry and mild stochastic ordering of true null p-values, one has

$V(t) \leq \overline V(t) := \#\{i : p_i \geq 1 - t\},$

yielding the adaptive bound

$\overline{\text{FDP}}(t) := \frac{\overline V(t)}{R(t)},$

with $P[\text{FDP}(t) \leq \overline{\text{FDP}}(t)] \geq 0.5$, i.e., the bound is median unbiased for each fixed $t$. To validly select $t$ post hoc, the envelope is strengthened to hold simultaneously over a set $T$ of thresholds, using a family $\mathcal{B}$ of candidate envelope functions:

$\tilde B = \min\{B \in \mathcal{B} : B(t) \geq \overline V(t) \; \forall t \in T\},$

ensuring $P[\forall t \in T: \text{FDP}(t) \leq \tilde B(t)/R(t)] \geq 0.5$ (Hemerik et al., 2022). These approaches yield computationally efficient algorithms (linear time after sorting) and procedures that control the median of the FDP (mFDP), with enhanced flexibility and interpretable adjusted p-values.
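The pointwise bound $\overline V(t)/R(t)$ is simple to compute. The sketch below assumes that $R(t)$ counts p-values at most $t$ and uses $R(t) \vee 1$ in the denominator to avoid division by zero; both are assumptions made for illustration. It covers only a single fixed $t$, not the simultaneous envelope $\tilde B$, which depends on the choice of the family $\mathcal{B}$.

```python
def median_fdp_bound(pvalues, t):
    """Pointwise median-unbiased FDP bound at a fixed threshold t.

    Returns (bound, v_bar, r), where
      r     = #{i : p_i <= t}       (rejections, assumed definition)
      v_bar = #{i : p_i >= 1 - t}   (mirror count bounding false discoveries)
    """
    r = sum(1 for p in pvalues if p <= t)
    v_bar = sum(1 for p in pvalues if p >= 1.0 - t)
    return v_bar / max(r, 1), v_bar, r


# Hypothetical p-values: four fall below 0.05 and two lie above 0.95.
pvals = [0.001, 0.01, 0.02, 0.04, 0.3, 0.55, 0.7, 0.97, 0.99]
bound, v_bar, r = median_fdp_bound(pvals, t=0.05)
```

The mirror count works because, under the symmetry assumption, a true null p-value is as likely to fall in $[1-t, 1]$ as in $[0, t]$.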

4. Competition-Based Adaptive FDP Bounds

In competition-based FDR control (e.g., knockoffs, target-decoy competition), adaptive FDP bounds are constructed via negative-binomial processes. Each hypothesis $i$ is recorded as either a target win or a decoy win, with label $L_i = +1$ or $-1$ respectively. For any cutoff $k$, let $D_k$ denote the number of decoy wins and $T_k$ the number of target wins up to that cutoff. The FDP is $Q_k = V_k/(T_k \vee 1)$, where $V_k$ is the number of false discoveries among the target wins.
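The counts $D_k$ and $T_k$ can be accumulated in one pass over the ranked hypotheses. This sketch assumes that hypotheses are ranked by a score, with larger scores meaning stronger evidence, and that cutoff $k$ refers to the top $k$ of that ranking; the function name and data are hypothetical.

```python
def competition_counts(scores, labels):
    """Return [(D_k, T_k)] for k = 1..n, ranking by decreasing score.

    labels[i] is +1 for a target win and -1 for a decoy win.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    decoys = targets = 0
    counts = []
    for i in order:
        if labels[i] == 1:
            targets += 1
        elif labels[i] == -1:
            decoys += 1
        else:
            raise ValueError("labels must be +1 or -1")
        counts.append((decoys, targets))
    return counts


# Hypothetical scores and win labels for six hypotheses.
scores = [9.1, 7.4, 6.8, 5.0, 4.2, 3.3]
labels = [1, 1, -1, 1, -1, 1]
counts = competition_counts(scores, labels)
```

Only $D_k$ and $T_k$ are observable; $V_k$ is not, which is why the decoy count serves as the basis for bounding it.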

Two principal adaptive upper bands for FDP are developed:

Standardized band (TDC-SB):

$\xi_d^{SB} = B d + z^{1-\gamma}_\Delta \sqrt{B(1+B)d},$

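with $U_d \sim \text{NB}(d, R)$ and $z^{1-\gamma}_\Delta$ the quantile of the standardized process. The band has a center-plus-scaled-deviation form: a negative binomial count of failures before the $d$-th success, with success probability $1/(1+B)$, has mean $Bd$ and variance $B(1+B)d$.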

Uniform band (TDC-UB):

$\xi_d^{UB} = \beta_d^{1-u_\gamma(\Delta)},$

where $u_\gamma(\Delta)$ is the largest threshold guaranteeing $P[\exists d : \tilde U_d \leq u] \leq \gamma$ .

Both bands are shown to adapt tightly to the data (especially for small decoy counts) and empirically outperform the Katsevich–Ramdas bound in diverse settings, maintaining finite-sample exactness and scalability (Ebadi et al., 2023).
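To make the shape of the standardized band concrete, the sketch below evaluates $\xi_d^{SB} = Bd + z\sqrt{B(1+B)d}$ for given inputs. The constant $B$ and the quantile $z$ (standing in for $z^{1-\gamma}_\Delta$) are taken as supplied arguments; computing that quantile from the standardized process is not shown, and the numeric values used are placeholders, not values from the cited work.

```python
import math


def standardized_band(d, b, z):
    """Evaluate xi_d = b*d + z*sqrt(b*(1+b)*d) for decoy count d >= 1.

    b and z are assumed to be given; z stands in for the quantile of the
    standardized process, which this sketch does not compute.
    """
    if d < 1 or b <= 0:
        raise ValueError("need d >= 1 and b > 0")
    return b * d + z * math.sqrt(b * (1.0 + b) * d)


# Placeholder inputs: b = 1 and z = 2 are illustrative, not calibrated values.
band = [standardized_band(d, b=1.0, z=2.0) for d in (1, 2, 8)]
```

The linear term grows like $d$ while the deviation term grows like $\sqrt{d}$, so the relative width of the band is largest at small decoy counts.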

5. High-Dimensional Regression: Adaptive FDP Estimation

In high-dimensional regression, the de-sparsified Lasso (“DLasso”) estimator provides adaptive FDP estimation for variable selection. The statistic

$z_j = \sqrt{n}\hat b_j / [\sigma \sqrt{\hat \Omega_{jj}}],$

is used to rank predictors. The plug-in FDP estimator

$\widehat{\text{FDP}}(t) = \frac{2p\Phi(-t)}{R_z(t) \vee 1},$

approximates the expected number of false discoveries by $2p\Phi(-t)$, the expected number of statistics with $|z_j| \geq t$ if all $p$ of them were standard normal. The threshold $t_\alpha = \inf\{t : \widehat{\text{FDP}}(t) \leq \alpha\}$ yields consistent FDP control under standard design and sparsity assumptions (Jeng et al., 2018).
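The plug-in estimator and its threshold can be sketched as follows, taking the statistics $z_j$ as already computed. The sketch assumes $R_z(t) = \#\{j : |z_j| \geq t\}$ and searches only over the observed values $|z_j|$, a simplification that should select the same variables as the infimum because $R_z(t)$ is constant between consecutive observed values. Function names and data are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def fdp_hat(z, t):
    """Plug-in estimate 2 * p * Phi(-t) / max(R_z(t), 1)."""
    p = len(z)
    r = sum(1 for zj in z if abs(zj) >= t)  # assumed definition of R_z(t)
    return 2.0 * p * normal_cdf(-t) / max(r, 1)


def select_threshold(z, alpha):
    """Smallest observed |z_j| with fdp_hat <= alpha, or None if none exists."""
    for t in sorted(abs(zj) for zj in z):
        if fdp_hat(z, t) <= alpha:
            return t
    return None


# Hypothetical statistics: three large values among ten.
z = [4.5, -3.9, 3.2, 0.3, -1.1, 0.8, -0.2, 1.5, 0.1, -0.6]
t_alpha = select_threshold(z, alpha=0.1)
selected = [j for j, zj in enumerate(z) if t_alpha is not None and abs(zj) >= t_alpha]
```

Using $p$ rather than the unknown number of null predictors in the numerator makes the estimate err on the conservative side.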

6. Adaptive Estimation under Differential Privacy Constraints

Federated differential privacy shares the abbreviation FDP but is a different notion from the false discovery proportion; in this section FDP refers to the privacy constraint, which introduces new challenges for adaptation. In federated density estimation, servers add carefully tuned exponential noise, measured in a multiscale oscillation norm, to wavelet coefficient estimates, yielding $(\epsilon,0)$-FDP:

$T_{lk}^{(j)} = \hat f_{lk}^{(j)} + V_{lk}^{(j)}, \quad V_{lk}^{(j)} \text{ sampled from density } \propto \exp(-\epsilon n \|v\|_{V_{L^*}}).$

Post-processing via block thresholding produces estimators attaining sharp adaptive rates:

$E_f\|\hat f - f\|_2^2 \lesssim N^{-2\alpha/(2\alpha+1)} + \left( \frac{\log N}{mn^2\epsilon^2} \right)^{2\alpha/(2\alpha+2)},$

with analogous bounds for pointwise estimation. Lower bounds demonstrate that adaptation in global risk under FDP incurs an intrinsic $\log N$ factor in the privacy term, and the pointwise risk incurs two such factors, reflecting the unavoidable privacy-adaptation trade-off (Cai et al., 16 Dec 2025).
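The two terms of the rate bound can be compared numerically to see which one dominates. The sketch below evaluates them up to the unspecified constant, treating $N$, $m$, $n$, $\epsilon$ and the exponent parameter $\alpha$ (named `smooth` here) as inputs. Setting $N = mn$ in the example is an assumption made for illustration.

```python
import math


def rate_terms(big_n, m, n, eps, smooth):
    """Return (statistical term, privacy term) of the rate bound,
    ignoring the unspecified multiplicative constant."""
    if big_n <= 1 or m < 1 or n < 1 or eps <= 0 or smooth <= 0:
        raise ValueError("invalid inputs")
    stat = big_n ** (-2.0 * smooth / (2.0 * smooth + 1.0))
    priv = (math.log(big_n) / (m * n ** 2 * eps ** 2)) ** (
        2.0 * smooth / (2.0 * smooth + 2.0)
    )
    return stat, priv


# Illustrative values: 20 servers with 500 observations each, N assumed to be m*n.
m, n, smooth = 20, 500, 2.0
stat, priv_loose = rate_terms(m * n, m, n, eps=1.0, smooth=smooth)
_, priv_strict = rate_terms(m * n, m, n, eps=0.1, smooth=smooth)
```

With these illustrative inputs the privacy term is smaller than the statistical term at $\epsilon = 1$ and larger at $\epsilon = 0.1$, showing how a stricter privacy budget can come to dominate the risk.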

7. Comparative Properties and Empirical Performance

Adaptive FDP estimators expand on classical FDR control by:

Providing exact finite-sample moment formulas for the FDP (Ditzhaus et al., 2018).
Enabling post hoc selection of target FDP levels with simultaneous envelope guarantees (Hemerik et al., 2022).
Offering tighter bounds, especially at low decoy counts, when benchmarking against existing simultaneous FDP-control bands (Ebadi et al., 2023).
Achieving consistency and minimax rates, even in challenging regimes (high-dimensional, privacy constrained), where classical approaches may suffer conservatism or inefficiency (Jeng et al., 2018, Cai et al., 16 Dec 2025).

Empirical evaluations confirm that adaptive procedures maintain nominal control and improve power or tightness against competitors. A plausible implication is that envelope-based or negative-binomial-process bounds can yield sharper guarantees and computational tractability, even in the presence of dependence or unknown null proportions.

In summary, adaptive FDP estimators encompass a broad family of data-driven procedures for simultaneous multiple testing, variable selection, competition frameworks, and privacy-preserving inference. They achieve finite-sample validity, enhanced flexibility, and improved performance relative to traditional mean-FDP control, with rigorous consistency and minimax adaptivity in diverse statistical models.