
Adaptive FDP Estimators

Updated 17 December 2025
  • Adaptive FDP estimators are statistical procedures that use data-driven methods to accurately estimate the false discovery proportion while ensuring finite-sample guarantees.
  • They employ adaptive techniques, including linear step-up procedures, competition-based bounds, and high-dimensional regression adjustments, to improve power over classical methods.
  • Empirical evaluations and theoretical guarantees demonstrate that these estimators offer tight simultaneous confidence envelopes and scalability, even under differential privacy constraints.

An adaptive FDP estimator is a statistical procedure or formula that estimates or bounds the false discovery proportion (FDP), defined as the ratio $V/R$ of the number $V$ of false discoveries (true nulls rejected) to the total number $R$ of rejections, in a way that leverages data-driven or parameter-adaptive choices, often with finite-sample and/or distributional guarantees. The recent literature examines adaptive FDP estimation under classical multiple testing (via p-values), competition-based FDR control (e.g., knockoffs), high-dimensional regression, and differential privacy, with several lines of inquiry on estimator consistency, simultaneous confidence envelopes, post-hoc flexibility, and efficiency.

1. Fundamental Definitions and Context

Let $H_1,\dots,H_m$ denote $m$ null hypotheses tested simultaneously. Given a selection rule that rejects $R$ hypotheses (among which $V$ are falsely rejected true nulls), the false discovery proportion is

$$\text{FDP} = \frac{V}{R} \quad (0/0 := 0),$$

and the false discovery rate is $\text{FDR} = \mathbb{E}[\text{FDP}]$. Adaptive FDP estimation is distinguished by procedures that depend on data-driven estimators of nuisance parameters, typically the proportion of true nulls, or involve post hoc bounds informed by the observed data structure (Ditzhaus et al., 2018, Ebadi et al., 2023, Hemerik et al., 2022, Jeng et al., 2018).
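As a minimal illustration of the definition, the sketch below computes the FDP from a rejection indicator and a ground-truth null indicator, applying the $0/0 := 0$ convention. The function name and the toy inputs are hypothetical; in practice the true-null labels are unknown, which is why the FDP has to be estimated or bounded.

```python
def false_discovery_proportion(rejected, is_true_null):
    """Return V / R, with the convention 0/0 := 0.

    rejected[i]     -- True if hypothesis i was rejected
    is_true_null[i] -- True if hypothesis i is a true null (ground truth)
    """
    r = sum(1 for rej in rejected if rej)
    v = sum(1 for rej, null in zip(rejected, is_true_null) if rej and null)
    return v / r if r > 0 else 0.0


# Toy example (made-up labels): 3 rejections, 1 of them a true null.
rejected = [True, True, False, True, False]
is_true_null = [False, True, True, False, True]
fdp = false_discovery_proportion(rejected, is_true_null)
print(fdp)
```

The FDR is the expectation of this quantity over repeated experiments, so a single data set only ever yields one realization of the FDP.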

2. Adaptive FDP Estimators in Classical Multiple Testing

Adaptive linear step-up procedures, generalizing the Benjamini-Hochberg (BH) procedure, use an estimator $\widehat m_0$ for the (unknown) number $m_0$ of true nulls. This enables more powerful testing by adapting critical values:

$$\widehat\alpha_{i:m} = \min \left\{ \frac{i}{\widehat m_0} \alpha, \; \lambda \right\}, \quad i = 1,\dots, m,$$

with $\lambda \in [\alpha,1)$ a tuning constant. Estimating $m_0$ (e.g., via convex combinations of generalized Storey-type estimators) and restricting to p-values below $\lambda$ yields exact formulas for all moments of the FDP:

$$\text{FDR}_m = \mathbb{E}\left[ \frac{\alpha}{\lambda} \frac{V_m(\lambda)}{\widehat m_0} \right],$$

$$\text{Var}\left( \frac{V_m}{R_m} \right) = \frac{\alpha^2}{\lambda^2}\left[ \frac{\lambda}{\alpha} \mathbb{E}\left[\frac{V_m(\lambda)}{\widehat m_0} \mathbb{E}[R_m^{(1,\lambda)}{}^{-1} \mid \mathcal{F}_\lambda]\right] + \text{Var}\left(\frac{V_m(\lambda)}{\widehat m_0}\right) - \mathbb{E}\left(\frac{V_m(\lambda)}{\widehat m_0^2}\right) \right]$$

and similarly for higher moments (Ditzhaus et al., 2018). Estimator stability and abundance of rejections are necessary and sufficient for consistency ($\text{FDP}_m - \text{FDR}_m \to 0$ in probability).

A large class of adaptive estimators involves bin-wise partitions $(\lambda_{i-1}, \lambda_i]$ and weights $\beta_i$,

$$\widetilde m_0 = \sum_{i=1}^k \beta_i\,\widetilde m_0(\lambda_{i-1}, \lambda_i), \qquad \widehat m_0 = \min \left\{ \widetilde m_0, \frac{\alpha}{\lambda} R_m(\lambda) \right\}.$$

These are shown to control FDR at finite $m$, with strictly improved asymptotic FDR and power over BH when $\kappa_0 < 1$ (Ditzhaus et al., 2018).
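The adaptive step-up idea can be sketched in a few lines. The code below is a simplified illustration, not the estimator class of Ditzhaus et al.: it swaps in the classical single-threshold Storey estimate $(\#\{p_i > \lambda\} + 1)/(1 - \lambda)$ for the number of true nulls, rather than the bin-wise convex combination described above, and then applies the critical values $\min\{(i/\widehat m_0)\alpha, \lambda\}$ in a step-up fashion. Function names and the toy p-values are hypothetical.

```python
def storey_m0(pvalues, lam):
    """Classical Storey-type estimate of the number of true nulls.

    Assumption: single tuning threshold lam, with the usual +1 correction.
    """
    above = sum(1 for p in pvalues if p > lam)
    return (above + 1) / (1.0 - lam)


def adaptive_step_up(pvalues, alpha, lam):
    """Indices rejected by a step-up test with critical values
    min(i / m0_hat * alpha, lam), i = 1..m."""
    m = len(pvalues)
    m0_hat = storey_m0(pvalues, lam)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0  # largest rank whose ordered p-value is below its critical value
    for rank, idx in enumerate(order, start=1):
        crit = min(rank / m0_hat * alpha, lam)
        if pvalues[idx] <= crit:
            k = rank
    return sorted(order[:k])


# Toy p-values (made up): six small ones, four larger ones.
pvalues = [0.001, 0.004, 0.012, 0.03, 0.045, 0.07, 0.2, 0.45, 0.6, 0.9]
rejected = adaptive_step_up(pvalues, alpha=0.1, lam=0.5)
print(storey_m0(pvalues, 0.5), rejected)
```

On this toy input the estimate of $m_0$ is 6 rather than $m = 10$, so the critical values are larger than the BH values $i\alpha/m$ and the sixth-smallest p-value (0.07) is rejected, whereas plain BH at level 0.1 would stop after five rejections.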

3. Data-Adaptive FDP Bounds and Simultaneous Envelopes

Recent developments provide median-unbiased and simultaneous envelopes for the FDP, valid over entire rejection paths and allowing post hoc choice of target FDP (denoted $\gamma$). Under symmetry and mild stochastic ordering of true null p-values, one has

$$V(t) \leq \overline V(t) := \#\{i : p_i \geq 1 - t\},$$

yielding the adaptive bound

$$\overline{\text{FDP}}(t) := \frac{\overline V(t)}{R(t)},$$

with $P[\text{FDP}(t) \leq \overline{\text{FDP}}(t)] \geq 0.5$. To validly select $t$ post hoc, the envelope is strengthened:

$$\tilde B(t) = \min\{B \in \mathcal{B} : B(t) \geq \overline V(t) \; \forall t \in T\},$$

ensuring $P[\forall t \in T: \text{FDP}(t) \leq \tilde B(t)/R(t)] \geq 0.5$ (Hemerik et al., 2022). These approaches yield computationally efficient algorithms (linear time after sorting) and mFDP-controlling procedures, with enhanced flexibility and interpretable adjusted p-values.
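The pointwise median bound is simple enough to compute directly. The sketch below counts p-values at or above $1 - t$ as the bound on false discoveries and divides by the number of rejections, assuming $R(t) = \#\{i : p_i \leq t\}$ (the text does not spell out the rejection rule). Capping the ratio at 1 and returning 0 when nothing is rejected are conveniences added here. The simultaneous envelope $\tilde B$ is not implemented, since it depends on the chosen family $\mathcal{B}$.

```python
def median_fdp_bound(pvalues, t):
    """Return (V_bar, R, FDP_bar) for a fixed threshold t.

    V_bar = #{i : p_i >= 1 - t} bounds the number of false discoveries.
    R     = #{i : p_i <= t} is the assumed number of rejections.
    """
    v_bar = sum(1 for p in pvalues if p >= 1.0 - t)
    r = sum(1 for p in pvalues if p <= t)
    if r == 0:
        return v_bar, r, 0.0  # no rejections: FDP is 0 by convention
    return v_bar, r, min(1.0, v_bar / r)


# Toy p-values (made up): five below 0.05, two at or above 0.95.
pvalues = [0.001, 0.01, 0.02, 0.03, 0.04, 0.3, 0.5, 0.7, 0.96, 0.99]
print(median_fdp_bound(pvalues, t=0.05))
```

Here two p-values fall in the mirrored region $[0.95, 1]$ against five rejections, giving a median bound of 0.4 on the FDP at $t = 0.05$. This holds with probability at least 0.5 for a fixed $t$ only. Scanning $t$ and picking the most favorable value requires the strengthened envelope.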

4. Competition-Based Adaptive FDP Bounds

In competition-based FDR control (e.g., knockoffs, target-decoy competition), adaptive FDP bounds are constructed via negative-binomial processes. For each hypothesis $i$, one defines target and decoy wins, with the label $L_i = +1$ or $-1$. For any cutoff $k$, denote $D_k$ the decoy count and $T_k$ the target wins, with FDP $Q_k = V_k/(T_k \vee 1)$.
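The running counts that these bounds are built on can be sketched as follows. The code assumes the hypotheses are already sorted from most to least significant competition score and that a label of +1 marks a target win and -1 a decoy win; the text does not fix this sign convention, so it is an assumption here. It returns the target and decoy counts at every cutoff and does not compute either band.

```python
from itertools import accumulate


def competition_counts(labels):
    """Cumulative target wins T_k and decoy wins D_k for k = 1..len(labels).

    labels -- sequence of +1 (target win, assumed) or -1 (decoy win, assumed),
              ordered from the best score to the worst.
    """
    targets = list(accumulate(1 if lab == 1 else 0 for lab in labels))
    decoys = list(accumulate(1 if lab == -1 else 0 for lab in labels))
    return targets, decoys


# Toy label sequence (made up).
labels = [1, 1, 1, -1, 1, 1, -1, 1]
targets, decoys = competition_counts(labels)
print(targets)  # [1, 2, 3, 3, 4, 5, 5, 6]
print(decoys)   # [0, 0, 0, 1, 1, 1, 2, 2]
```

The number of false target wins $V_k$ is unobserved; the decoy count serves as its observable proxy, which is what the bands turn into an upper bound on $Q_k$.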

Two principal adaptive upper bands for FDP are developed:

  • Standardized band (TDC-SB):

$$\xi_d^{SB} = B d + z^{1-\gamma}_\Delta \sqrt{B(1+B)d},$$

with $U_d \sim \text{NB}(d, R)$ and $z^{1-\gamma}_\Delta$ the quantile of the standardized process.

  • Uniform band (TDC-UB):

$$\xi_d^{UB} = \beta_d^{1-u_\gamma(\Delta)},$$

where $u_\gamma(\Delta)$ is the largest threshold guaranteeing $P[\exists d : \tilde U_d \leq u] \leq \gamma$.

Both bands are shown to adapt tightly to the data (especially for small decoy counts) and empirically outperform the Katsevich–Ramdas bound in diverse settings, maintaining finite-sample exactness and scalability (Ebadi et al., 2023).

5. High-Dimensional Regression: Adaptive FDP Estimation

In high-dimensional regression, the de-sparsified Lasso (“DLasso”) estimator provides adaptive FDP estimation for variable selection. The statistic

$$z_j = \sqrt{n}\,\hat b_j / [\sigma \sqrt{\hat \Omega_{jj}}],$$

is used to rank predictors. The plug-in FDP estimator

$$\widehat{\text{FDP}}(t) = \frac{2p\Phi(-t)}{R_z(t) \vee 1},$$

approximates the expected number of false discoveries via Normal tail probabilities. The threshold $t_\alpha = \inf\{t : \widehat{\text{FDP}}(t) \leq \alpha\}$ yields consistent FDP control under standard design and sparsity assumptions (Jeng et al., 2018).
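A rough sketch of the plug-in estimator and its threshold follows. It assumes the $z_j$ have already been computed, takes $R_z(t)$ to be the number of predictors with $|z_j| \geq t$ (an assumption, since the text does not define it), and approximates the infimum by searching only over the observed $|z_j|$ values. Function names and the toy statistics are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def plug_in_fdp(z, t):
    """2 * p * Phi(-t) / max(R_z(t), 1), with R_z(t) = #{j : |z_j| >= t}."""
    p = len(z)
    r = sum(1 for zj in z if abs(zj) >= t)
    return 2.0 * p * normal_cdf(-t) / max(r, 1)


def plug_in_threshold(z, alpha):
    """Smallest observed |z_j| whose estimated FDP is at most alpha.

    Grid approximation to the infimum; returns None if no candidate works.
    """
    for t in sorted(abs(zj) for zj in z):
        if plug_in_fdp(z, t) <= alpha:
            return t
    return None


# Toy statistics (made up): three strong signals among ten predictors.
z = [4.2, -3.8, 3.5, 0.3, -0.9, 1.1, -0.2, 0.6, -1.4, 0.1]
t_alpha = plug_in_threshold(z, alpha=0.1)
selected = [j for j, zj in enumerate(z) if abs(zj) >= t_alpha]
print(t_alpha, selected)
```

On this input the search stops at $t = 3.5$, selecting the three predictors with the largest $|z_j|$. Because the numerator decreases continuously in $t$ while $R_z(t)$ is piecewise constant, the exact infimum can lie slightly below an observed value; the grid search returns a marginally conservative threshold.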

6. Adaptive Estimation under Differential Privacy Constraints

Federated differential privacy (also abbreviated FDP, but unrelated to the false discovery proportion) introduces new challenges for adaptation. In federated density estimation, servers add carefully tuned exponential noise in a multiscale oscillation norm to wavelet coefficient estimates, yielding an $(\epsilon, 0)$-FDP guarantee:

$$T_{lk}^{(j)} = \hat f_{lk}^{(j)} + V_{lk}^{(j)}, \quad V_{lk}^{(j)} \text{ sampled from density } \propto \exp(-\epsilon n \|v\|_{V_{L^*}}).$$

Post-processing via block thresholding produces estimators attaining sharp adaptive rates:

$$E_f\|\hat f - f\|_2^2 \lesssim N^{-2\alpha/(2\alpha+1)} + \left( \frac{\log N}{mn^2\epsilon^2} \right)^{2\alpha/(2\alpha+2)},$$

with analogous bounds for pointwise estimation. Here $\alpha$ denotes the smoothness of the density $f$ rather than a significance level. Lower bounds demonstrate that adaptation in global risk under FDP incurs an intrinsic $\log N$ factor in the privacy term, and the pointwise risk incurs two such factors, reflecting the unavoidable privacy-adaptation trade-off (Cai et al., 16 Dec 2025).
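To get a feel for when the privacy term dominates the global risk bound, the two terms can be evaluated numerically. The sketch below ignores the constants hidden by the rate notation, and the example takes $N = mn$ (that is, $m$ servers with $n$ observations each), which is an assumption about the notation rather than something stated above. All numerical values are made up.

```python
import math


def rate_terms(N, m, n, eps, smoothness):
    """Return (non-private term, privacy term) of the global risk bound.

    Constants are ignored, so only the relative order of magnitude
    of the two terms is meaningful.
    """
    a = smoothness
    nonprivate = N ** (-2.0 * a / (2.0 * a + 1.0))
    private = (math.log(N) / (m * n ** 2 * eps ** 2)) ** (2.0 * a / (2.0 * a + 2.0))
    return nonprivate, private


m, n = 100, 1000
N = m * n  # assumption: total sample size across servers
loose = rate_terms(N, m, n, eps=1.0, smoothness=2.0)
strict = rate_terms(N, m, n, eps=0.01, smoothness=2.0)
print(loose)
print(strict)
```

With these values the non-private term dominates at $\epsilon = 1$, while at $\epsilon = 0.01$ the privacy term is the larger of the two, illustrating how a tighter privacy budget shifts the bottleneck.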

7. Comparative Properties and Empirical Performance

Adaptive FDP estimators expand on classical FDR control by:

  • Providing exact finite-sample moment formulas for the FDP (Ditzhaus et al., 2018).
  • Enabling post hoc selection of target FDP levels with simultaneous envelope guarantees (Hemerik et al., 2022).
  • Offering tighter bounds, especially at low decoy counts, when benchmarking against existing simultaneous FDP-control bands (Ebadi et al., 2023).
  • Achieving consistency and minimax rates, even in challenging regimes (high-dimensional, privacy-constrained), where classical approaches may suffer conservatism or inefficiency (Jeng et al., 2018, Cai et al., 16 Dec 2025).

Empirical evaluations confirm that adaptive procedures maintain nominal control and improve power or tightness against competitors. A plausible implication is that envelope-based or negative-binomial-process bounds can yield sharper guarantees and computational tractability, even in the presence of dependence or unknown null proportions.


In summary, adaptive FDP estimators encompass a broad family of data-driven procedures for simultaneous multiple testing, variable selection, competition frameworks, and privacy-preserving inference. They achieve finite-sample validity, enhanced flexibility, and improved performance relative to traditional mean-FDP control, with rigorous consistency and minimax adaptivity in diverse statistical models.
