Irreproducible Discovery Rate (IDR)

Updated 7 June 2026
  • IDR is a statistical framework that quantifies reproducibility in replicate high-throughput experiments using a copula-mixture model and local thresholding.
  • It adapts false discovery rate principles to assess signal consistency across replicates, enabling rigorous threshold selection for reliable discoveries.
  • Applications in ChIP-seq and genomic studies demonstrate IDR’s robust calibration and improved power over conventional p-value and single-replicate methods.

The Irreproducible Discovery Rate (IDR) quantifies the reproducibility of findings arising in replicate high-throughput experiments, such as ChIP-seq peak calling or genomic association studies. Conceived as an analog to the false discovery rate (FDR), IDR adapts multiple testing principles to the problem of identifying signals that are consistent across experimental repeats. Reproducibility is measured through a copula-mixture model fit to the ranked signal lists produced by the replicates. The IDR framework supports principled threshold selection, graphical analysis of replicate agreement, and empirical reproducibility assessment while accommodating arbitrary marginal scoring scales for each replicate (Li et al., 2011, Wei et al., 2013).

1. Foundation and Formal Analogy to FDR

IDR is motivated by the deficiencies of scalar agreement metrics (e.g., correlation, overlap counts) for quantifying replicate consistency in high-throughput data. Drawing inspiration from FDR frameworks, which model test statistics as a mixture of null and alternative distributions to estimate the local false discovery rate,

$$\mathsf{lfdr}(z) = \Pr(\text{null} \mid Z=z) = \frac{\pi_0 f_0(z)}{f(z)}$$

the IDR approach reinterprets each signal $i$ as arising from an irreproducible ($K_i=0$) or reproducible ($K_i=1$) process. Let $X_{i1}, X_{i2}$ denote the scores/peak heights for signal $i$ in two replicates, and $(U_i, V_i)$ be their probability-integral transforms under the respective marginal empirical distributions, so that $U, V \sim \text{Unif}[0,1]$ marginally.

IDR is then defined at the level of rejection regions $R \subset [0,1]^2$ as

$$\mathsf{IDR}(R) = \Pr(K_i=0 \mid (U_i, V_i) \in R) = \frac{\pi_0 \int_R f_0(u,v)\,du\,dv}{\int_R [\pi_0 f_0(u,v) + \pi_1 f_1(u,v)]\,du\,dv}$$

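with local IDR for a particular observed pair $(u_i, v_i)$ given by

$$\mathsf{idr}(u_i, v_i) = \Pr(K_i=0 \mid (U_i, V_i) = (u_i, v_i)) = \frac{\pi_0 f_0(u_i, v_i)}{\pi_0 f_0(u_i, v_i) + \pi_1 f_1(u_i, v_i)}$$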

These expressions are directly analogous to the local FDR and overall FDR in multiple testing scenarios, replacing the notion of a null with “irreproducible” signals and of an alternative with “reproducible” ones (Li et al., 2011).
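
The link between the two quantities can be made explicit. Writing $f = \pi_0 f_0 + \pi_1 f_1$ for the mixture density and $\mathsf{idr}(u,v) = \pi_0 f_0(u,v) / f(u,v)$ for the local IDR, a short derivation (a sketch that assumes only $\int_R f > 0$) shows that the region-level IDR is the average local IDR over the region:

```latex
\begin{aligned}
\mathsf{IDR}(R)
  &= \frac{\pi_0 \int_R f_0(u,v)\,du\,dv}{\int_R f(u,v)\,du\,dv} \\
  &= \frac{\int_R \mathsf{idr}(u,v)\, f(u,v)\,du\,dv}{\int_R f(u,v)\,du\,dv} \\
  &= \mathbb{E}\left[\,\mathsf{idr}(U_i, V_i) \mid (U_i, V_i) \in R\,\right].
\end{aligned}
```

This is why averaging local IDR values over a selected set of signals gives an estimate of the IDR of that set.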

2. Copula Mixture Model of Replicate Rankings

To capture the dependence structure between replicates, the distribution of $(U_i, V_i)$ is modeled as a two-component copula mixture: $f(u,v) = \pi_0 f_0(u,v) + \pi_1 f_1(u,v)$, where

  • $f_0$ is typically the independence copula ($f_0(u,v) = 1$), representing noise or signals that do not agree,
  • $f_1$ is a bivariate (Gaussian) copula density with positive correlation $\rho_1$, representing reproducible signals.

Latent variables $(Z_{i1}, Z_{i2})$ are modeled as jointly normal, with parameters $(\mu_k, \sigma_k^2, \rho_k)$ for $K_i = k$, $k \in \{0, 1\}$. The marginal transformation $U_i = G(Z_{i1})$, $V_i = G(Z_{i2})$, where $G$ is the marginal CDF of the latent mixture, maps these to unit interval margins, combining the reproducible and irreproducible components in the mixture proportions $\pi_1$ and $\pi_0 = 1 - \pi_1$, with labels generated as $K_i \sim \text{Bernoulli}(\pi_1)$. This construction enables flexible modeling of signal consistency even under unknown or noncomparable replicate scoring scales (Li et al., 2011, Wei et al., 2013).
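
To make the mixture concrete, the following minimal Python sketch evaluates the local IDR of a latent pair under a two-component Gaussian mixture. It assumes the commonly used parameterization in which the irreproducible component is standard normal and independent (mean 0, variance 1, correlation 0) and the reproducible component has a common mean `mu1`, common standard deviation `sigma1`, and correlation `rho1`. The function names `bvn_density` and `local_idr` are illustrative, not part of any published package. Because both coordinates pass through the same monotone marginal transformation, the Jacobian cancels in the ratio, so the posterior computed on the latent scale equals the one on the copula scale.

```python
import math

def bvn_density(z1, z2, mu, sigma, rho):
    """Bivariate normal density with equal means, equal sds, and correlation rho."""
    a = (z1 - mu) / sigma
    b = (z2 - mu) / sigma
    q = (a * a - 2.0 * rho * a * b + b * b) / (1.0 - rho * rho)
    norm = 2.0 * math.pi * sigma * sigma * math.sqrt(1.0 - rho * rho)
    return math.exp(-0.5 * q) / norm

def local_idr(z1, z2, pi1, mu1, sigma1, rho1):
    """Posterior probability that a latent pair is irreproducible (K_i = 0)."""
    # Assumed irreproducible component: standard normal, independent coordinates.
    f0 = bvn_density(z1, z2, 0.0, 1.0, 0.0)
    # Reproducible component: shifted and positively correlated.
    f1 = bvn_density(z1, z2, mu1, sigma1, rho1)
    num = (1.0 - pi1) * f0
    return num / (num + pi1 * f1)

# Illustrative parameter values (hypothetical, chosen only for the example).
params = dict(pi1=0.5, mu1=2.0, sigma1=1.0, rho1=0.8)
near_origin = local_idr(0.0, 0.0, **params)   # weak in both replicates
concordant = local_idr(2.0, 2.0, **params)    # strong and consistent
discordant = local_idr(3.0, -1.0, **params)   # strong in one replicate only
```

Under these illustrative parameters, a pair that is strong in both replicates receives a small local IDR, while a pair that is strong in only one replicate is assigned almost entirely to the irreproducible component.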

3. Parameter Estimation and Thresholding

Model fitting proceeds via an EM-type algorithm, integrating empirical marginal transformation into the iterative procedure:

  • Pseudo-data: At each EM round, replicate values $(X_{i1}, X_{i2})$ are ranked to uniform scores, then mapped to latent normal space via the inverse of the mixture marginal CDF, $G^{-1}$, using the current parameter estimates.
  • E-step: Posterior mixture responsibilities $\Pr(K_i = 1 \mid Z_{i1}, Z_{i2})$ for the latent pairs are updated based on the Gaussian copula densities.
  • M-step: Parameters $(\pi_1, \mu_1, \sigma_1^2, \rho_1)$ are updated via weighted empirical means and variances.
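
The three steps above can be sketched in plain Python. This is a simplified illustration, not the reference implementation: it assumes the irreproducible component is a standard independent normal, refreshes the pseudo-data once per round, inverts the mixture marginal by bisection, and runs a fixed number of rounds instead of testing convergence. The names `fit_idr`, `e_step`, `mixture_quantile`, and the starting values in `init` are hypothetical.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def mixture_quantile(u, pi1, mu1, sigma1, steps=50):
    """Invert G(z) = (1 - pi1) * Phi(z) + pi1 * Phi((z - mu1) / sigma1) by bisection."""
    lo = min(-10.0, mu1 - 10.0 * sigma1)
    hi = max(10.0, mu1 + 10.0 * sigma1)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        g = (1.0 - pi1) * norm_cdf(mid) + pi1 * norm_cdf((mid - mu1) / sigma1)
        lo, hi = (mid, hi) if g < u else (lo, mid)
    return 0.5 * (lo + hi)

def bvn_pdf(a, b, mu, sigma, rho):
    """Bivariate normal density with equal means, equal sds, and correlation rho."""
    s, t = (a - mu) / sigma, (b - mu) / sigma
    q = (s * s - 2.0 * rho * s * t + t * t) / (1.0 - rho * rho)
    return math.exp(-0.5 * q) / (2.0 * math.pi * sigma ** 2 * math.sqrt(1.0 - rho * rho))

def to_uniform(x):
    """Rank-transform scores to (0, 1) as rank / (n + 1); ties broken by input order."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    u = [0.0] * len(x)
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (len(x) + 1.0)
    return u

def e_step(u1, u2, pi1, mu1, sigma1, rho1):
    """Refresh pseudo-data, then return (z1, z2, w) with w_i = Pr(K_i = 1 | z_i)."""
    z1 = [mixture_quantile(u, pi1, mu1, sigma1) for u in u1]
    z2 = [mixture_quantile(u, pi1, mu1, sigma1) for u in u2]
    w = []
    for a, b in zip(z1, z2):
        f0 = (1.0 - pi1) * bvn_pdf(a, b, 0.0, 1.0, 0.0)  # assumed null component
        f1 = pi1 * bvn_pdf(a, b, mu1, sigma1, rho1)
        w.append(f1 / (f0 + f1) if f0 + f1 > 0.0 else 0.5)
    return z1, z2, w

def fit_idr(x1, x2, n_iter=20, init=(0.5, 2.0, 1.0, 0.5)):
    """Return ((pi1, mu1, sigma1, rho1), local_idr) after n_iter EM rounds."""
    pi1, mu1, sigma1, rho1 = init
    u1, u2 = to_uniform(x1), to_uniform(x2)
    n = len(x1)
    for _ in range(n_iter):
        z1, z2, w = e_step(u1, u2, pi1, mu1, sigma1, rho1)
        sw = max(sum(w), 1e-12)
        # M-step: weighted moments of the reproducible component, with
        # clamps that keep the parameters inside their valid ranges.
        pi1 = min(max(sw / n, 1e-3), 1.0 - 1e-3)
        mu1 = sum(wi * (a + b) for wi, a, b in zip(w, z1, z2)) / (2.0 * sw)
        var = sum(wi * ((a - mu1) ** 2 + (b - mu1) ** 2)
                  for wi, a, b in zip(w, z1, z2)) / (2.0 * sw)
        var = max(var, 1e-6)
        cov = sum(wi * (a - mu1) * (b - mu1) for wi, a, b in zip(w, z1, z2)) / sw
        sigma1 = math.sqrt(var)
        rho1 = min(max(cov / var, -0.999), 0.999)
    # Final responsibilities under the fitted parameters; local idr = 1 - w.
    _, _, w = e_step(u1, u2, pi1, mu1, sigma1, rho1)
    return (pi1, mu1, sigma1, rho1), [1.0 - wi for wi in w]
```

Calling `fit_idr(x1, x2)` on two equal-length score lists returns the fitted parameters and one local IDR value per signal. In practice a convergence check on the parameters or the likelihood, and several starting values, would be preferable to a fixed iteration count.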

The process alternates E and M steps, updating pseudo-data at each iteration, and typically converges reliably. Once fit, local IDR values $\mathsf{idr}(u_i, v_i)$ are computed. Signals are then ranked by increasing local IDR, and for each $l = 1, \dots, n$,

$$\mathsf{IDR}(l) = \frac{1}{l} \sum_{j=1}^{l} \mathsf{idr}_{(j)}$$

where $\mathsf{idr}_{(j)}$ denotes the $j$-th smallest local IDR. Threshold selection proceeds by choosing the largest $l$ so that $\mathsf{IDR}(l) \le \alpha$ for a chosen level $\alpha$, analogous to the Benjamini-Hochberg step-up for FDR control (Li et al., 2011).
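
The selection rule can be illustrated with a few lines of Python. The sketch takes IDR at rank $l$ to be the mean of the $l$ smallest local IDR values and keeps the largest $l$ whose mean stays at or below $\alpha$. The function name `select_by_idr` and the input values are made up for the example.

```python
def select_by_idr(local_idr, alpha):
    """Return indices of selected signals, ordered by increasing local IDR."""
    order = sorted(range(len(local_idr)), key=lambda i: local_idr[i])
    running_sum = 0.0
    n_selected = 0
    for l, i in enumerate(order, start=1):
        running_sum += local_idr[i]
        # IDR(l): average of the l smallest local IDR values.
        if running_sum / l <= alpha:
            n_selected = l
    return order[:n_selected]

# Hypothetical local IDR values for five signals.
values = [0.01, 0.9, 0.02, 0.2, 0.03]
selected_strict = select_by_idr(values, alpha=0.05)
selected_loose = select_by_idr(values, alpha=0.10)
```

Because the values are sorted in increasing order, the running mean never decreases, so the last rank that satisfies the bound is also the largest one. Note that the fourth-ranked signal has local IDR 0.2, above both levels, yet it is admitted at the looser level because the criterion bounds the average over the selected set, not each individual value.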

4. Assessment via Correspondence Curves

To visualize loss of reproducibility as a function of rank, the correspondence curve is defined as

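$$\Psi_n(t) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\left(X_{i1} > x_{(\lceil (1-t)n \rceil),1},\; X_{i2} > x_{(\lceil (1-t)n \rceil),2}\right), \quad 0 \le t \le 1,$$

where $x_{(k),j}$ is the $k$-th order statistic of replicate $j$, so that $\Psi_n(t)$ is the fraction of signals ranked in the top $t$ fraction of both replicates. In the population, $\Psi(t) = \Pr(U_i > 1-t,\, V_i > 1-t)$. For perfect dependence, $\Psi(t) = t$; for independence, $\Psi(t) = t^2$. The derivative $\Psi'(t)$ highlights breakpoints at which replicate agreement abruptly drops, and thus helps to guide appropriate cutoff selection for calling reproducible signals. This graphical assessment complements the formal mixture modeling and thresholding procedures of IDR (Li et al., 2011).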
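
As an illustration, the empirical correspondence curve can be computed directly from ranks. The sketch below takes the curve at $t$ to be the fraction of signals that fall in the top $\lfloor tn \rfloor$ of both replicates. The function names are illustrative, and ties are broken by input order, which is an assumption of this example.

```python
import math

def descending_ranks(x):
    """Rank 1 is the largest score; ties are broken by input order."""
    order = sorted(range(len(x)), key=lambda i: -x[i])
    ranks = [0] * len(x)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return ranks

def correspondence_curve(x1, x2, ts):
    """Fraction of signals in the top floor(t * n) of both lists, for each t in ts."""
    n = len(x1)
    r1, r2 = descending_ranks(x1), descending_ranks(x2)
    curve = []
    for t in ts:
        k = math.floor(t * n + 1e-9)  # small guard against floating-point round-off
        both = sum(1 for a, b in zip(r1, r2) if a <= k and b <= k)
        curve.append(both / n)
    return curve

# Perfect agreement follows the diagonal; reversed ranks share nothing until t = 1.
agree = correspondence_curve([1, 2, 3, 4], [1, 2, 3, 4], [0.5, 1.0])
reverse = correspondence_curve([1, 2, 3, 4], [4, 3, 2, 1], [0.5, 1.0])
```

Finite differences of the returned values over a fine grid of `ts` give a rough estimate of the slope of the curve, which is where a drop in agreement would show up.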

5. Model Extensions: Survival Copula Mixture

The original IDR framework is limited to loci appearing in both replicate lists ("overlap-only"), which can lead to overestimated reproducibility when overlap is small but concordance within the overlap is high. The survival copula mixture model (SCOP) reformulates the two-list comparison as a bivariate survival problem, allowing for the inclusion of censored (i.e., non-overlapping) loci:

  • Loci unique to one list are treated as right-censored observations, with truncation at the cutoff of the other list.
  • For each locus, the observed data are used in a mixture likelihood, combining survival and density contributions according to the censoring pattern.
  • The model fits marginal survival curves via a weighted Kaplan–Meier estimate within each mixture component and updates the copula parameters accordingly via EM.

After convergence, a local survival-IDR can be computed for all loci, including those absent from either list. This approach restores power for IDR estimation and corrects the misleadingly low irreproducibility estimates produced by the overlap-only approach, especially as overlap decreases (Wei et al., 2013).

6. Empirical Performance and Applications

Simulation studies demonstrate the calibration and efficacy of IDR-based thresholding: local IDR thresholds are well calibrated (IDR at […] nominal matches empirical irreproducible-call proportions) except in the presence of complex artifact structure, and IDR-based call sets trade off true against false discoveries more favorably than single-replicate $p$-values or conventional p-value combination methods. In real ChIP-seq applications, IDR reproducibility profiles depend strongly on the quality of the underlying peak callers: high-reproducibility callers yield […], […], and thousands of reproducible calls at IDR […], with IDR-selected peaks exhibiting strong enrichment for high-confidence motif occurrences (Li et al., 2011).

Use of the survival-copula (SCOP) model further rectifies underestimation of irreproducibility in cases where overlap is small. For ENCODE replicates with ~20–40% overlap, overlap-only IDR yielded […] irreproducibility, while SCOP analysis indicated […], aligning better with the empirical presence of irreproducible loci (Wei et al., 2013).

7. Limitations, Theoretical Properties, and Practical Considerations

Theoretical analysis demonstrates that, under correct model specification and i.i.d. sampling, step-up selection by local IDR is asymptotically optimal for maximizing reproducible discovery yield at controlled IDR levels (in the sense of Sun & Cai, 2007). The rank-based estimation grants near-parametric efficiency under continuous marginals.

Limitations include:

  • Replicate independence: systemic biases can generate artifactual reproducibility.
  • Signal dependence: most models assume i.i.d. signals; spatial or other correlations among signals can reduce accuracy but empirical performance is robust unless dependence is strong.
  • Model misspecification: two-component mixtures may misclassify signals if genuine reproducibility spans more than two strata; models with more than two components can be fitted but complicate interpretation.
  • Small-n scenarios: poor mixture separation or small sample sizes can impair convergence or result in misassigned components; careful assessment of model fit and estimated parameters is essential.

Empirical application supports IDR as a principled, scale-free, statistically grounded method for quantifying and controlling reproducibility in high-throughput experiments (Li et al., 2011, Wei et al., 2013).
