Rank Conditional Coverage (RCC)

Updated 30 December 2025
  • Rank Conditional Coverage (RCC) is defined as the probability that the true parameter lies within a constructed confidence interval given its empirical rank.
  • RCC addresses the under-coverage of conventional marginal intervals by building rank-specific intervals that maintain nominal coverage for extreme estimates.
  • Bootstrap methods, both parametric and non-parametric, are used to estimate bias distributions, making RCC robust for high-dimensional inference and predictive applications.

Rank Conditional Coverage (RCC) is a statistical framework for evaluating and constructing confidence intervals in the context of large-scale inference, with a particular focus on the coverage properties conditional on the empirical ranking of parameter estimates or conformity scores. RCC provides an explicit answer to the well-documented failure of marginal confidence intervals to maintain nominal coverage rates at the ranks of most scientific interest—that is, for the most extreme or “significant” estimates. Recent developments extend RCC to predictive set construction via rectified conformal prediction. The RCC concept addresses high-dimensional problems in which multiple parameters, tests, or predictions must be jointly analyzed, and selection or reporting bias poses a major challenge to reliable inference (Morrison et al., 2017, Plassier et al., 22 Feb 2025).

1. Formal Definition of Rank Conditional Coverage

Let $\theta_1, \dots, \theta_p$ be parameters of interest, with point estimates $\hat{\theta}_1, \dots, \hat{\theta}_p$. Estimates are ranked by significance (e.g., by absolute $t$-statistic), with $s(i)$ denoting the index of the $i$-th most significant parameter so that $|\hat{\theta}_{s(1)}| \ge |\hat{\theta}_{s(2)}| \ge \cdots \ge |\hat{\theta}_{s(p)}|$. The Rank Conditional Coverage at rank $i$ is

$$\mathrm{RCC}(i) \equiv \mathbb{P}\big[\theta_{s(i)} \in \mathrm{CI}_{s(i)}\big]$$

where $\mathrm{CI}_{s(i)}$ is the confidence set or interval for the $i$-th ranked estimate. RCC can be equivalently expressed as

$$\mathrm{RCC}(i) = \sum_{j=1}^p \mathbb{P}\big[\theta_j \in \mathrm{CI}_j \mid s(i)=j\big]\,\mathbb{P}[s(i)=j].$$

$\mathrm{RCC}(i)$ thus gives the expected coverage rate specifically at rank $i$ over repeated sampling (Morrison et al., 2017).

2. Motivations: Marginal Coverage Failure and the Superiority of RCC

Conventional $1-\alpha$ marginal confidence intervals are designed so that

$$\frac{1}{p}\sum_{j=1}^p \mathbb{P}(\theta_j \in \mathrm{CI}_j) = 1-\alpha$$

but this marginal guarantee masks a pronounced under-coverage for the most extreme (top-ranked) estimates and over-coverage for typical or median ones. When scientific or reporting interest is focused on the top $k$ estimates (e.g., in biomarker discovery or variable selection), the realized coverage among those selected ranks may fall well below $1-\alpha$.
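The gap between marginal and rank-conditional coverage can be seen in a small Monte Carlo experiment. The sketch below is illustrative only: it assumes independent estimates with unit variance and a hypothetical setting in which all 50 true effects are zero, and the function name `marginal_rcc` is not taken from any package.

```python
import random
from statistics import NormalDist

def marginal_rcc(theta, alpha=0.10, n_sim=2000, seed=0):
    """Monte Carlo estimate of RCC(i) for standard marginal z-intervals.

    Assumes independent estimates theta_hat_j ~ N(theta_j, 1).
    Entry i of the result estimates the coverage at rank i + 1.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # half-width of the marginal interval
    p = len(theta)
    hits = [0] * p
    for _ in range(n_sim):
        est = [t + rng.gauss(0.0, 1.0) for t in theta]
        # s[i] is the index of the estimate with the (i + 1)-th largest |value|
        s = sorted(range(p), key=lambda j: -abs(est[j]))
        for i, j in enumerate(s):
            if abs(est[j] - theta[j]) <= z:  # interval est[j] +/- z covers theta[j]
                hits[i] += 1
    return [h / n_sim for h in hits]

theta = [0.0] * 50  # hypothetical setting: all true effects are zero
rcc = marginal_rcc(theta)
average_coverage = sum(rcc) / len(rcc)
```

In this setting `average_coverage` stays close to the nominal 0.90, while `rcc[0]` (the top rank) falls far below it and the middle ranks are covered almost always, which is the pattern described above.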

Selection-adjusted procedures (e.g., False Coverage-Statement Rate (FCR) control) address average coverage among selected parameters, typically by inflating all intervals, but still produce substantial under-coverage among the most extreme ranks. By shifting the criterion to $\mathrm{RCC}(i)$ for all $i$, procedures can guarantee, rank-by-rank, that the observed coverage matches the nominal level (Morrison et al., 2017).

3. Construction of RCC-Controlled Intervals

The central methodological innovation behind RCC is to build intervals that achieve asymptotic $1-\alpha$ coverage at each rank. This is operationalized via the estimation of the rank-specific bias distribution:

$$\delta_{[i]} \equiv \hat{\theta}_{s(i)} - \theta_{s(i)}$$

Denoting its cumulative distribution by $H_{[i]}$, the (oracle) RCC-exact interval is

$$\mathrm{CI}_{s(i)}^{\mathrm{exact}} = \big[\hat{\theta}_{s(i)} - H_{[i]}^{-1}(1-\alpha/2),\ \hat{\theta}_{s(i)} - H_{[i]}^{-1}(\alpha/2)\big]$$

so that $\mathbb{P}(\theta_{s(i)} \in \mathrm{CI}_{s(i)}^{\mathrm{exact}}) = 1-\alpha$ for every $i$ in finite samples.
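The exact-coverage claim follows by rewriting the event in terms of the bias. The short derivation below is a sketch that assumes the bias distribution at rank $i$ is continuous, so that its quantile function inverts the CDF exactly:

```latex
\begin{aligned}
\mathbb{P}\big(\theta_{s(i)} \in \mathrm{CI}_{s(i)}^{\mathrm{exact}}\big)
&= \mathbb{P}\big(\hat{\theta}_{s(i)} - H_{[i]}^{-1}(1-\alpha/2) \le \theta_{s(i)}
   \le \hat{\theta}_{s(i)} - H_{[i]}^{-1}(\alpha/2)\big) \\
&= \mathbb{P}\big(H_{[i]}^{-1}(\alpha/2) \le \hat{\theta}_{s(i)} - \theta_{s(i)}
   \le H_{[i]}^{-1}(1-\alpha/2)\big) \\
&= \mathbb{P}\big(H_{[i]}^{-1}(\alpha/2) \le \delta_{[i]} \le H_{[i]}^{-1}(1-\alpha/2)\big) \\
&= (1-\alpha/2) - \alpha/2 = 1-\alpha .
\end{aligned}
```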

Since $H_{[i]}$ is unknown, it is estimated by bootstrap (parametric or non-parametric):

  • Parametric bootstrap: If $\hat{\theta} \sim N_p(\theta, \Sigma)$ or the $\hat{\theta}_j \sim N(\theta_j, \sigma_j^2)$ independently, simulate $\theta^{(k)}$, rerank, and compute $\delta_{[i]}^{(k)}$. Use empirical quantiles for interval endpoints.
  • Non-parametric bootstrap: Generate bootstrap datasets, re-estimate $\hat{\theta}^{(k)}$, rerank, and compute $\delta_{[i]}^{(k)}$.

This bootstrap approach yields intervals of the form

$$\mathrm{CI}_{s(i)}^{\text{boot}} = \big[\hat{\theta}_{s(i)} - \hat{H}_{[i]}^{-1}(1-\alpha/2),\ \hat{\theta}_{s(i)} - \hat{H}_{[i]}^{-1}(\alpha/2)\big]$$

which, under standard regularity (consistent bootstrap law), asymptotically achieve $\mathrm{RCC}(i) = 1-\alpha$ simultaneously for all $i$ (Morrison et al., 2017).
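The parametric variant can be sketched in a few lines of Python. This is a minimal illustration and not the implementation from Morrison et al. It makes three assumptions: estimates are independent normals with known standard errors, replicates are drawn around the observed estimates, and ranking uses the absolute standardized estimate. All function names and input values are hypothetical.

```python
import random

def quantile(sorted_vals, q):
    """Empirical quantile by linear interpolation on pre-sorted values."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

def rcc_parametric_bootstrap(est, se, alpha=0.10, n_boot=1000, seed=0):
    """Return {index j: (lower, upper)} using rank-specific bootstrap biases."""
    rng = random.Random(seed)
    p = len(est)

    def rank_order(values):
        # indices sorted from most to least extreme standardized value
        return sorted(range(p), key=lambda j: -abs(values[j] / se[j]))

    deltas = [[] for _ in range(p)]  # deltas[i]: bootstrap biases at rank i + 1
    for _ in range(n_boot):
        boot = [est[j] + rng.gauss(0.0, se[j]) for j in range(p)]
        for i, j in enumerate(rank_order(boot)):
            deltas[i].append(boot[j] - est[j])  # replicate minus its "truth"

    intervals = {}
    for i, j in enumerate(rank_order(est)):
        d = sorted(deltas[i])
        intervals[j] = (est[j] - quantile(d, 1 - alpha / 2),
                        est[j] - quantile(d, alpha / 2))
    return intervals

est = [2.5 - 0.01 * j for j in range(20)]  # hypothetical estimates
se = [1.0] * 20                            # hypothetical standard errors
ci = rcc_parametric_bootstrap(est, se)
```

With these closely spaced estimates, the interval for the top-ranked estimate, `ci[0]`, lies entirely below the estimate itself. This reflects the upward bias that selecting the largest of many noisy estimates induces.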

4. Theoretical Properties and Implications

Oracle RCC intervals satisfy $\mathrm{RCC}(i)=1-\alpha$ in finite samples. Bootstrap-based intervals achieve this property asymptotically and uniformly over $i$, i.e.,

$$\sup_{1 \leq i \leq p} \big|\mathbb{P}(\theta_{s(i)} \in \mathrm{CI}_{s(i)}^{\text{boot}}) - (1-\alpha)\big| \to 0$$

as the number of bootstrap samples $K \to \infty$ and the sample size $n \to \infty$. An important corollary is that any procedure reporting the top $r$ estimates will have overall FCR $\leq \alpha$ provided $\mathrm{RCC}(i) \geq 1-\alpha$ for all $i$, making RCC control a pointwise strengthening of FCR control.
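The link to FCR can be made explicit. The following is a sketch for the simple case where a fixed number $r \ge 1$ of top-ranked estimates is always reported, so that the FCR is the expected fraction of reported intervals that miss their parameter. If $\mathrm{RCC}(i) \geq 1-\alpha$ at each reported rank, then:

```latex
\mathrm{FCR}
= \mathbb{E}\!\left[\frac{1}{r}\sum_{i=1}^{r}
    \mathbf{1}\big\{\theta_{s(i)} \notin \mathrm{CI}_{s(i)}\big\}\right]
= \frac{1}{r}\sum_{i=1}^{r}\big(1-\mathrm{RCC}(i)\big)
\;\le\; \frac{1}{r}\sum_{i=1}^{r}\alpha
= \alpha .
```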

Simulation studies further demonstrate that RCC intervals uniformly maintain target coverage at all ranks and outperform both marginal and FCR-adjusted intervals, especially at extremes where miscoverage is most severe (Morrison et al., 2017).

5. RCC in Predictive Inference and Rectified Conformal Prediction

In predictive inference, the concept of RCC has been adopted to address analogous failures of conditional coverage in conformal prediction frameworks. Classical split conformal prediction guarantees marginal coverage $\mathbb P\{Y_{n+1} \in \mathcal C_\alpha(X_{n+1})\} \geq 1-\alpha$, but may provide sub-nominal coverage over subsets defined by the rank of conformity scores.

Recent work introduces an explicit score-rectification mechanism: estimate the conditional $(1-\alpha)$-quantile of conformity scores, $\widehat\tau(x)$, via regression, and transform raw scores $V(x,y)$ as $\tilde V(x,y) = f_{\widehat\tau(x)}^{-1}(V(x,y))$ for a monotonic family $f_t$. Applying ordinary split conformal prediction to these rectified scores ensures coverage that is nearly uniform both marginally and over strata defined by the empirical rank of test conformity scores, i.e., RCC (Plassier et al., 22 Feb 2025).

Theoretical bounds confirm that the resulting coverage conditional on covariates, and thus conditional on rank strata, approaches $1-\alpha$ provided the quantile regression is accurate. Empirical studies in multi-output prediction highlight that RCC-conformal methods reduce the maximal conditional coverage error compared to non-RCC approaches (Plassier et al., 22 Feb 2025).

6. Software Implementations and Illustrative Examples

The R package “rcc” implements both parametric and non-parametric bootstrap methods for RCC interval construction. The package provides utilities for ranking by signed or absolute test statistics and outputs rank-ordered intervals. Basic usage includes:

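```r
# theta.hat, se.hat: vectors of point estimates and their standard errors
ci_par <- par_bs_ci(est = theta.hat, se = se.hat, level = 0.90, nboot = 1000)
# data: raw data set; estFUN: function that recomputes the estimates from a data set
ci_np  <- nonpar_bs_ci(data = data, estFUN = estFUN, level = 0.90, nboot = 500)
```
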
Comparative simulation studies, including independent normals, block-correlated regression models, and treatment-effect estimation across biomarker cutpoints, demonstrate that RCC intervals track the oracle performance and maintain near-nominal coverage across all ranks, even in the presence of strong correlation and selection (Morrison et al., 2017).

7. Practical Considerations, Extensions, and Current Research

A practical guideline for achieving RCC in predictive inference is to choose a meaningful conformity score $V(x,y)$, split calibration data for quantile regression, estimate the conditional quantile $\widehat\tau(x)$, apply the rectification transformation (additive or multiplicative), and run split conformal prediction on rectified scores. This recipe is model- and score-agnostic, and applies equally to multi-output and structured prediction tasks, provided a scalar conformity score can be evaluated (Plassier et al., 22 Feb 2025).
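The steps of this recipe can be sketched as follows. This is a rough illustration under stated assumptions, not the method of Plassier et al. It uses the absolute-residual score $V(x,y) = |y - \mu(x)|$ with a hypothetical predictor `mu`, a multiplicative rectification $f_t(v) = t\,v$, and a crude per-bin empirical quantile on $x \in [0,1)$ as a stand-in for a proper quantile regression. All names and the synthetic data are made up for the example.

```python
import math
import random

def binned_quantile_fn(xs, scores, alpha, n_bins=5):
    """Stand-in for quantile regression: per-bin (1 - alpha)-quantile, x in [0, 1).

    Assumes every bin receives at least one point.
    """
    bins = [[] for _ in range(n_bins)]
    for x, v in zip(xs, scores):
        bins[min(int(x * n_bins), n_bins - 1)].append(v)
    taus = []
    for b in bins:
        b.sort()
        taus.append(b[min(int((1 - alpha) * len(b)), len(b) - 1)])
    return lambda x: taus[min(int(x * n_bins), n_bins - 1)]

def rectified_split_conformal(xs, ys, mu, alpha=0.10, seed=0):
    """Return a function x -> (lower, upper) built from rectified scores."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    half = len(idx) // 2
    fit, cal = idx[:half], idx[half:]  # one half for tau_hat, one for calibration

    def score(x, y):
        return abs(y - mu(x))

    tau = binned_quantile_fn([xs[i] for i in fit],
                             [score(xs[i], ys[i]) for i in fit], alpha)
    # multiplicative rectification: divide each score by tau_hat(x)
    rect = sorted(score(xs[i], ys[i]) / tau(xs[i]) for i in cal)
    k = math.ceil((len(rect) + 1) * (1 - alpha))  # split-conformal rank
    q = rect[k - 1] if k <= len(rect) else math.inf
    # {y : |y - mu(x)| / tau(x) <= q} is an interval around mu(x)
    return lambda x: (mu(x) - q * tau(x), mu(x) + q * tau(x))

rng = random.Random(1)
xs = [rng.random() for _ in range(2000)]
ys = [(0.2 + 2.0 * x) * rng.gauss(0.0, 1.0) for x in xs]  # noise grows with x
predict = rectified_split_conformal(xs, ys, mu=lambda x: 0.0)
lo_small, hi_small = predict(0.05)
lo_large, hi_large = predict(0.95)
```

Because the rectified score is scaled by the local quantile estimate, the interval at `x = 0.95` comes out wider than the one at `x = 0.05`, tracking the growing noise level. An unrectified absolute-residual score would give the same width everywhere.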

RCC has been shown to imply stronger guarantees than FCR for selection procedures, aligns coverage properties with scientific usage (publication of top-ranked findings), and is extensible to nonparametric, correlated, and structured inference settings. Ongoing investigations include the statistical and computational tradeoffs associated with complex quantile regression estimators for rectification, and the precise characterization of RCC in dependent or high-dimensional settings.
