Risk-Controlling Prediction Sets

Updated 22 June 2026

Risk-Controlling Prediction Sets (RCPS) are set-valued predictive rules that control the probability of exceeding a risk threshold using user-defined, bounded, and monotonic loss functions.
They are calibrated on exchangeable samples by selecting the smallest parameter of a nested family of prediction sets for which finite-sample risk control holds.
RCPS extend traditional conformal prediction to manage broader risk measures, making them applicable for cost-sensitive classification, regression, structured prediction, and robust decision support.

A Risk-Controlling Prediction Set (RCPS) is a set-valued predictive rule that is calibrated so that the probability or risk of undesirable outcomes, measured via a loss function, is controlled at or below a user-specified level. The RCPS framework generalizes conformal prediction, which was originally designed to bound marginal miscoverage (i.e., $P(Y \notin C(X)) \leq \alpha$ ), so that a broad class of risk measures, not just miscoverage, is kept below a target level, typically with explicit finite-sample guarantees. RCPS can be instantiated for arbitrary (bounded, monotonic) loss functions and enable uncertainty quantification and error control for a wide variety of downstream tasks, including cost-sensitive classification, multivariate regression, structured prediction, and uncertainty-aware decision support.

1. Formalization and Definitions

Let $(X_i,Y_i)_{i=1}^{n+1} \sim P_{XY}$ denote an exchangeable sample from an unknown distribution, where the first $n$ examples are used for calibration, and $(X_{n+1},Y_{n+1})$ is held out for testing. The RCPS construction centers on a family of set-valued predictors $C_\lambda: \mathcal{X} \to 2^{\mathcal{Y}}$ , parameterized by a real-valued $\lambda \in \Lambda$ and assumed to be nested: $\lambda_1 < \lambda_2 \implies C_{\lambda_1}(x) \subseteq C_{\lambda_2}(x)$ for all $x$ .

A loss function $L: \mathcal{Y} \times 2^{\mathcal{Y}} \to [0,B]$ is used to quantify errors, with the monotonicity property $C_1 \subseteq C_2 \implies L(y, C_2) \leq L(y, C_1)$ , ensuring larger sets do not increase loss. The key risk-control property targeted by RCPS is:

$P\big(L(Y_{n+1}, C_{\hat\lambda}(X_{n+1})) > \alpha\big) \leq \delta$

where $\hat\lambda$ is the calibrated parameter, $\alpha$ is the loss tolerance, and $\delta$ is the maximum allowed probability of exceeding it (Wang et al., 2023).
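The nesting and monotonicity assumptions above can be illustrated with a minimal sketch, assuming a classifier that outputs class probabilities. The helper names `prediction_set` and `cost_loss`, the score convention (one minus the class probability), and all numeric values are illustrative assumptions and are not taken from the cited works.

```python
from typing import Dict, FrozenSet


def prediction_set(probs: Dict[str, float], lam: float) -> FrozenSet[str]:
    """C_lambda(x): labels whose nonconformity score 1 - p(y|x) is at most lam.

    Raising lam can only add labels, so the family is nested in lam.
    """
    return frozenset(y for y, p in probs.items() if 1.0 - p <= lam)


def cost_loss(y_true: str, pred_set: FrozenSet[str], costs: Dict[str, float]) -> float:
    """Bounded, monotone loss: pay costs[y_true] if the true label is missed.

    Enlarging pred_set can only turn a miss into a hit, so the loss never increases.
    """
    return 0.0 if y_true in pred_set else costs[y_true]


# Hypothetical class probabilities and class-dependent costs (upper bound B = 1.0).
probs = {"benign": 0.7, "malignant": 0.25, "unknown": 0.05}
costs = {"benign": 0.2, "malignant": 1.0, "unknown": 0.5}

small = prediction_set(probs, 0.4)  # contains only "benign"
large = prediction_set(probs, 0.8)  # additionally contains "malignant"
```

Here `small` is a subset of `large`, and the loss for a true label of `"malignant"` drops from its class cost to zero once the set grows enough to include it.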

2. RCPS Calibration Algorithms

The canonical RCPS algorithm, and its variants for related settings, follow a common calibration logic: identify the smallest $\lambda$ such that, on the calibration set, the empirical quantile or upper confidence bound (UCB) for risk does not exceed the prescribed level.

For the finite-sample, $(\alpha, \delta)$-loss-control regime (Wang et al., 2023):

For each candidate $\lambda \in \Lambda$, compute losses $L_i(\lambda) = L(Y_i, C_\lambda(X_i))$, $i = 1, \dots, n$.
Let $Q_{1-\delta}(\lambda)$ denote the $(1-\delta)$ empirical quantile of $\{L_i(\lambda)\}_{i=1}^{n}$.
Select $\hat\lambda = \inf\{\lambda \in \Lambda : Q_{1-\delta}(\lambda) \leq \alpha\}$.
At prediction time, output $C_{\hat\lambda}(X_{n+1})$.
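The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the reference implementation of the cited work: the function names, the finite grid of candidate parameters, and the conformal-style $(n+1)$ correction in the quantile are all assumptions made here.

```python
import math
from typing import Callable, Optional, Sequence


def empirical_quantile(losses: Sequence[float], delta: float) -> float:
    """Conservative (1 - delta) quantile: the ceil((n + 1)(1 - delta))-th smallest loss.

    The (n + 1) correction is an assumption borrowed from conformal prediction;
    the result is +inf when n is too small for the requested delta.
    """
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    n = len(losses)
    k = math.ceil((n + 1) * (1.0 - delta))
    return math.inf if k > n else sorted(losses)[k - 1]


def calibrate_quantile(
    loss_at: Callable[[float], Sequence[float]],
    lambdas: Sequence[float],
    alpha: float,
    delta: float,
) -> Optional[float]:
    """Return the smallest grid value whose loss quantile is at most alpha."""
    for lam in sorted(lambdas):
        if empirical_quantile(loss_at(lam), delta) <= alpha:
            return lam
    return None  # no grid point meets the target


# Toy calibration data: 19 hypothetical nonconformity scores of the true labels.
scores = [i / 20 for i in range(1, 20)]


def loss_at(lam: float) -> Sequence[float]:
    # Illustrative graded loss in [0, 1]; it is non-increasing in lam.
    return [min(1.0, max(0.0, s - lam)) for s in scores]


lam_hat = calibrate_quantile(loss_at, [j / 10 for j in range(11)], alpha=0.1, delta=0.25)
```

Because the loss is non-increasing in the parameter, scanning the grid from smallest to largest and stopping at the first success returns the smallest admissible value on that grid.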

Key alternative RCPS instantiations include the Hoeffding UCB-based method (Bates et al., 2021, Leterme et al., 2024), anytime-valid (sequential) RCPS using sub-gamma martingale concentration (Hultberg et al., 4 Feb 2026), and selective RCPS for abstaining predictions (Xu et al., 14 Dec 2025).

3. Finite-Sample and High-Probability Guarantees

A central theoretical result is that, under data exchangeability and monotonicity, the output of the RCPS procedure yields finite-sample risk control with explicit probability guarantees.

Finite-Sample Validity (CLCP / nonconformity-quantile RCPS):

$P\big(L(Y_{n+1}, C_{\hat\lambda}(X_{n+1})) > \alpha\big) \leq \delta$

(Wang et al., 2023)

Hoeffding UCB-based RCPS:

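For a loss bounded in $[0,B]$ and a UCB constructed for the expected loss $R(\lambda) = \mathbb{E}[L(Y, C_\lambda(X))]$,

$P\big(R(\hat\lambda) \leq \alpha\big) \geq 1 - \delta$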

(Bates et al., 2021, Leterme et al., 2024)
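A minimal sketch of UCB-based calibration follows, assuming losses in $[0, B]$ and the standard one-sided Hoeffding bound $\hat R(\lambda) + B\sqrt{\log(1/\delta)/(2n)}$. The function names, the grid search, and the toy data are assumptions for illustration; the cited works also discuss tighter bounds that are not shown here.

```python
import math
from typing import Callable, Optional, Sequence


def hoeffding_ucb(losses: Sequence[float], delta: float, bound: float = 1.0) -> float:
    """Empirical mean loss plus the one-sided Hoeffding margin for losses in [0, bound]."""
    n = len(losses)
    return sum(losses) / n + bound * math.sqrt(math.log(1.0 / delta) / (2.0 * n))


def calibrate_ucb(
    loss_at: Callable[[float], Sequence[float]],
    lambdas: Sequence[float],
    alpha: float,
    delta: float,
    bound: float = 1.0,
) -> Optional[float]:
    """Scan from the largest lambda downward and stop at the first UCB >= alpha.

    This keeps the smallest grid value such that every larger grid value also
    has a UCB strictly below alpha.
    """
    chosen = None
    for lam in sorted(lambdas, reverse=True):
        if hoeffding_ucb(loss_at(lam), delta, bound) >= alpha:
            break
        chosen = lam
    return chosen


# Toy calibration data: 200 hypothetical scores with a 0-1 miscoverage loss.
scores = [(i + 0.5) / 200 for i in range(200)]


def loss_at(lam: float) -> Sequence[float]:
    return [1.0 if s > lam else 0.0 for s in scores]


lam_hat = calibrate_ucb(loss_at, [j / 10 for j in range(11)], alpha=0.2, delta=0.1)
```

With only 200 calibration points the Hoeffding margin alone is roughly 0.076, which illustrates why this variant can be conservative for small calibration sets.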

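Anytime-Valid RCPS: For a sequence of calibrated parameters $(\hat\lambda_t)_{t \geq 1}$, with $R(\lambda)$ denoting the expected loss of $C_\lambda$,

$P\big(\forall t \geq 1 : R(\hat\lambda_t) \leq \alpha\big) \geq 1 - \delta$

(Hultberg et al., 4 Feb 2026)

These guarantees hold for arbitrary black-box predictors, provided the set family is nested and the loss function is monotone.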

4. Relation to Other Predictive Error Control Frameworks

RCPS is closely related to conformal prediction and strictly generalizes it. Ordinary conformal prediction is recovered as a special case when $L$ is the $0$-$1$ miscoverage loss and $\alpha = 0$, in which case the guarantee reduces to coverage with probability at least $1 - \delta$. In the literature, conformal risk control (CRC) typically targets control of the expected loss, $\mathbb{E}[L(Y_{n+1}, C_{\hat\lambda}(X_{n+1}))] \leq \alpha$ (mean criterion), whereas RCPS and its CLCP instantiation provide a per-instance or quantile-based guarantee, which is strictly stronger in finite samples for general losses (Wang et al., 2023).
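The difference between the mean criterion and the quantile-based criterion can be seen on a single loss sample. The sketch below uses made-up loss values purely for illustration, and the function names are assumptions; it only shows how each criterion is evaluated empirically.

```python
from typing import Sequence


def mean_criterion(losses: Sequence[float], alpha: float) -> bool:
    """Mean criterion: the average loss is at most alpha."""
    return sum(losses) / len(losses) <= alpha


def exceedance_criterion(losses: Sequence[float], alpha: float, delta: float) -> bool:
    """Quantile-style criterion: at most a delta fraction of losses exceeds alpha."""
    exceed_rate = sum(1 for loss in losses if loss > alpha) / len(losses)
    return exceed_rate <= delta


# Illustrative sample: mostly zero loss, with two large losses.
losses = [0.0] * 8 + [0.9, 0.9]

meets_mean = mean_criterion(losses, alpha=0.2)
meets_exceedance = exceedance_criterion(losses, alpha=0.2, delta=0.1)
```

On this sample the average loss is below the tolerance, yet one in five losses exceeds it, so a rule tuned only for the mean would not satisfy a per-instance requirement with $\delta = 0.1$.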

Other RCPS extensions include:

Control of group- or subgroup-specific risks (e.g., SG-RCPS in dose estimation (Fischer et al., 2024)).
Counterfactual risk/harm in human-in-the-loop decision systems (Straitouri et al., 2024).
Selective prediction with a two-stage framework (SCRC (Xu et al., 14 Dec 2025)).
Tail risk and optimized certainty equivalent (OCE) risk (OCE-RCPS (Huang et al., 14 Feb 2026)).
Decision-theoretic robust optimization (e.g., ROCP (Wang et al., 1 Feb 2026), power system operations (Stratigakos et al., 1 Jun 2026)).

5. Empirical Illustrations and Applications

RCPS has been empirically validated in a wide range of settings:

Classification (class-varying cost): On standard UCI datasets with random class-dependent loss functions, RCPS achieves the prescribed empirical loss-exceedance rates and demonstrates the trade-off between loss tolerance $\alpha$, violation probability $\delta$, and average set size (Wang et al., 2023).
Regression and complex output tasks: Pixel-wise miscoverage for weather forecast postprocessing and image denoising with diffusion models (Teneggi et al., 2023).
Inverse problems: Weak lensing mass mapping, where RCPS intervals are robust but can be conservative with small calibration sets (Leterme et al., 2024).
Human-in-the-loop: Conformal risk control for counterfactual harm reduces the frequency of adverse outcomes in assisted prediction (Straitouri et al., 2024).
Decision support and robust optimization: Decision-calibrated RCPS yields compact and efficient uncertainty sets for robust power and control systems, achieving tight operational reliability targets (Stratigakos et al., 1 Jun 2026, Wang et al., 1 Feb 2026).

6. Practical Usage and Considerations

Implementation of RCPS involves specifying a nested family $\{C_\lambda\}_{\lambda \in \Lambda}$ (often via thresholding a nonconformity score), computing calibration losses, and selecting $\hat\lambda$ to enforce the risk guarantee. Key considerations include:

Choice of the loss function: RCPS can accommodate any bounded, monotone loss, enabling risk control for custom utility/risk profiles.
Selection of error levels $(\alpha, \delta)$: Increasing either relaxes the guarantee, leading to smaller, more informative prediction sets.
Calibration efficiency: RCPS calibration is straightforward and computationally tractable (grid search or quantile computation). However, with small calibration sets, risk control can be conservative; hybrid strategies such as cross-validation RCPS (Cohen et al., 2024) and semi-supervised calibration (Einbinder et al., 2024) help mitigate this.
Open questions include extensions to structured output spaces, sharper theoretical bounds under weakened exchangeability, and development of computationally efficient and adaptive online RCPS variants.
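The effect of the violation probability on the selected parameter, and therefore on set size, can be checked numerically. The sketch below is an illustration under stated assumptions: it uses synthetic uniform scores, a 0-1 miscoverage loss, a conformal-style $(n+1)$ quantile correction, and a hypothetical helper named `calibrate`.

```python
import math
import random
from typing import Optional, Sequence


def calibrate(
    scores: Sequence[float], alpha: float, delta: float, grid: Sequence[float]
) -> Optional[float]:
    """Smallest grid value whose conservative (1 - delta) loss quantile is <= alpha.

    The loss is the 0-1 miscoverage loss 1{score > lam} (an illustrative choice).
    """
    n = len(scores)
    k = math.ceil((n + 1) * (1.0 - delta))
    for lam in sorted(grid):
        losses = sorted(1.0 if s > lam else 0.0 for s in scores)
        quantile = math.inf if k > n else losses[k - 1]
        if quantile <= alpha:
            return lam
    return None


rng = random.Random(0)                       # fixed seed for reproducibility
scores = [rng.random() for _ in range(500)]  # synthetic nonconformity scores
grid = [j / 100 for j in range(101)]

# A larger delta relaxes the guarantee, so the selected threshold cannot grow.
results = {delta: calibrate(scores, 0.0, delta, grid) for delta in (0.05, 0.1, 0.2)}
for delta, lam in results.items():
    print(f"delta={delta}: selected lambda={lam}")
```

Since the prediction sets are nested in the parameter, a smaller selected threshold corresponds directly to smaller sets.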

7. Extensions, Limitations, and Open Directions

RCPS provides a rigorous, distribution-free framework for predictive uncertainty quantification and error control. Notable extensions include:

Subgroup- and decision-calibrated RCPS for fairness and operational reliability in high-stakes applications (Fischer et al., 2024, Stratigakos et al., 1 Jun 2026).
Anytime/sequential RCPS with time-uniform error guarantees (Xu et al., 2024, Hultberg et al., 4 Feb 2026).
Robustification for non-i.i.d. and time-dependent data using blocking or decoupling (mixing processes) (Lee et al., 2024).
Integration with semi-supervised and debiased calibration for data-limited settings (Einbinder et al., 2024, Farzaneh et al., 4 Sep 2025).
Generalization to risk functionals beyond expectation, such as spectral risk or optimized certainty equivalent (Huang et al., 14 Feb 2026, Eom et al., 2 Jun 2026).

Limitations include conservative calibration in the presence of extreme data imbalance or small sample sizes, and limited flexibility for model-driven adaptive design of the set family. Active research is directed at learning or optimizing the set-nesting structure for tighter risk control, flexible localization (e.g., kernel-based threshold functions in L-ARC (Zecchin et al., 2024)), and further generalization to adversarial and online regimes.

In summary, Risk-Controlling Prediction Sets form an essential, theoretically principled approach for rigorous, user-customizable risk management in predictive modeling, beyond traditional coverage guarantees, with strong finite-sample control, broad applicability, and a growing suite of extensions for modern machine learning and decision-making tasks (Wang et al., 2023, Bates et al., 2021, Leterme et al., 2024).