Reject Option Frameworks

Updated 8 June 2026

Reject Option Frameworks are techniques that enable models to abstain from predictions when confidence is low, effectively managing selective risk and controlled coverage.
They employ surrogate losses, SVMs, deep architectures (e.g., SelectiveNet), and conformal prediction, several of which come with statistical guarantees, to balance error against abstention.
These frameworks are vital in applications such as medical diagnostics, autonomous systems, and OOD detection where precise risk–coverage trade-offs are crucial.

A reject option framework in machine learning formalizes the principle that a predictive model may abstain from making a prediction on inputs where the confidence in any output is insufficient relative to the application’s risk or coverage requirements. Reject option strategies have been studied extensively in classical pattern recognition, modern deep learning, structured prediction, regression, crowdsourcing, time-series inference, and out-of-distribution (OOD) detection. Rigorous frameworks specify both the formal objectives—selective risk, controlled coverage, or error-abstention trade-offs—and tractable algorithms with guaranteed or empirically validated performance.

1. Formal Problem Setup: Selective Prediction, Risk, and Optimality

The core reject option paradigm extends standard predictive modeling by equipping the predictor with an abstention (reject) action. Formally, given inputs $X\in\mathcal{X}$ and labels $Y\in\mathcal{Y}$, a model either predicts $h(x)\in\mathcal{Y}$ or abstains (denoted $\mathcal{R}$ or $0$). For classification, three related but distinct models arise (Franc et al., 2021):

Cost-based model:

Minimize

$R_B(h,c) = \mathbb{E}[\ell(y,h(x))c(x) + \rho(1-c(x))]$

where $\ell$ is a standard loss (e.g., 0–1), $c(x)\in[0,1]$ is the probability of accepting $x$, and $\rho$ is the cost of rejection (with $\rho<1$).

Bounded-improvement model:

Maximize coverage under a selective risk constraint

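$\max_{h,c}\ \phi(c) \quad \text{s.t.} \quad R_S(h,c)\le\lambda$

where $\phi(c)=\mathbb{E}[c(x)]$ is the coverage and $R_S(h,c)=\mathbb{E}[\ell(y,h(x))\,c(x)]/\phi(c)$ is the selective risk.

Bounded-coverage model:

Minimize selective risk for a fixed coverage

$\min_{h,c}\ R_S(h,c) \quad \text{s.t.} \quad \phi(c)\ge\omega$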
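The quantities these three models trade off can be estimated from a labeled sample. The sketch below assumes per-example losses and acceptance probabilities are already available as arrays. It takes coverage to be the mean acceptance probability and selective risk to be the accepted loss divided by coverage. The function names are illustrative and do not come from any cited work.

```python
import numpy as np

def coverage(accept):
    """Empirical coverage: the mean acceptance probability c(x)."""
    return float(np.mean(accept))

def selective_risk(losses, accept):
    """Accepted loss divided by coverage; NaN when nothing is accepted."""
    accept = np.asarray(accept, dtype=float)
    cov = accept.mean()
    if cov == 0:
        return float("nan")
    return float(np.mean(np.asarray(losses, dtype=float) * accept) / cov)

def cost_based_risk(losses, accept, rho):
    """Cost-based objective: loss on accepted inputs plus rho per rejection."""
    accept = np.asarray(accept, dtype=float)
    losses = np.asarray(losses, dtype=float)
    return float(np.mean(losses * accept + rho * (1.0 - accept)))

# Toy sample: 0-1 losses and hard accept/reject decisions.
losses = np.array([0, 1, 0, 0, 1, 0])
accept = np.array([1, 0, 1, 1, 1, 0])

cov = coverage(accept)                       # 4 of 6 inputs accepted
sel = selective_risk(losses, accept)         # 1 error among 4 accepted
cost = cost_based_risk(losses, accept, 0.3)  # (1 error + 2 * 0.3) / 6
```

On this toy sample the classifier accepts two thirds of the inputs and errs on one quarter of those it accepts, which is the pair of numbers the bounded-improvement and bounded-coverage models constrain.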

In all cases, under mild conditions, the Bayes-optimal classifier is $h_B(x)\in\arg\min_{\hat y\in\mathcal{Y}}\mathbb{E}[\ell(y,\hat y)\mid x]$, with the acceptance function $c(x)$ reducing to thresholding the conditional risk $r(x)$ (Franc et al., 2021):

$c(x)=\begin{cases}1 & \text{if } r(x)<\gamma,\\ \tau & \text{if } r(x)=\gamma,\\ 0 & \text{if } r(x)>\gamma,\end{cases}$

where $r(x)=\min_{\hat y\in\mathcal{Y}}\mathbb{E}[\ell(y,\hat y)\mid x]$, the threshold $\gamma$ is determined by the coverage or risk constraint, and $\tau\in[0,1]$ ensures equality on the constraint boundary.
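To make the role of the boundary randomization concrete, here is a minimal finite-sample sketch. It assumes estimated conditional risks are given, accepts with probability 1 below a threshold, with probability 0 above it, and with a fractional probability exactly at the threshold, so that the empirical coverage equals the target. The names `gamma`, `tau`, and `omega` are labels chosen for this illustration.

```python
import numpy as np

def acceptance_for_coverage(risk, omega):
    """Return (gamma, tau, c) such that mean(c) equals omega on this sample.

    c is 1 where risk < gamma, tau where risk == gamma, and 0 where
    risk > gamma.
    """
    if not 0.0 < omega <= 1.0:
        raise ValueError("omega must lie in (0, 1]")
    risk = np.asarray(risk, dtype=float)
    n = risk.size
    k = int(np.ceil(omega * n))          # rank of the boundary example
    gamma = np.sort(risk)[k - 1]         # k-th smallest conditional risk
    below = np.sum(risk < gamma)         # always accepted
    ties = np.sum(risk == gamma)         # accepted with probability tau
    tau = (omega * n - below) / ties
    c = np.where(risk < gamma, 1.0, np.where(risk == gamma, tau, 0.0))
    return gamma, tau, c

risk = np.array([0.1, 0.4, 0.4, 0.4, 0.2, 0.9])
gamma, tau, c = acceptance_for_coverage(risk, omega=0.5)
```

Here three examples tie at the threshold, so each is accepted with probability one third; without that fractional acceptance the coverage could only be 2/6 or 5/6, never the requested 1/2.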

For multiclass settings, the abstain loss $\ell_d$, which charges $1$ for a misclassification and $d$ for a rejection, yields an optimal rule that generalizes Chow’s rule (Ramaswamy et al., 2015):

$h^*(x)=\begin{cases}\arg\max_{y}\,p(y\mid x) & \text{if } \max_{y}p(y\mid x)\ge 1-d,\\ \mathcal{R} & \text{otherwise,}\end{cases}$

with the rejection cost $d$ kept small enough (below $1/2$ in the binary case) to ensure a nontrivial reject region (Ramaswamy et al., 2015, Franc et al., 2021). For regression and distributional prediction, analogous rules apply, e.g., thresholding conditional variance or CRPS-based entropy (Denis et al., 2020, Zaoui et al., 31 Mar 2025).
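As an illustration of the multiclass rule, the following sketch applies Chow-style thresholding to a matrix of class posteriors: predict the most probable class when its posterior is at least $1-d$, and abstain otherwise. The posteriors are assumed to be given (in practice they are estimates), and the `REJECT` sentinel is a convention adopted only for this example.

```python
import numpy as np

REJECT = -1  # hypothetical sentinel for the abstain action

def chow_rule(posteriors, d):
    """Predict argmax class if its posterior is >= 1 - d, else abstain.

    Ties at the boundary are accepted here; both actions have equal
    expected cost at that point.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    top = posteriors.argmax(axis=1)
    conf = posteriors.max(axis=1)
    return np.where(conf >= 1.0 - d, top, REJECT)

P = np.array([
    [0.90, 0.05, 0.05],   # confident: accept class 0
    [0.40, 0.35, 0.25],   # ambiguous: abstain
    [0.10, 0.15, 0.75],   # confident: accept class 2
])
decisions = chow_rule(P, d=0.3)
```

Lowering `d` makes rejection cheaper and widens the reject region; with three classes the maximum posterior is never below 1/3, so any `d` of at least 2/3 disables rejection entirely.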

2. Surrogate Losses, Statistical Guarantees, and Learning Strategies

Direct minimization of the non-convex, discontinuous “0–d–1” loss underlying reject option formulations is rarely feasible. Modern frameworks therefore construct convex-calibrated (or non-convex yet Fisher-consistent) surrogates guaranteeing that population minimizers recover the Bayes-optimal reject classifier (Wegkamp et al., 2012, Kalra et al., 2021, Geifman et al., 2019):

In binary settings, double-hinge and double-sigmoid losses yield tight excess risk bounds for the “0–d–1” abstain loss (Kalra et al., 2021, Wegkamp et al., 2012).
For multiclass problems, convex surrogates (Crammer–Singer, one-vs-all hinge, and binary-encoded predictions) are established as consistent for the reject-option risk with rejection cost $d\le\tfrac12$ (Ramaswamy et al., 2015).
Deep learning: end-to-end discriminative architectures (e.g., SelectiveNet) optimize joint objectives over prediction and selection heads, regularized to enforce desired risk–coverage regimes (Geifman et al., 2019).
Regression and distributional tasks admit abstention via thresholding predictable variance or distributional entropy, with plug-in estimators based on semi-supervised calibration (Denis et al., 2020, Zaoui et al., 31 Mar 2025).
Statistical guarantees include oracle inequalities for excess selective risk, calibration of the rejection rate to its target level within a small tolerance (Wegkamp et al., 2012, Zaoui et al., 31 Mar 2025), generalization error rates for deep surrogates, and robustness to label noise (Kalra et al., 2021).
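The joint objective used by deep selective architectures can be sketched without a neural network library. The function below is a simplified NumPy rendering of a SelectiveNet-style loss: an empirical selective risk weighted by soft selection probabilities, plus a quadratic penalty whenever the empirical coverage falls below the target. It omits the auxiliary prediction head of the full architecture, and the default penalty weight `lam` is an illustrative value rather than a recommendation.

```python
import numpy as np

def selective_objective(per_example_loss, select_prob, target_coverage, lam=32.0):
    """Selective risk plus a one-sided quadratic coverage penalty."""
    loss = np.asarray(per_example_loss, dtype=float)
    g = np.asarray(select_prob, dtype=float)     # soft selection in [0, 1]
    emp_cov = g.mean()
    if emp_cov == 0:
        raise ValueError("selection head rejects every example")
    sel_risk = np.mean(loss * g) / emp_cov
    # Penalize only coverage shortfalls, not over-coverage.
    penalty = max(0.0, target_coverage - emp_cov) ** 2
    return float(sel_risk + lam * penalty)

loss = np.array([0.2, 1.5, 0.1, 2.0])
g = np.array([0.9, 0.2, 0.8, 0.1])       # low selection on high-loss examples

met = selective_objective(loss, g, target_coverage=0.4)    # coverage 0.5 suffices
short = selective_objective(loss, g, target_coverage=0.7)  # shortfall is penalized
```

The penalty is what stops the selection head from driving selective risk down by rejecting almost everything: in the example, the same predictions score 0.38 when the coverage target is met and 1.66 when it is missed by 0.2.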

In practice, learning a proper uncertainty score $s(x)$ (e.g., conditional risk, confidence, or misclassification margin) suffices: thresholding $s(x)$ at a calibrated threshold $\theta$ is proven to recover the optimal selective classifier for any consistent estimate of the score (Franc et al., 2021).
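One way to calibrate such a threshold is to scan a held-out validation set. The sketch below assumes that lower scores mean lower uncertainty and returns the threshold with the largest coverage whose validation selective risk stays within a budget. It is a plug-in estimate with no finite-sample guarantee attached, and the function name and return convention are assumptions of this example.

```python
import numpy as np

def threshold_for_risk(scores, losses, max_risk):
    """Largest-coverage threshold t such that accepting score <= t keeps the
    validation selective risk at or below max_risk.

    Returns (threshold, coverage, selective_risk), or None if infeasible.
    """
    scores = np.asarray(scores, dtype=float)
    losses = np.asarray(losses, dtype=float)
    order = np.argsort(scores, kind="stable")
    s, l = scores[order], losses[order]
    n = s.size
    cum_risk = np.cumsum(l) / np.arange(1, n + 1)  # risk of accepting the first k
    # Cut only between distinct scores, so "score <= t" accepts exactly k items.
    valid_cut = np.append(s[1:] > s[:-1], True)
    feasible = np.flatnonzero(valid_cut & (cum_risk <= max_risk))
    if feasible.size == 0:
        return None
    k = feasible[-1]
    return s[k], (k + 1) / n, cum_risk[k]

scores = [0.05, 0.10, 0.20, 0.35, 0.50, 0.80]   # uncertainty scores
losses = [0, 0, 1, 0, 1, 1]                     # 0-1 validation losses
result = threshold_for_risk(scores, losses, max_risk=0.3)
```

Note that the selective risk is not monotone in the threshold: accepting the third example pushes the risk to 1/3, but accepting the fourth brings it back to 1/4, so the scan keeps the later, higher-coverage cut.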

3. Algorithmic Implementations and Practical Frameworks

The reject option is realized through a range of algorithms spanning classical, kernel, deep, and ensemble methods:

Support Vector Machines with Reject Option: In SVMs, the reject region is defined by two parallel hyperplanes $f(x)=\pm\delta$, with abstention for $|f(x)|\le\delta$. Training can be formulated as a linear program via a calibrated piecewise-linear surrogate (Wegkamp et al., 2012).
Data Replication Method: Transforming the problem into extended binary classification with data replicas facilitates simultaneous estimation of parallel reject thresholds (SVM or NN backends) (Sousa et al., 2010).
Margin-Based Multiclass Approaches: Angle-based multivector coding and “bent” losses enable efficient convex optimization (coordinate descent) with direct reject and refine actions (Zhang et al., 2017).
Prototype-Based Models: Reject strategies include global and cell-local thresholding of membership or similarity scores, optimized using dynamic programming or greedy approximations (Fischer et al., 2015).
SelectiveNet and RISAN: Deep networks explicitly integrate selection functions, learnable thresholds, and instance-specific abstention, with theoretical calibration and robust empirical performance (Geifman et al., 2019, Kalra et al., 2021).
Conformal Prediction: In conformal methods, a prediction set $\Gamma(x)\subseteq\mathcal{Y}$ is produced, and the model only “accepts” singleton outputs, yielding rigorous error–coverage trade-offs and distribution-free guarantees (Szabadváry et al., 26 Jun 2025).
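The conformal route to rejection can be illustrated with a generic split-conformal sketch. This is a textbook variant chosen for brevity, not necessarily the exact procedure of the cited work. It assumes a held-out calibration set with predicted class probabilities, uses one minus the probability of the true label as the nonconformity score, and rejects whenever the resulting prediction set is empty or contains more than one label.

```python
import numpy as np

def conformal_quantile(cal_probs, cal_labels, eps):
    """Split-conformal threshold on the score 1 - p(true label)."""
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    k = int(np.ceil((n + 1) * (1.0 - eps)))   # rank of the conformal quantile
    if k > n:
        return np.inf    # too few calibration points: sets contain every label
    return np.sort(scores)[k - 1]

def predict_or_reject(probs, qhat, reject=-1):
    """Accept only when the set {y : 1 - p_y <= qhat} is a singleton."""
    in_set = (1.0 - np.asarray(probs, dtype=float)) <= qhat
    singleton = in_set.sum(axis=1) == 1
    return np.where(singleton, in_set.argmax(axis=1), reject)

# Toy binary calibration set; every calibration label is class 0.
p_true = np.array([0.9, 0.8, 0.95, 0.6, 0.7, 0.85, 0.3])
cal_probs = np.column_stack([p_true, 1.0 - p_true])
cal_labels = np.zeros(len(p_true), dtype=int)

qhat = conformal_quantile(cal_probs, cal_labels, eps=0.125)
test_probs = np.array([[0.90, 0.10], [0.55, 0.45], [0.20, 0.80]])
decisions = predict_or_reject(test_probs, qhat)
```

The second test point is rejected because both labels fall inside its prediction set. The distribution-free guarantee of this variant concerns how often the set contains the true label; error rates conditional on acceptance need the additional analysis developed in the cited work.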

A key practical insight is that post-hoc score calibration or quantile selection is needed to hit a target coverage, especially under covariate shift or when uncertainty is distributed non-uniformly across inputs.

4. Specializations: Crowdsourcing, OOD, and Structured Outputs

Reject option frameworks have been adapted to several specialized settings:

Crowdsourcing: Workers can exercise a “skip” (reject) on uncertain microtasks, and optimal aggregation is achieved by weighted majority voting, with weights inversely related to the number of microtasks skipped or completed, robust to spammers via adaptive strategies (Li et al., 2017, Li et al., 2016).
Early Exit/Adaptive Computation: For complex deep classifiers, early exit heads are unified via sequential reject subroutines, with budget-constrained optimization solved via exponential-weight aggregation (Valade et al., 2024).
Out-of-Distribution Detection: Unified reject option analysis for OOD incorporates both classification and distribution-discrepancy scores. Optimal decision surfaces involve thresholding a linear combination of misclassification and OOD-likelihood scores, motivating double-score methods and novel risk metrics at fixed TPR/FPR or precision/recall (Franc et al., 2023).
Time-series/Streaming: In early decision scenarios, ensemble-agreement reject rules provide a robust mechanism for online deferred classification, outperforming standard posterior-threshold schemes (Hatami et al., 2013).
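The double-score idea from the OOD item above can be sketched in a few lines: combine a misclassification score with an OOD score linearly and accept when the sum stays under a threshold. The score definitions, the weight `mu`, and the threshold are all hypothetical values for illustration; in practice they would be tuned on validation data.

```python
import numpy as np

def double_score_accept(misclf_score, ood_score, mu, threshold):
    """Accept when misclf_score + mu * ood_score <= threshold."""
    combined = np.asarray(misclf_score, dtype=float) \
        + mu * np.asarray(ood_score, dtype=float)
    return combined <= threshold

misclf = np.array([0.05, 0.40, 0.10])   # e.g. 1 - max softmax probability
ood = np.array([0.10, 0.20, 0.90])      # e.g. a normalized OOD-likelihood score

accepted = double_score_accept(misclf, ood, mu=0.5, threshold=0.3)
```

The third input shows why a single score is not enough: its classification confidence is high, yet it is rejected because it looks out-of-distribution.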

5. Interpretability, Explanation, and Human-in-the-Loop Integration

Explainability of reject option decisions is crucial in high-stakes contexts:

Logic-Based and Abductive Explanations: For linear models with reject regions, minimum-size (optimally compact) abductive explanations (feature sets certifying rejection or classification) can be computed efficiently for non-rejects and via 0–1 ILP for rejects (Fernandes et al., 14 Mar 2026, Filho et al., 2024). These methods provide formal guarantees of correctness and minimality, outperforming heuristic approaches and facilitating real-time or in-the-loop human review.
Interpretability in Deep Networks: Instance-specific rejection thresholds enable saliency analysis, highlighting features underlying abstention decisions (e.g., via Grad-CAM in RISAN) (Kalra et al., 2021).
Human Feedback and Safety: Rejected or ambiguous cases can be routed to specialists, and threshold/coverage selection can be tuned to balance workflow and safety requirements (Filho et al., 2024).

6. Applications, Trade-offs, and Future Directions

Reject option frameworks have been adopted in diverse domains—medical decision systems, autonomous vehicles, financial risk scoring, scientific discovery, and real-time monitoring. The principal operational trade-offs are:

Selective Risk vs Coverage (Risk–Coverage Curve): Analytical and empirical tools (e.g., error–reject and accuracy–reject curves) inform optimal thresholding for specific application costs (Szabadváry et al., 26 Jun 2025, Geifman et al., 2019).
Error Control versus Budgeting: In resource-constrained inference, per-sample abstention mechanisms are optimized under explicit computation or time budgets, extending beyond uncertainty-only criteria (Valade et al., 2024).
Aleatoric vs Epistemic Uncertainty: Recent advances distinguish between irreducible (aleatoric) and data scarcity-driven (epistemic) uncertainties, introducing reject rules that abstain only on inputs with excessive epistemic (data-related) risk, generalizing classical Chow rules and conformal rejection (Franc et al., 6 Nov 2025).
Evaluation and Benchmarking: Proper metrics for OOD and selective prediction must integrate both selective risk and acceptance/rejection fidelity, as formalized in recent work (Franc et al., 2023).
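A risk–coverage curve of the kind referred to above can be traced by sorting examples from most to least confident and accumulating the loss. The sketch assumes a scalar confidence per example and 0–1 losses; averaging the selective risk over all coverage levels gives one common scalar summary (area under the risk–coverage curve), used here as an illustrative metric rather than the one prescribed by the cited papers.

```python
import numpy as np

def risk_coverage_curve(confidence, losses):
    """Selective risk at each coverage level k/n, most confident first."""
    order = np.argsort(-np.asarray(confidence, dtype=float), kind="stable")
    l = np.asarray(losses, dtype=float)[order]
    counts = np.arange(1, l.size + 1)
    coverage = counts / l.size
    risk = np.cumsum(l) / counts
    return coverage, risk

confidence = [0.99, 0.60, 0.95, 0.70, 0.80]
losses = [0, 1, 0, 1, 0]

cov, risk = risk_coverage_curve(confidence, losses)
aurc = risk.mean()   # area under the curve on the uniform coverage grid
```

In this toy case the two errors are also the two least confident predictions, so the selective risk is zero up to 60% coverage and rises only when those examples are admitted.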

Active research explores extending these frameworks to highly structured outputs, cost-sensitive abstention, domain adaptation, and calibration techniques, as well as quantifying the theoretical limits of learnability and adaptation under various models of uncertainty.

References:

(Wegkamp et al., 2012): Support vector machines with a reject option
(Ramaswamy et al., 2015): Consistent Algorithms for Multiclass Classification with a Reject Option
(Franc et al., 2021): Optimal strategies for reject option classifiers
(Zhang et al., 2017): On Reject and Refine Options in Multicategory Classification
(Sousa et al., 2010): The Data Replication Method for the Classification with Reject Option
(Valade et al., 2024): EERO: Early Exit with Reject Option for Efficient Classification with limited budget
(Geifman et al., 2019): SelectiveNet: A Deep Neural Network with an Integrated Reject Option
(Fischer et al., 2015): Optimum Reject Options for Prototype-based Classification
(Kalra et al., 2021): RISAN: Robust Instance Specific Abstention Network
(Szabadváry et al., 26 Jun 2025): Classification with Reject Option: Distribution-free Error Guarantees via Conformal Prediction
(Denis et al., 2020): Regression with reject option and application to kNN
(Zaoui et al., 31 Mar 2025): Distributional regression with reject option
(Franc et al., 6 Nov 2025): Epistemic Reject Option Prediction
(Franc et al., 2023): Reject option models comprising out-of-distribution detection
(Fernandes et al., 14 Mar 2026, Filho et al., 2024): Logic-based and minimum-size explanations for reject option linear models
(Li et al., 2017, Li et al., 2016): Crowdsourcing with reject option and optimal aggregation rules
(Hatami et al., 2013): Classifiers With a Reject Option for Early Time-Series Classification