SmallML: Efficient Conformal Prediction
- SmallML is a framework for constructing prediction sets that guarantee exact finite-sample marginal coverage while minimizing set size.
- It utilizes methods like SAPS and adaptive thresholding to optimize nonconformity scores and reduce reliance on miscalibrated probability estimates.
- Empirical results on benchmarks like ImageNet demonstrate significant reductions in average set sizes, enhancing efficiency over classical approaches.
SmallML (Small-Set Marginal Learning) refers to the suite of methods, theory, and algorithmic design principles surrounding minimization of conformal prediction set size (often called "efficiency") subject to exact finite-sample marginal coverage constraints. The central object of study is the prediction set $C_{1-\alpha}(X)$: given a pretrained classifier or regressor, and under a minimal exchangeability or i.i.d. assumption on calibration and test data, construct $C_{1-\alpha}(X)$ such that $\Pr[Y \in C_{1-\alpha}(X)] \geq 1 - \alpha$ for a user-specified miscoverage rate $\alpha$, while keeping $|C_{1-\alpha}(X)|$ as small as possible on average. Recent advances, particularly in deep learning and large-scale benchmarks, have focused on refined score constructions, optimization of nonconformity functions, adaptive thresholding, and incorporation of model calibration and probabilistic structure, all towards achieving set sizes substantially smaller than those produced by prior methods.
1. Principles of Marginal Validity and Set Efficiency
The conformal prediction framework requires only the exchangeability of calibration and test data, making no assumptions about the accuracy or calibration of the underlying model. For a pre-trained model $\hat\pi(x)$ (class probabilities), a held-out calibration set $\{(x_i, y_i)\}_{i=1}^{n}$, and a user-chosen miscoverage level $\alpha \in (0, 1)$, the classical workflow computes non-conformity scores $s(x_i, y_i)$, sets a threshold $\tau$ as the empirical $\lceil (1-\alpha)(n+1) \rceil / n$-quantile of these scores, and defines $C_{1-\alpha}(x) = \{ y : s(x, y) \leq \tau \}$ (Huang et al., 2023). By construction, $\Pr[Y \in C_{1-\alpha}(X)] \geq 1 - \alpha$.
All subsequent developments, whether relying on softmax scores, instance difficulty, higher-order uncertainty, or distributional estimates, are constrained by this foundational guarantee. The challenge addressed by SmallML is to design and calibrate the score $s$ such that $|C_{1-\alpha}(x)|$ is as small as possible for typical $x$.
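A minimal sketch of this split conformal workflow in NumPy, assuming the simple nonconformity score $s(x, y) = 1 - \hat\pi_y(x)$; the function names and toy data below are illustrative, not taken from the cited work:

```python
import numpy as np

def conformal_threshold(cal_scores, alpha):
    """Finite-sample corrected empirical quantile of the calibration scores."""
    n = len(cal_scores)
    q_level = np.ceil((n + 1) * (1 - alpha)) / n   # ceil((n+1)(1-alpha))/n quantile
    return np.quantile(cal_scores, min(q_level, 1.0), method="higher")

def prediction_set(scores_per_label, tau):
    """Indices of all labels whose nonconformity score does not exceed tau."""
    return np.where(scores_per_label <= tau)[0]

# Toy example with the score s(x, y) = 1 - softmax probability of label y.
rng = np.random.default_rng(0)
cal_probs = rng.dirichlet(np.ones(10), size=500)           # calibration softmax outputs
cal_labels = rng.integers(0, 10, size=500)                  # calibration labels
cal_scores = 1.0 - cal_probs[np.arange(500), cal_labels]    # nonconformity scores

tau = conformal_threshold(cal_scores, alpha=0.1)
test_probs = rng.dirichlet(np.ones(10))                     # one test example
print(prediction_set(1.0 - test_probs, tau))                # labels in the set
```

The same calibration routine applies to any score function; only the construction of `cal_scores` and `scores_per_label` changes when moving to APS-, RAPS-, or SAPS-style scores.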
2. Sorted Adaptive Prediction Sets (SAPS) and Label Ranking
The Sorted Adaptive Prediction Sets (SAPS) framework illustrates a key innovation in SmallML: minimize dependence on miscalibrated probability tails by discarding all but the maximum predicted probability. For $K$-class deep classifiers, SAPS defines for each pair $(x, y)$ and random $u \sim \mathrm{Unif}(0, 1)$ the score
\[ s(x, y, u) = \begin{cases} u \, \hat\pi_{\max}(x) & \text{if } o(y, \hat\pi(x)) = 1, \\ \hat\pi_{\max}(x) + \big(o(y, \hat\pi(x)) - 2 + u\big)\,\lambda & \text{otherwise}, \end{cases} \]
where $o(y, \hat\pi(x))$ is the rank of label $y$ in the descending sorted predictions, $\hat\pi_{\max}(x)$ is the maximum softmax probability, and $\lambda > 0$ is a user-tuned hyperparameter (Huang et al., 2023). Only the maximum softmax value is retained; all other probabilistic information is replaced by $\lambda$. The conformal set is determined by thresholding $s$ at the calibrated $\tau$ (a code sketch of this score appears after the experimental summary below). This design yields:
- Average set sizes substantially smaller than those of classical APS on ImageNet (APS: 20.95 vs. SAPS: 2.98 at the same miscoverage level).
- Consistent marginal coverage at the prescribed level.
- Smaller, instance-adaptive sets: "easy" examples with high confidence receive smaller $|C(x)|$, while "hard" instances expand accordingly.
- Uniformly improved conditional coverage, as measured by ESCV, relative to RAPS and APS.
Experiments on ImageNet, CIFAR-100, and CIFAR-10 consistently find SAPS to outperform alternatives in size and conditional error, with stable performance over reasonable calibration set sizes and hyperparameter choices.
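The case structure of the SAPS score translates directly into a vectorized computation. The following is a minimal sketch under the definitions above; `saps_scores` and its arguments are illustrative names under the assumption of softmax inputs, not the authors' reference implementation:

```python
import numpy as np

def saps_scores(probs, labels, lam, u=None, rng=None):
    """SAPS-style nonconformity scores from softmax outputs.

    probs  : (n, K) array of softmax probabilities
    labels : (n,) candidate labels to score
    lam    : hyperparameter replacing all non-maximum probability mass
    u      : optional (n,) uniform(0, 1) randomization terms
    """
    n, K = probs.shape
    if u is None:
        u = (rng or np.random.default_rng()).uniform(size=n)
    p_max = probs.max(axis=1)
    # o(y) = rank of the candidate label in the descending sort (1 = top-1)
    order = np.argsort(-probs, axis=1)
    ranks = np.empty_like(order)
    np.put_along_axis(ranks, order, np.arange(1, K + 1), axis=1)
    o = ranks[np.arange(n), labels]
    # rank-1 labels keep the (randomized) top probability only;
    # lower-ranked labels add lam per rank step beyond the top label
    return np.where(o == 1, u * p_max, p_max + (o - 2 + u) * lam)
```

Calibrating $\tau$ on these scores for the true calibration labels and, at test time, including every label whose score is at most $\tau$ then yields the SAPS prediction set.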
3. Theoretical Guarantees and Coverage Properties
SmallML methods such as SAPS preserve exact marginal validity by standard conformal theory: if the set construction is "nested" (set size increases monotonically as the acceptance threshold grows laxer) and the threshold $\hat q_{1-\alpha}$ is chosen as the empirical $\lceil (1-\alpha)(n+1) \rceil / n$-quantile of the calibration scores $s(x_1, y_1), \dots, s(x_n, y_n)$, then for all distributions:
\[ \Pr[Y \in C_{1-\alpha}(X)] \geq 1 - \alpha. \]
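A short justification, using only the standard exchangeability argument (nothing specific to SAPS): since the $n$ calibration scores and the test score $s(X, Y)$ are exchangeable, the rank of $s(X, Y)$ among all $n + 1$ scores is uniformly distributed (up to ties), so
\[ \Pr\big[Y \in C_{1-\alpha}(X)\big] = \Pr\big[s(X, Y) \leq \hat q_{1-\alpha}\big] \geq \frac{\lceil (1-\alpha)(n+1) \rceil}{n+1} \geq 1 - \alpha, \]
where $\hat q_{1-\alpha}$ is the $\lceil (1-\alpha)(n+1) \rceil$-th smallest calibration score. The bound holds regardless of how accurate or well-calibrated the underlying model is; only exchangeability enters.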