SmallML: Efficient Conformal Prediction
- SmallML is a framework for constructing prediction sets that guarantee exact finite-sample marginal coverage while minimizing set size.
- It utilizes methods like SAPS and adaptive thresholding to optimize nonconformity scores and reduce reliance on miscalibrated probability estimates.
- Empirical results on benchmarks like ImageNet demonstrate significant reductions in average set sizes, enhancing efficiency over classical approaches.
SmallML (Small-Set Marginal Learning) refers to the suite of methods, theory, and algorithmic design principles surrounding minimization of conformal prediction set size (often called "efficiency") subject to exact finite-sample marginal coverage constraints. The central object of study is the prediction set $C_{1-\alpha}(X)$: given a pretrained classifier or regressor, and under a minimal exchangeability or i.i.d. assumption on calibration and test data, construct $C_{1-\alpha}(X)$ such that $\Pr[Y \in C_{1-\alpha}(X)] \geq 1 - \alpha$ for a user-specified miscoverage rate $\alpha$, while keeping $|C_{1-\alpha}(X)|$ as small as possible on average. Recent advances, particularly in deep learning and large-scale benchmarks, have focused on refined score constructions, optimization of nonconformity functions, adaptive thresholding, and incorporation of model calibration and probabilistic structure, all towards achieving set sizes substantially smaller than those produced by prior methods.
1. Principles of Marginal Validity and Set Efficiency
The conformal prediction framework requires only the exchangeability of calibration and test data, making no assumptions about the accuracy or calibration of the underlying model. For a pre-trained model $\hat\pi(x)$ (class probabilities), a held-out calibration set $\{(x_i, y_i)\}_{i=1}^{n}$, and a user-chosen miscoverage level $\alpha \in (0, 1)$, the classical workflow computes non-conformity scores $s(x_i, y_i)$, sets a threshold $\tau$ as the empirical $\lceil (1-\alpha)(n+1) \rceil / n$-quantile of these scores, and defines $C_{1-\alpha}(x) = \{ y : s(x, y) \leq \tau \}$ (Huang et al., 2023). By construction, $\Pr[Y \in C_{1-\alpha}(X)] \geq 1 - \alpha$.
All subsequent developments, whether relying on softmax scores, instance difficulty, higher-order uncertainty, or distributional estimates, are constrained by this foundational guarantee. The challenge addressed by SmallML is to design and calibrate the score $s$ such that $|C_{1-\alpha}(x)|$ is as small as possible for typical $x$.
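A minimal sketch of this split conformal workflow in NumPy, assuming the simple nonconformity score $s(x, y) = 1 - \hat\pi_y(x)$; the function names and toy data below are illustrative, not taken from the cited work:

```python
import numpy as np

def conformal_threshold(cal_scores, alpha):
    """Finite-sample corrected empirical quantile of the calibration scores."""
    n = len(cal_scores)
    q_level = np.ceil((n + 1) * (1 - alpha)) / n   # ceil((n+1)(1-alpha))/n quantile
    return np.quantile(cal_scores, min(q_level, 1.0), method="higher")

def prediction_set(scores_per_label, tau):
    """Indices of all labels whose nonconformity score does not exceed tau."""
    return np.where(scores_per_label <= tau)[0]

# Toy example with the score s(x, y) = 1 - softmax probability of label y.
rng = np.random.default_rng(0)
cal_probs = rng.dirichlet(np.ones(10), size=500)           # calibration softmax outputs
cal_labels = rng.integers(0, 10, size=500)                  # calibration labels
cal_scores = 1.0 - cal_probs[np.arange(500), cal_labels]    # nonconformity scores

tau = conformal_threshold(cal_scores, alpha=0.1)
test_probs = rng.dirichlet(np.ones(10))                     # one test example
print(prediction_set(1.0 - test_probs, tau))                # labels in the set
```

The same calibration routine applies to any score function; only the construction of `cal_scores` and `scores_per_label` changes when moving to APS-, RAPS-, or SAPS-style scores.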
2. Sorted Adaptive Prediction Sets (SAPS) and Label Ranking
The Sorted Adaptive Prediction Sets (SAPS) framework illustrates a key innovation in SmallML: minimize dependence on miscalibrated probability tails by discarding all but the maximum predicted probability. For $K$-class deep classifiers, SAPS defines for each pair $(x, y)$ and random $u \sim \mathrm{Unif}(0, 1)$ the score
\[ s(x, y, u) = \begin{cases} u \, \hat\pi_{\max}(x) & \text{if } o(y, \hat\pi(x)) = 1, \\ \hat\pi_{\max}(x) + \big(o(y, \hat\pi(x)) - 2 + u\big)\,\lambda & \text{otherwise}, \end{cases} \]
where $o(y, \hat\pi(x))$ is the rank of label $y$ in the descending sorted predictions, $\hat\pi_{\max}(x)$ is the maximum softmax probability, and $\lambda > 0$ is a user-tuned hyperparameter (Huang et al., 2023). Only the maximum softmax value is retained; all other probabilistic information is replaced by $\lambda$. The conformal set is determined by thresholding $s$ at the calibrated $\tau$ (a code sketch of this score appears after the experimental summary below). This design yields:
- Average set sizes substantially smaller than those of classical APS on ImageNet (APS: 20.95 vs. SAPS: 2.98 at the same miscoverage level).
- Consistent marginal coverage at the prescribed level.
- Smaller, instance-adaptive sets: "easy" examples with high confidence receive smaller $|C(x)|$, while "hard" instances expand accordingly.
- Uniformly improved conditional coverage, as measured by ESCV, relative to RAPS and APS.
Experiments on ImageNet, CIFAR-100, and CIFAR-10 consistently find SAPS to outperform alternatives in size and conditional error, with stable performance over reasonable calibration set sizes and hyperparameter choices.
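The case structure of the SAPS score translates directly into a vectorized computation. The following is a minimal sketch under the definitions above; `saps_scores` and its arguments are illustrative names under the assumption of softmax inputs, not the authors' reference implementation:

```python
import numpy as np

def saps_scores(probs, labels, lam, u=None, rng=None):
    """SAPS-style nonconformity scores from softmax outputs.

    probs  : (n, K) array of softmax probabilities
    labels : (n,) candidate labels to score
    lam    : hyperparameter replacing all non-maximum probability mass
    u      : optional (n,) uniform(0, 1) randomization terms
    """
    n, K = probs.shape
    if u is None:
        u = (rng or np.random.default_rng()).uniform(size=n)
    p_max = probs.max(axis=1)
    # o(y) = rank of the candidate label in the descending sort (1 = top-1)
    order = np.argsort(-probs, axis=1)
    ranks = np.empty_like(order)
    np.put_along_axis(ranks, order, np.arange(1, K + 1), axis=1)
    o = ranks[np.arange(n), labels]
    # rank-1 labels keep the (randomized) top probability only;
    # lower-ranked labels add lam per rank step beyond the top label
    return np.where(o == 1, u * p_max, p_max + (o - 2 + u) * lam)
```

Calibrating $\tau$ on these scores for the true calibration labels and, at test time, including every label whose score is at most $\tau$ then yields the SAPS prediction set.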
3. Theoretical Guarantees and Coverage Properties
SmallML methods such as SAPS preserve exact marginal validity by standard conformal theory: if the set construction is "nested" (set size increases monotonically as the acceptance threshold grows laxer) and the threshold $\hat q_{1-\alpha}$ is chosen as the empirical $\lceil (1-\alpha)(n+1) \rceil / n$-quantile of the calibration scores $s(x_1, y_1), \dots, s(x_n, y_n)$, then for all distributions:
\[ \Pr[Y \in C_{1-\alpha}(X)] \geq 1 - \alpha. \]
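A short justification, using only the standard exchangeability argument (nothing specific to SAPS): since the $n$ calibration scores and the test score $s(X, Y)$ are exchangeable, the rank of $s(X, Y)$ among all $n + 1$ scores is uniformly distributed (up to ties), so
\[ \Pr\big[Y \in C_{1-\alpha}(X)\big] = \Pr\big[s(X, Y) \leq \hat q_{1-\alpha}\big] \geq \frac{\lceil (1-\alpha)(n+1) \rceil}{n+1} \geq 1 - \alpha, \]
where $\hat q_{1-\alpha}$ is the $\lceil (1-\alpha)(n+1) \rceil$-th smallest calibration score. The bound holds regardless of how accurate or well-calibrated the underlying model is; only exchangeability enters.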