SAFE-KD: Risk-Controlled Early Exit for Vision Models
- The framework SAFE-KD is a universal early-exit system that combines hierarchical knowledge distillation with conformal risk control to guarantee statistically bounded selective risk.
- It attaches intermediate classifier exits to any vision backbone (CNN or ViT) and employs decoupled knowledge distillation alongside consistency regularization for calibrated risk control.
- Empirical results show up to a 45% reduction in expected inference depth while maintaining or surpassing full-inference accuracy, ensuring efficiency and robustness.
SAFE-KD offers a universal, risk-controlled early-exit framework for modern vision backbones, combining hierarchical knowledge distillation with conformal risk control (CRC) to achieve statistically guaranteed bounds on selective misclassification risk for early-exit architectures. It enables substantial reductions in inference cost via early stopping for "easy" samples, while maintaining user-specified upper bounds on misclassification risk at each exit, calibrated on finite data. SAFE-KD is model-agnostic and deploys on a variety of convolutional (CNN) and transformer-based (ViT) image models (Khazem, 3 Feb 2026).
1. Architecture and Components
SAFE-KD is structured as a lightweight "wrapper" atop any standard vision backbone, supporting both CNNs and Vision Transformers. Its architecture comprises:
- Base Backbone: Any pretrained or trainable vision model (e.g., ResNet, ConvNeXt, ViT, Swin).
- Intermediate Exit Heads: At select depths, SAFE-KD attaches classifiers producing logits ($z_k$), class probabilities $p_k = \mathrm{softmax}(z_k)$, and a confidence score (typically the Maximum Softmax Probability, $c_k(x) = \max_j p_k^{(j)}(x)$). For CNNs, exits use global average pooling and an optional MLP before a fully connected (FC) layer; for ViTs, exits use CLS or mean token pooling, optional LayerNorm, then FC.
- Teacher Network: An Exponential Moving Average (EMA) of the full model serves as the teacher for knowledge transfer.
This configuration allows SAFE-KD to operate agnostically across architectures while adding only minimal inference overhead.
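As a minimal sketch of such an exit head in PyTorch, assuming the pooling-plus-FC structure described above (the class name `ExitHead` and the optional-MLP wiring are illustrative, not the paper's exact implementation):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ExitHead(nn.Module):
    """One intermediate exit: pooling -> (optional LayerNorm/MLP) -> FC classifier."""

    def __init__(self, dim: int, num_classes: int, is_vit: bool, hidden: int | None = None):
        super().__init__()
        self.is_vit = is_vit
        # Optional LayerNorm for ViT exits, optional MLP before the FC layer.
        self.norm = nn.LayerNorm(dim) if is_vit else nn.Identity()
        self.mlp = nn.Sequential(nn.Linear(dim, hidden), nn.GELU()) if hidden else nn.Identity()
        self.fc = nn.Linear(hidden or dim, num_classes)

    def forward(self, feats: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        if self.is_vit:
            pooled = feats[:, 0]             # CLS token pooling (mean pooling also possible)
        else:
            pooled = feats.mean(dim=(2, 3))  # global average pooling over H, W
        logits = self.fc(self.mlp(self.norm(pooled)))
        conf = F.softmax(logits, dim=-1).max(dim=-1).values  # MSP confidence c_k
        return logits, conf
```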
2. Decoupled Knowledge Distillation and Consistency
Training leverages hierarchical Decoupled Knowledge Distillation (DKD), coupled with deep-to-shallow consistency regularization:
- DKD Loss: For each exit $k$, knowledge is distilled from the teacher logits $z^T$ to the exit logits $z_k$ using a split-KL loss:
  $$\mathcal{L}_{\mathrm{DKD}}^{(k)} = \mathrm{KL}\big([p_t^T,\, 1-p_t^T] \,\|\, [p_t^{(k)},\, 1-p_t^{(k)}]\big) + \beta\,\mathrm{KL}\big(\hat{p}^T \,\|\, \hat{p}^{(k)}\big),$$
  where $p_t^T$ and $p_t^{(k)}$ are the teacher and student probabilities on the target class (ground-truth label), and $\hat{p}^T$ and $\hat{p}^{(k)}$ are the corresponding distributions renormalized over all non-target classes.
- Consistency Regularization: To align intermediate exits with the final head, SAFE-KD adds
  $$\mathcal{L}_{\mathrm{cons}} = \sum_{k=1}^{K-1} \gamma_k\, \mathrm{KL}\big(p_K \,\|\, p_k\big),$$
  with weighting $\gamma_k$, regularizing posterior agreement between each intermediate exit $k$ and the ultimate exit ($K$).
- Total Loss: For exit weights $w_k$ summing to $1$, full training minimizes:
  $$\mathcal{L} = \sum_{k=1}^{K} w_k \big(\mathcal{L}_{\mathrm{CE}}^{(k)} + \lambda\, \mathcal{L}_{\mathrm{DKD}}^{(k)}\big) + \mathcal{L}_{\mathrm{cons}},$$
  where $\lambda$ is a scaling factor for DKD.
This hierarchical approach improves calibration and depth-to-exit consistency while maintaining high accuracy at all exits.
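A compact sketch of this objective under stated assumptions: the split KL follows the standard DKD decomposition (a binary target-vs-rest term plus a renormalized non-target term), `teacher_logits` come from the EMA teacher, and the hyperparameters (`beta`, `T`, `w`, `lam`, `gamma`) are illustrative placeholders rather than values from the paper.

```python
import torch
import torch.nn.functional as F

def dkd_loss(z_student, z_teacher, target, beta=1.0, T=4.0):
    """Split KL: target-class (binary) term + beta * renormalized non-target term."""
    mask = F.one_hot(target, z_student.size(1)).bool()
    p_s = F.softmax(z_student / T, dim=1)
    p_t = F.softmax(z_teacher / T, dim=1)
    # Target-class term: KL between binary (target vs. rest) distributions.
    bt_s = torch.stack([p_s[mask], 1.0 - p_s[mask]], dim=1)
    bt_t = torch.stack([p_t[mask], 1.0 - p_t[mask]], dim=1)
    tckd = F.kl_div(bt_s.log(), bt_t, reduction="batchmean")
    # Non-target term: suppress the target logit and renormalize over the rest.
    nt_s = F.log_softmax(z_student / T - 1000.0 * mask.float(), dim=1)
    nt_t = F.softmax(z_teacher / T - 1000.0 * mask.float(), dim=1)
    nckd = F.kl_div(nt_s, nt_t, reduction="batchmean")
    return (tckd + beta * nckd) * T * T

def safe_kd_loss(exit_logits, teacher_logits, target, w, lam=1.0, gamma=0.5):
    """Sum over exits of w_k * (CE + lam * DKD), plus deep-to-shallow consistency."""
    final_p = F.softmax(exit_logits[-1].detach(), dim=1)  # deepest head as anchor
    loss = torch.zeros((), device=target.device)
    for k, z_k in enumerate(exit_logits):
        loss = loss + w[k] * (F.cross_entropy(z_k, target)
                              + lam * dkd_loss(z_k, teacher_logits, target))
        if k < len(exit_logits) - 1:  # align each intermediate exit with the final one
            loss = loss + gamma * F.kl_div(F.log_softmax(z_k, dim=1),
                                           final_p, reduction="batchmean")
    return loss
```

Detaching the final head's posterior in the consistency term (so only shallow exits are pulled toward the deep one) is a design choice assumed here, not confirmed by the source.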
3. Conformal Risk Control for Early-Exit Thresholds
At inference, SAFE-KD employs Conformal Risk Control (CRC) to set data-driven confidence thresholds at each exit, guaranteeing a user-specified selective risk:
- Nonconformity Score: $s_k(x) = 1 - c_k(x)$ at exit $k$.
- Acceptance Set: $\mathcal{A}_k(\tau_k) = \{x : c_k(x) \ge \tau_k\}$.
- Selective Misclassification Risk: $R_k(\tau_k) = \mathbb{P}\big(\hat{y}_k(x) \ne y \,\big|\, x \in \mathcal{A}_k(\tau_k)\big)$.
- Threshold Calibration: Using a held-out calibration set $\mathcal{D}_{\mathrm{cal}}$, thresholds $\tau_k$ are chosen so that the conformal upper bound
  $$\hat{R}_k^{+}(\tau_k) = \frac{n_k \hat{R}_k(\tau_k) + 1}{n_k + 1}$$
  does not exceed the desired risk level $\alpha$. Here $\hat{R}_k(\tau_k)$ is the empirical selective risk on $\mathcal{D}_{\mathrm{cal}}$ and $n_k$ is the number of calibration samples in $\mathcal{A}_k(\tau_k)$.
CRC, under the exchangeability assumption, ensures
$$\mathbb{E}\big[R_k(\hat{\tau}_k)\big] \le \alpha$$
for each exit, providing finite-sample statistical guarantees.
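The threshold search reduces to a short calibration routine. This NumPy sketch assumes the bound $(n_k \hat{R}_k + 1)/(n_k + 1)$ reconstructed above and picks, per exit, the smallest threshold (largest acceptance set) whose bound stays at or below $\alpha$; function and variable names are illustrative, not the paper's API.

```python
import numpy as np

def calibrate_threshold(conf: np.ndarray, correct: np.ndarray, alpha: float) -> float:
    """Smallest per-exit confidence threshold whose conformal bound stays <= alpha."""
    order = np.argsort(-conf)                 # most confident samples first
    conf, correct = conf[order], correct[order].astype(bool)
    n_acc = np.arange(1, conf.size + 1)       # candidate acceptance-set sizes
    errors = np.cumsum(~correct)              # mistakes among the n_acc most confident
    bound = (errors + 1) / (n_acc + 1)        # (n_k * Rhat_k + 1) / (n_k + 1)
    safe = np.nonzero(bound <= alpha)[0]
    if safe.size == 0:
        return np.inf                         # no safe threshold: always defer deeper
    return float(conf[safe.max()])            # largest acceptance set that is still safe

# Hypothetical usage with per-exit calibration outputs at alpha = 0.05:
# tau_k = calibrate_threshold(conf_k, preds_k == labels_k, alpha=0.05)
```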
4. Safe Inference Policy and Practical Deployment
At test time, early exit is governed by the following procedure:
- Proceed through exits $k = 1, \dots, K-1$, checking at each whether $c_k(x) \ge \hat{\tau}_k$.
- The first exit satisfying the check is used for prediction. If none does, inference proceeds to the final exit $K$.
- For every exit $k$, the empirical misclassification risk among samples exiting there is guaranteed not to exceed $\alpha$ (up to the finite-sample correction).
This allows the system designer to select $\alpha$ according to operational requirements, trading off computational savings against tightly controlled selective risk. All calibration is based on a held-out set.
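The resulting inference loop is a simple cascade. In this sketch, `model.iter_exits` is a hypothetical generator yielding per-exit `(logits, confidence)` pairs as computation proceeds through the backbone, and a batch size of one is assumed:

```python
import torch

@torch.no_grad()
def safe_predict(model, x, thresholds):
    """Predict at the first exit whose MSP confidence clears its CRC threshold."""
    K = len(thresholds)
    for k, (logits, conf) in enumerate(model.iter_exits(x)):  # hypothetical lazy-exit API
        if conf.item() >= thresholds[k] or k == K - 1:        # final exit always predicts
            return logits.argmax(dim=-1).item(), k + 1        # (predicted class, exit depth)
```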
5. Empirical Evaluation and Results
SAFE-KD has been empirically validated across six architectures (ResNet-50, MobileNetV3-S, EfficientNet-B0, ConvNeXt-T, ViT-S, Swin-T) and multiple image datasets (CIFAR-10/100, STL-10, Pets, Flowers102, Aircraft), delivering:
- Compute-Accuracy Trade-offs: At a target risk of $\alpha = 5\%$, SAFE-KD achieves $40$--$45\%$ lower expected depth while matching or surpassing full-inference accuracy. Baseline methods (fixed MSP or entropy thresholds) violate the risk constraint, with observed risks of roughly $6$--$8\%$.
- Calibration: SAFE-KD reduces negative log-likelihood (NLL) and expected calibration error (ECE) at all exits.
- Risk Guarantee: Across sweeps in $\alpha$, the observed per-exit risk tracks the theoretical bound $\alpha$, confirming tightness.
- Robustness: On CIFAR-10-C corrupted data (severity 3), SAFE-KD attains lower mean corruption error (mCE) at both the shallowest and deepest exits than comparable multi-exit and DKD-based models, e.g., mCE at exit 1: SAFE-KD $30.2$, DKD $32.8$, MultiExit $35.4$; at the final exit: SAFE-KD $20.9$.
- Ablation Findings: Removing DKD degrades exit accuracy and forces safer (more conservative) thresholds, increasing average depth. Removing the consistency term ($\mathcal{L}_{\mathrm{cons}}$) yields higher variance across exits, though the risk guarantees persist.
- Example Table (CIFAR-100, ResNet-50, $\alpha = 5\%$):
| Method | Accuracy | Exp. Depth (fraction of full) | Observed Risk |
|---|---|---|---|
| Fixed MSP | 81.5% | 0.72 | 6.8% (Unsafe) |
| Entropy gate | 80.9% | 0.65 | 7.5% (Unsafe) |
| SAFE-KD (CRC) | 82.3% | 0.59 | 4.8% (Safe) |
SAFE-KD consistently defines the empirical Pareto frontier for the target risk constraint across tasks.
6. Calibration, Robustness, and Risk Guarantees
SAFE-KD's deployment of CRC uniquely enables it to deliver finite-sample, statistically tight risk control not attainable with heuristic thresholds. Reliability diagrams confirm alignment of observed risk with the target $\alpha$ across exit depths. Selective risk curves for a sweep of $\alpha$ show empirical risk at or just under $\alpha$.
For corrupted or hard samples, the framework naturally "defers" to deeper exits, preserving the guaranteed selective risk at the cost of additional computation. This property, together with the out-of-the-box calibration from DKD and consistency, distinguishes SAFE-KD from prior early-exit and distillation methods that lack such formal risk control.
7. Summary and Broader Impact
SAFE-KD constitutes a general-purpose, modular extension to vision models requiring minimal architectural modification and no retraining of the backbone. Its integration of CRC, DKD, and deep-to-shallow consistency provides user-tunable, quantifiable risk guarantees for early exiting, fine-grained calibration, and enhanced robustness under dataset shift or corruption. The framework supports frequent regression testing and online adaptation to evolving operational requirements, making it especially suitable for resource-constrained or safety-critical deployments (Khazem, 3 Feb 2026).