Split & Weighted Conformal Prediction
- The method provides finite-sample prediction intervals by using data-splitting and weighted quantiles to adjust for covariate shift and sampling bias.
- Weighted conformal prediction utilizes importance sampling and group/mask-based strategies to correct for discrepancies between calibration and test distributions.
- Empirical results and theoretical guarantees show improved efficiency and robustness, yielding narrower prediction sets under varied distributional challenges.
Split and weighted conformal prediction refer to a class of nonparametric, distribution-free methods for uncertainty quantification in statistical prediction, specifically designed to provide finite-sample coverage guarantees even in the presence of covariate shift, class imbalance, missing data, or heterogeneity across groups. These methods generalize traditional conformal prediction by replacing the standard empirical quantiles of nonconformity scores with weighted quantiles, adjusting for discrepancies between the calibration and target (test) distributions. This framework includes compatibility with data-splitting (split conformal) and various weighting schemes, leveraging importance sampling and group membership to correct for sampling biases and distributional shifts.
1. Foundations of Split Conformal Prediction
The split conformal prediction procedure operates by dividing the available data into a training set and a calibration set $\{(x_i, y_i)\}_{i=1}^n$. A regression or classification model $\hat\mu$ (or, in some treatments, a generic score function) is fitted using the training set. For each example in the calibration set, a nonconformity score is computed; in regression, this is often the absolute residual $s_i = |y_i - \hat\mu(x_i)|$.
To make a prediction at a new covariate $x$, the interval
$$\hat C(x) = \big[\hat\mu(x) - \hat q,\; \hat\mu(x) + \hat q\big]$$
is output, where $\hat q$ is the $\lceil (1-\alpha)(n+1) \rceil / n$ empirical quantile of $s_1, \dots, s_n$. By exchangeability of the scores on the calibration set and the new test point's score, this construction satisfies the marginal coverage guarantee
$$\mathbb{P}\big(Y_{n+1} \in \hat C(X_{n+1})\big) \ge 1 - \alpha$$
without assumptions on the model (Tibshirani et al., 2019, Bhattacharyya et al., 30 Jan 2024).
This structure is central to subsequent weighted generalizations.
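The split procedure above can be sketched in a few lines. This is a minimal illustration, not a reference implementation; `fit` is a placeholder for any user-supplied model-fitting routine.

```python
import numpy as np

def split_conformal_interval(x_train, y_train, x_cal, y_cal, x_test, alpha, fit):
    """Split conformal regression interval.

    `fit` stands in for any routine that returns a fitted predictor mu(x);
    the absolute residual is used as the nonconformity score, as in the text.
    """
    mu = fit(x_train, y_train)                   # model fitted on the training split
    scores = np.abs(y_cal - mu(x_cal))           # nonconformity scores on the calibration split
    n = len(scores)
    # finite-sample-corrected quantile level: ceil((1 - alpha)(n + 1)) / n
    level = min(np.ceil((1 - alpha) * (n + 1)) / n, 1.0)
    q_hat = np.quantile(scores, level, method="higher")
    pred = mu(x_test)
    return pred - q_hat, pred + q_hat            # [mu(x) - q, mu(x) + q]
```

With `alpha = 0.1`, the returned intervals have at least 90% marginal coverage, regardless of how well `fit` models the data.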
2. Weighted Conformal Prediction: Covariate Shift and Weighted Exchangeability
When the distribution of covariates in the calibration set ($P_X$) differs from that in the target/test set ($\tilde P_X$), i.e., under covariate shift, marginal validity can be lost. Weighted conformal prediction restores validity by incorporating importance weights. Specifically, the weight function is defined as the likelihood ratio
$$w(x) = \frac{d\tilde P_X(x)}{d P_X(x)}.$$
Under the covariate shift model, the (augmented) nonconformity scores plus the test point are "weighted exchangeable": their joint probability law can be factorized as a product of functions of each coordinate (the weights) times a symmetric function. This generalizes exchangeability and enables extending the conformal argument to the new setting (Tibshirani et al., 2019).
The weighted quantile replaces the ordinary empirical quantile, with
$$\hat q(x) = Q_{1-\alpha}\Big(\sum_{i=1}^n p_i^w(x)\,\delta_{s_i} + p_{n+1}^w(x)\,\delta_{+\infty}\Big), \qquad p_i^w(x) = \frac{w(x_i)}{\sum_{j=1}^n w(x_j) + w(x)}, \quad p_{n+1}^w(x) = \frac{w(x)}{\sum_{j=1}^n w(x_j) + w(x)}.$$
This weighted estimator guarantees
$$\mathbb{P}\big(Y_{n+1} \in \hat C(X_{n+1})\big) \ge 1 - \alpha,$$
provided the weights are known or accurately estimated (Tibshirani et al., 2019, Bhattacharyya et al., 30 Jan 2024).
The weighted-conformal procedure is summarized in the following steps (Tibshirani et al., 2019):
- Compute residuals $s_i = |y_i - \hat\mu(x_i)|$ and weights $w(x_i)$ for the calibration set.
- For each test point $x$, compute the weighted quantile $\hat q(x)$.
- Output $\hat C(x) = [\hat\mu(x) - \hat q(x),\; \hat\mu(x) + \hat q(x)]$.
Empirical results demonstrate restoration of nominal coverage under simulated covariate shift.
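The weighted quantile and the resulting interval can be sketched as follows. This is a minimal single-test-point illustration; the point mass at plus-infinity appended for the test point follows Tibshirani et al. (2019).

```python
import numpy as np

def weighted_quantile(values, weights, level):
    """Smallest v such that the weighted CDF at v is >= level."""
    order = np.argsort(values)
    v = np.asarray(values, dtype=float)[order]
    w = np.asarray(weights, dtype=float)[order]
    cdf = np.cumsum(w) / np.sum(w)
    return v[np.searchsorted(cdf, level, side="left")]

def weighted_conformal_interval(mu, x_cal, y_cal, w_cal, x_test, w_test, alpha):
    """Weighted split conformal interval at one test point.

    w_cal[i] = w(x_i) and w_test = w(x_test) are the likelihood-ratio weights;
    a point mass at +inf is appended for the test point.
    """
    scores = np.abs(y_cal - mu(x_cal))
    q_hat = weighted_quantile(np.append(scores, np.inf),
                              np.append(w_cal, w_test), 1 - alpha)
    pred = float(np.atleast_1d(mu(np.atleast_1d(x_test)))[0])
    return pred - q_hat, pred + q_hat
```

With uniform weights this reduces to a (slightly conservative) ordinary split conformal interval.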
3. Generalizations: Group, Mask, and Frequency-based Weighting
Beyond classical covariate shift, weighted conformal prediction encompasses structures such as:
Group-weighted conformal prediction: When the population is partitioned into discrete groups with group distribution shifting from $q_k$ (calibration) to $\tilde q_k$ (test), the weights are $w_k = \tilde q_k / q_k$ for group $k$. The prediction set is calibrated using a group-weighted mixture distribution over conformal scores, $\sum_k \tilde q_k\, \hat F_k$, where $\hat F_k$ is the empirical score distribution within group $k$.
Coverage guarantees improve drastically over general weighting in settings with fixed, well-sampled groups (Bhattacharyya et al., 30 Jan 2024).
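The group-weighted calibration step above can be sketched as follows, assuming known group frequencies. The dictionaries `p_cal` and `p_test` are hypothetical interfaces, and placing the average calibration weight on the conservative point mass at infinity is a simplification for illustration.

```python
import numpy as np

def group_weighted_quantile(scores, groups, p_cal, p_test, alpha):
    """Group-weighted score quantile: each calibration point in group k
    carries weight w_k = p_test[k] / p_cal[k]; a point mass at +inf stands
    in for the test point (its weight, the average calibration weight,
    is an illustrative simplification).
    """
    w = np.array([p_test[g] / p_cal[g] for g in groups], dtype=float)
    s = np.append(np.asarray(scores, dtype=float), np.inf)
    w = np.append(w, w.mean())
    order = np.argsort(s)
    cdf = np.cumsum(w[order]) / w.sum()
    return s[order][np.searchsorted(cdf, 1 - alpha, side="left")]
```

Shifting test mass toward a group with larger scores raises the calibrated quantile, as expected.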
Mask-weighted conformal prediction under missing data: Given general missingness patterns encoded by masks $m$, weighted conformal prediction with mask-conditional weighting is deployed after calibration-set imputation and masking, with weights determined by the relative probability of each mask pattern under the calibration and test distributions.
Prediction sets constructed using these weights achieve both marginal and mask-conditional validity, extending guarantees to data missing completely at random (MCAR), missing at random (MAR), or missing not at random (MNAR) (Fan et al., 16 Dec 2025).
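Mask-pattern weights can be estimated from empirical pattern frequencies. The following is a hypothetical sketch; the function name and interface are illustrative, not the paper's API.

```python
import numpy as np

def mask_weights(masks_cal, masks_test):
    """Empirical mask-pattern weights w(m) = P_test(m) / P_cal(m).

    Each row of `masks_cal` / `masks_test` is a binary missingness
    indicator vector over the d features.
    """
    def pattern_freqs(masks):
        patterns, counts = np.unique(masks, axis=0, return_counts=True)
        return {tuple(p): c / len(masks) for p, c in zip(patterns, counts)}

    f_cal, f_test = pattern_freqs(masks_cal), pattern_freqs(masks_test)
    # one weight per calibration row; patterns unseen at test time get weight 0
    return np.array([f_test.get(tuple(m), 0.0) / f_cal[tuple(m)]
                     for m in masks_cal])
```

Calibration rows whose mask pattern is over-represented at test time are up-weighted accordingly.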
Class-label and frequency-based weighting in imbalanced/open-set settings: Selective sample splitting with probabilistic inclusion functions (as a function of class frequency) is used to prioritize rare classes in calibration. The resulting nonconformity scores are weighted according to explicit formulas based on label frequencies, restoring validity under severe class imbalance or when the label space expands at test time (Xie et al., 14 Oct 2025).
4. Weighted Aggregation of Multiple Conformity Scores
Weighted conformal prediction is not restricted to covariate-based weighting. In multi-class classification, prediction set size can be drastically reduced by aggregating multiple nonconformity (or conformity) scores in a weighted fashion. For a collection of scores $s_1, \dots, s_K$, the aggregated score $s_w(x, y) = \sum_{k=1}^K w_k\, s_k(x, y)$ is used in the split-conformal construction. Weights are chosen, via constrained optimization, to minimize average prediction set size subject to the coverage constraint:
$$\min_{w} \ \mathbb{E}\,|C_w(X)| \quad \text{subject to} \quad \mathbb{P}\big(Y \in C_w(X)\big) \ge 1 - \alpha,$$
where $C_w$ denotes the conformal prediction set generated with $s_w$ (Luo et al., 14 Jul 2024). Coverage is preserved marginally, and theoretical analysis establishes connections to subgraph classes in VC theory.
Empirical results on image classification demonstrate substantial set-size reductions for the same coverage, with the method outperforming all single-score conformal predictors.
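A grid-search sketch of the weight-selection step follows. This simplifies the constrained optimization described above to two scores and a validation-split size criterion; both simplifications are illustrative assumptions.

```python
import numpy as np

def conformal_threshold(cal_scores, alpha):
    """Finite-sample-corrected split-conformal score threshold."""
    n = len(cal_scores)
    level = min(np.ceil((1 - alpha) * (n + 1)) / n, 1.0)
    return np.quantile(cal_scores, level, method="higher")

def best_aggregation_weight(s1_cal, s2_cal, S1_val, S2_val, alpha, grid=21):
    """Pick w in [0, 1] so the aggregated score w*s1 + (1-w)*s2 yields the
    smallest average prediction set on a validation split.

    s1_cal, s2_cal: scores of the true labels on the calibration split.
    S1_val, S2_val: (m, K) score matrices over all K candidate labels on
    the validation split. Grid search stands in for the constrained
    optimization in the referenced work.
    """
    best_w, best_size = 0.0, np.inf
    for w in np.linspace(0.0, 1.0, grid):
        q = conformal_threshold(w * s1_cal + (1 - w) * s2_cal, alpha)
        avg_size = (w * S1_val + (1 - w) * S2_val <= q).sum(axis=1).mean()
        if avg_size < best_size:
            best_w, best_size = w, avg_size
    return best_w, best_size
```

When one score is informative and the other is noise, the search concentrates weight on the informative score and the average set shrinks.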
5. Theoretical Guarantees and Effective Sample Size
Weighted conformal prediction provides strong, minimally-assumptive finite-sample guarantees:
- Marginal coverage: When the appropriate weights are known or accurately estimated, coverage at the nominal level $1 - \alpha$ is attained.
- Conditional/structured coverage: For group-weighted or mask-conditional weighting, coverage bounds match or improve on classical conformal prediction, with precise finite-sample deficits characterized in terms of group sizes or estimation error (Bhattacharyya et al., 30 Jan 2024, Fan et al., 16 Dec 2025).
- Robustness to estimated weights: Miscoverage due to estimated weights is bounded by a function of the error between the estimated and oracle weight functions; the effect can be quantitatively evaluated (Bhattacharyya et al., 30 Jan 2024).
- Effective sample size: The variance of the weights influences the effective sample size $n_{\mathrm{eff}} = \big(\sum_i w_i\big)^2 / \sum_i w_i^2$, impacting prediction set width particularly under severe covariate shift or imbalanced groups (Tibshirani et al., 2019).
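The effective-sample-size diagnostic is a one-liner; the Kish formula below is a common choice, used here as an illustrative proxy.

```python
import numpy as np

def effective_sample_size(weights):
    """Kish effective sample size (sum w)^2 / sum w^2: equals n for
    uniform weights and collapses toward 1 as weight variance grows."""
    w = np.asarray(weights, dtype=float)
    return w.sum() ** 2 / np.sum(w ** 2)
```

A single dominant weight drives the effective sample size toward 1, which is why intervals widen under severe shift.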
6. Practical Considerations and Empirical Illustration
- Weight estimation: In practice, test-to-calibration likelihood ratios for covariate shift can be estimated by probabilistic classifiers trained to distinguish calibration from test covariates, e.g., logistic regression or random forests (Tibshirani et al., 2019). For group or mask settings, empirical group frequencies or mask-pattern classifiers are used.
- Computational cost: Split and weighted conformal prediction yield $O(n)$ per-query complexity after a single model fit, as only the $n$ calibration scores need be processed, in contrast to full conformal methods, which refit the model for each candidate response and can be $O(n^2)$ or higher (Tibshirani et al., 2019).
- Empirical performance: On both regression and classification tasks with covariate shift, group shift, imbalance or missingness, weighted procedures restore target coverage and often yield more informative (narrower) intervals or smaller prediction sets compared to standard split conformal, with efficiency scaling with the quality of weighting and sample size (Tibshirani et al., 2019, Fan et al., 16 Dec 2025, Bhattacharyya et al., 30 Jan 2024).
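The classifier-based weight-estimation step can be sketched with a from-scratch logistic regression. This is a minimal illustration assuming one-dimensional covariates; in practice an off-the-shelf probabilistic classifier would be used.

```python
import numpy as np

def estimate_weights(x_cal, x_test, iters=500, lr=0.1):
    """Estimate w(x) = p_test(x) / p_cal(x) with a from-scratch logistic
    regression distinguishing test (label 1) from calibration (label 0)
    covariates; assumes one-dimensional x for simplicity.
    By Bayes' rule, w(x) = (n_cal / n_test) * p(x) / (1 - p(x)).
    """
    X = np.concatenate([x_cal, x_test]).reshape(-1, 1)
    X = np.hstack([np.ones((len(X), 1)), X])          # intercept column
    y = np.concatenate([np.zeros(len(x_cal)), np.ones(len(x_test))])
    beta = np.zeros(2)
    for _ in range(iters):                            # gradient ascent on the log-likelihood
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        beta += lr * X.T @ (y - p) / len(y)
    p_cal = 1.0 / (1.0 + np.exp(-X[: len(x_cal)] @ beta))
    return (len(x_cal) / len(x_test)) * p_cal / (1.0 - p_cal)
```

When the test covariates are shifted upward, the estimated weights increase with $x$, up-weighting the calibration points that resemble the test distribution.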
The table below gives a summary of practical aspects:
| Scenario | Weight Type ($w$) | Coverage Guarantee |
|---|---|---|
| Covariate shift | $w(x) = d\tilde P_X(x)/dP_X(x)$ | marginal |
| Group shift | $w_k = \tilde q_k / q_k$ | marginal under group shift |
| Mask missingness | mask-dependent $w(m)$ | marginal and mask-conditional |
| Score aggregation (multi-class) | score weights $w_k$ | marginal |
7. Extensions and Connections
Weighted and split conformal prediction form the technical foundation for multiple recent developments in distribution-free predictive inference:
- Selective splitting and weighting for open-set and class-imbalanced problems, incorporating new label detection and Good-Turing estimation (Xie et al., 14 Oct 2025).
- Mask-conditional valid sets via reweighting and acceptance-rejection calibrated on imputed datasets for arbitrary missing data patterns (Fan et al., 16 Dec 2025).
- Localized, kernel-based conformal methods (split localized conformal prediction) delivering nearly conditional coverage through kernel weighted quantiles (Han et al., 2022).
- Group-weighted approaches yielding sharp finite-sample coverage in stratified or grouped data structures, outperforming generic density-ratio-based weighting when group membership drives the shift (Bhattacharyya et al., 30 Jan 2024).
These advances demonstrate the broad applicability of split and weighted conformal prediction, with theoretical and empirical support for nearly all modern predictive settings involving distribution shift, heterogeneity, imbalance, or missingness.