
Reweighted Conformal Prediction Procedure

Updated 17 December 2025
  • The paper introduces a method that guarantees both marginal and mask-conditional coverage by adjusting prediction intervals after imputation.
  • It employs a preimpute–mask–then–correct pipeline using importance weighting or acceptance-rejection to handle various missing data mechanisms like MCAR, MAR, and MNAR.
  • Empirical evaluations demonstrate up to a 30% reduction in prediction interval width while maintaining reliable coverage across different datasets.

A reweighted conformal prediction procedure is a statistical method for reliable uncertainty quantification when features (covariates) in the dataset are subject to missingness. The challenge it addresses is that classical conformal prediction (CP) does not guarantee valid coverage in the presence of missing covariates, especially when coverage must be controlled not only marginally but also for each possible missingness pattern (known as mask-conditional validity, MCV). The reweighted conformal approach provides both marginal and mask-conditional coverage guarantees by correcting prediction intervals after imputation, via importance reweighting or acceptance-rejection schemes. This keeps the method compatible with standard distributional imputation pipelines while delivering sharper, adaptively valid prediction sets even under complex missing data mechanisms, such as MCAR, MAR, or MNAR (Fan et al., 16 Dec 2025).

1. Problem Formulation and Background

Consider a supervised regression or classification framework where the data-generating process produces i.i.d. triples $(X, M, Y)$, with $X\in\mathcal{X}$ the covariate vector of dimension $d$, $M\in\{0,1\}^d$ a mask indicating missingness ($m_i=1$ if $x_i$ is missing, 0 otherwise), and $Y\in\mathcal{Y}$ the response or label. The observed features are $\tilde X = \mathrm{mask}(X, M)$, with unobserved entries replaced by NA. In this context, conformal prediction aims to construct a prediction set $\hat C_\alpha(\tilde X)$ such that, for new data,

$$P\big(Y \in \hat C_\alpha(\tilde X)\big) \geq 1-\alpha \quad \text{(MV)}, \qquad P\big(Y \in \hat C_\alpha(\tilde X) \mid M = m \big) \geq 1 - \alpha \quad \text{(MCV)}$$

for all masks $m$ with nonzero probability mass. Standard split conformal prediction furnishes only marginal validity (MV) under exchangeability; mask-conditional validity (MCV) is generally not guaranteed, especially under heteroskedastic or non-random missing patterns (Fan et al., 16 Dec 2025).
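To make the gap between the two coverage notions concrete, the following minimal sketch computes empirical marginal and per-mask coverage from a list of test outcomes. The function name `coverage_by_mask` and the toy data are illustrative assumptions, not part of the cited work.

```python
from collections import defaultdict

def coverage_by_mask(masks, covered):
    """Return (marginal coverage, {mask: conditional coverage}).

    masks   : list of 0/1 tuples (1 = feature missing), one per test point
    covered : list of bools, True if the true response fell in the prediction set
    """
    hits = defaultdict(int)
    counts = defaultdict(int)
    for m, c in zip(masks, covered):
        counts[m] += 1
        hits[m] += int(c)
    marginal = sum(hits.values()) / len(masks)
    per_mask = {m: hits[m] / counts[m] for m in counts}
    return marginal, per_mask

# Toy data (assumed): marginal coverage meets a 90% target,
# yet the rarer mask (1, 0) is covered only half of the time.
masks = [(0, 0)] * 8 + [(1, 0)] * 2
covered = [True] * 8 + [True, False]
marginal, per_mask = coverage_by_mask(masks, covered)
```

In this toy case the marginal figure hides the shortfall on the rarer mask, which is the failure mode that MCV rules out.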

2. The Preimpute–Mask–Then–Correct Framework

To handle missing covariates, the framework consists of the following pipeline:

  1. Preimpute: Each calibration sample is subjected to distributional imputation via any probabilistic imputation mechanism, yielding an imputed covariate vector consistent with the sample's observed entries.
  2. Mask: For a test-time missingness pattern (mask) $m$, each imputed calibration instance is re-masked so that only the features indicated as observed by $m$ are retained.
  3. Correct: A correction step is applied to address the distributional shift between the imputed-masked calibration distribution and the true mask-conditional law (the distribution of the data given $M = m$). This is achieved either by reweighting calibration points via the likelihood ratio between these two distributions, or by acceptance-rejection sampling to simulate draws from the true mask-conditional law.

A schematic table of the high-level steps:

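| Step | Description | Output |
|---|---|---|
| Preimpute | Draw an imputed covariate vector for each calibration sample | Imputed calibration set |
| Mask | Apply the test-time mask $m$ to each imputed calibration sample | Imputed-masked calibration set |
| Correct | Apply weighting or rejection sampling for coverage | Adjusted calibration scores |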

This method leverages the calibration regime of split CP and is agnostic to the specific imputation mechanism.
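The first two steps of the pipeline can be sketched in a few lines. The example below is a toy illustration only: `hot_deck_impute` is a hypothetical stand-in for a real distributional imputer (it draws each missing entry from the observed values in the same column), and the correction step is deliberately left out.

```python
import random

NA = None  # placeholder for a missing entry

def apply_mask(x, m):
    """Hide the entries of x flagged as missing (m[j] == 1)."""
    return [NA if mj == 1 else xj for xj, mj in zip(x, m)]

def hot_deck_impute(rows, rng):
    """Toy distributional imputer (assumption): fill each missing entry
    with a random draw from the observed values of the same column."""
    d = len(rows[0])
    observed = [[r[j] for r in rows if r[j] is not NA] for j in range(d)]
    return [
        [rng.choice(observed[j]) if r[j] is NA else r[j] for j in range(d)]
        for r in rows
    ]

def preimpute_then_mask(calib_rows, test_mask, rng):
    """Steps 1 and 2: impute the calibration set once, then re-mask
    every imputed row with the test-time missingness pattern."""
    imputed = hot_deck_impute(calib_rows, rng)
    return [apply_mask(r, test_mask) for r in imputed]

rng = random.Random(0)
calib = [[1.0, NA, 3.0], [NA, 2.0, 1.5], [0.5, 4.0, NA]]
masked = preimpute_then_mask(calib, test_mask=(0, 1, 0), rng=rng)
```

After this call every calibration row shares the test point's missingness pattern, so a single model fitted for that pattern can score all of them; the remaining distribution shift is what the correction step addresses.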

3. Weighted and Acceptance–Rejection Correction Methods

3.1 Weighted Conformal Prediction

Under the assumption that the true mask-conditional law is absolutely continuous with respect to the imputed-masked calibration distribution, compute the likelihood ratio (importance weight) of the former with respect to the latter for each calibration instance and test candidate. The normalized conformal weights for prediction set construction are obtained by dividing each likelihood ratio by the sum of the ratios over the calibration instances and the test candidate.

Prediction sets are computed by evaluating the $(1-\alpha)$ quantile over the weighted empirical distribution of nonconformity scores: the set contains every candidate response whose nonconformity score does not exceed this weighted quantile (Fan et al., 16 Dec 2025).
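The weighted quantile step can be sketched as follows, using the standard weighted split-conformal construction in which the test point's own weight is placed at $+\infty$. This is a minimal sketch under stated assumptions: the likelihood ratios are taken as given, the nonconformity score is the absolute residual, and the test weight does not depend on the candidate response. The function names are illustrative and not taken from the cited work.

```python
import math

def weighted_quantile(scores, weights, test_weight, level):
    """Smallest calibration score q whose cumulative normalized weight
    reaches `level`; the test point's weight is treated as mass at +inf."""
    total = sum(weights) + test_weight
    cum = 0.0
    for s, w in sorted(zip(scores, weights)):
        cum += w / total
        if cum >= level:
            return s
    return math.inf  # calibration mass alone never reaches the level

def weighted_cp_interval(pred, scores, weights, test_weight, alpha):
    """Interval pred +/- q for an absolute-residual score (assumption)."""
    q = weighted_quantile(scores, weights, test_weight, 1.0 - alpha)
    return pred - q, pred + q

scores = [1.0, 2.0, 3.0, 4.0]    # calibration nonconformity scores
weights = [3.0, 1.0, 1.0, 1.0]   # unnormalized likelihood ratios
interval = weighted_cp_interval(10.0, scores, weights, test_weight=1.0, alpha=0.3)
```

Upweighting the calibration point with the smallest score pulls the quantile down from 4.0 (uniform weights) to 3.0 in this toy case, which shows how the weights reshape the calibration distribution.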

3.2 Acceptance–Rejection Corrected CP

If the likelihood ratio is bounded, perform acceptance-rejection: draw an independent uniform random variable on $[0, 1]$ for each calibration point, and accept the point if the draw does not exceed its likelihood ratio divided by an upper bound on the ratio. The accepted subset follows the target mask-conditional law after masking. Split CP is then run as usual on this subset (Fan et al., 16 Dec 2025).
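A minimal sketch of the acceptance-rejection correction followed by ordinary split CP is given below. It assumes the likelihood ratios and a valid upper bound are already available; `accept_reject` and `split_cp_quantile` are hypothetical helper names, and the strict inequality is an implementation choice so that a zero ratio is never accepted.

```python
import math
import random

def accept_reject(scores, ratios, bound, rng):
    """Keep calibration score i with probability ratios[i] / bound."""
    if max(ratios) > bound:
        raise ValueError("bound must dominate every likelihood ratio")
    return [s for s, w in zip(scores, ratios) if rng.random() < w / bound]

def split_cp_quantile(scores, alpha):
    """Standard split-CP threshold: the ceil((n + 1)(1 - alpha))-th
    smallest score, or +inf when too few scores are available."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    return math.inf if k > n else sorted(scores)[k - 1]

rng = random.Random(0)
scores = [1.0, 2.0, 3.0, 4.0]
ratios = [2.0, 0.0, 2.0, 0.0]  # toy ratios (assumed), bounded by 2.0
accepted = accept_reject(scores, ratios, bound=2.0, rng=rng)
threshold = split_cp_quantile([4.0, 1.0, 3.0, 2.0], alpha=0.5)
```

A larger bound lowers every acceptance probability, so the accepted subset shrinks and the split-CP threshold can become infinite when too few points survive.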

4. Theoretical Validity and Robustness Results

Exact Coverage Guarantees

  • Weighted CP: Under absolute continuity and exchangeability, the weighted procedure provides exact mask-conditional validity: $P\big(Y \in \hat C_\alpha(\tilde X) \mid M = m\big) \geq 1 - \alpha$.
  • ARC CP: Provided the likelihood ratio is bounded, acceptance-rejection creates a calibration set that is i.i.d. under the true mask-conditional law; split CP then ensures the same mask-conditional coverage guarantee (Fan et al., 16 Dec 2025).

Effect of Imperfect Weight Estimation

If only an estimated likelihood ratio (after normalization) is available, coverage is controlled up to a total variation penalty: the mask-conditional coverage can fall below $1-\alpha$ by at most a term governed by the total variation distance between the estimated and true mask-conditional laws. The practical implication is that weight estimation should be accurate enough to keep this distance, and hence the empirical miscoverage, acceptably small (Fan et al., 16 Dec 2025).

Necessity of Correction

Empirical ablation shows that omitting the correction step ("impute–mask–split" only) leads to under-coverage (e.g., worst-case mask-conditional coverage of roughly 89% when targeting 90%), while reweighting or ARC restores validity.

5. Empirical Evaluation and Performance

Experiments span both synthetic and real tabular datasets, evaluating the procedures under MCAR, MAR, and MNAR mechanisms. Key findings:

  • Synthetic: Under 50% MCAR, both weighted CP and ARC CP attain the nominal 90% mask-conditional coverage. They reduce average interval width by roughly 30% compared to the MDA-Nested baseline, which is substantially more conservative.
  • Real-world: On datasets such as UCI Concrete (8 features), Bike-sharing, and MEPS19, both procedures preserve the desired coverage per mask and shrink prediction intervals relative to conservative alternatives.
  • ARC vs. Weighted: ARC CP is computationally fast, reduces width further on some tasks, and does not break coverage when used with a slightly inflated bound on the likelihood ratio (the acceptance rate should stay high enough to retain an adequate calibration sample).

A summary comparison of selected methods:

| Method | Mask-Conditional Valid | Average Width Reduction vs. Baseline |
|---|---|---|
| Weighted CP | Yes | About 10–30% |
| ARC CP | Yes | Largest |
| MDA-Nested | Conservative | None |
| Naive Split | No | Narrower but under-covers |

6. Implementation and Practical Guidance

Integration into standard supervised pipelines is streamlined:

  • Any off-the-shelf distributional imputer (e.g., MICE, Bayesian Ridge) fills the calibration set a single time.
  • The test instance is not imputed; instead, the imputed calibration set is masked to match the test-time missingness.
  • Correction is applied post hoc: either weighted CP or ARC CP is used to calibrate prediction sets from any base model.
  • For weighted CP, a coarse search grid over candidate response values is recommended, with spacing set relative to the response range.
  • For ARC CP, the bound should be only slightly larger than the estimated maximum likelihood ratio, so that enough calibration points are accepted.
  • For likelihood ratio estimation, balanced classifier approaches (e.g., histogram-GBDT, logistic regression) with enough calibration data yield robust results; extremely poor weight estimation degrades coverage reliability (Fan et al., 16 Dec 2025).
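The classifier-based weight estimation mentioned in the last item can be illustrated with the standard density-ratio trick: train a probabilistic classifier to separate source samples (label 0) from target samples (label 1), then convert its odds into a ratio. The sketch below is a toy one-dimensional version with a hand-written logistic regression; the Gaussian samples, function names, and hyperparameters are all assumptions chosen for illustration, and a real pipeline would use a library classifier on the actual imputed-masked and mask-conditional samples.

```python
import math
import random

def fit_logistic_1d(xs, ys, lr=0.5, steps=500):
    """Gradient descent on the mean logistic loss for
    p(y = 1 | x) = sigmoid(a * x + b)."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        ga = gb = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a * x + b)))
            ga += (p - y) * x
            gb += p - y
        a -= lr * ga / n
        b -= lr * gb / n
    return a, b

def density_ratio(x, a, b, n_source, n_target):
    """Estimated target/source density ratio at x: classifier odds
    exp(a * x + b), corrected for the class proportions."""
    return math.exp(a * x + b) * n_source / n_target

rng = random.Random(0)
source = [rng.gauss(0.0, 1.0) for _ in range(200)]  # stand-in: calibration law
target = [rng.gauss(1.0, 1.0) for _ in range(200)]  # stand-in: target law
a, b = fit_logistic_1d(source + target, [0] * 200 + [1] * 200)
ratio_high = density_ratio(2.0, a, b, len(source), len(target))
ratio_low = density_ratio(-1.0, a, b, len(source), len(target))
```

Using equally sized source and target samples keeps the classifier balanced, so the class-proportion correction equals one and the estimated ratio is simply the fitted odds.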

7. Limitations and Prospective Directions

The primary limitations include the requirement for accurate estimation (or boundedness) of the likelihood ratio; grossly inaccurate estimators or extreme unboundedness can break coverage or reduce acceptance rates in ARC CP. Weighted CP quantile searches may be computationally expensive for fine grids or high-dimensional responses. Potential future extensions involve adaptive tuning of the acceptance-rejection bound for an improved efficiency/width tradeoff in ARC, online masking for streaming data, robustification for imperfect weights, and generalization to classification or structured-output tasks with missing entries (Fan et al., 16 Dec 2025).
