IPW-Z: Inverse-Probability-Weighted Z-Estimator

Updated 12 September 2025

The estimator integrates Horvitz–Thompson and Hájek bias corrections within prediction-powered inference to adjust for non-uniform label probabilities.
It employs a 1/ξ weighting strategy that ensures unbiased population mean estimates even when selection is covariate-dependent.
Simulation studies show that both IPW variants achieve nominal coverage and reduced variance compared to naive unweighted approaches.

The method uses an IPW version of the rectifier in which the bias-correction term is weighted by 1/ξᵢ. This connects the PPI approach directly to classical design-based estimators from survey sampling, which remain unbiased even when the chance of observation varies with observed covariates. The two most prominent forms of the IPW rectifier are the Horvitz–Thompson (HT) and Hájek adjustments, both reflected in the revised PPI estimator structure.

1. Integration of IPW into Prediction-Powered Inference

The fundamental estimator in prediction-powered inference (PPI) for a finite population mean $\theta^* = \frac{1}{N} \sum_{i=1}^N Y_i$ is

$\hat{\theta}_{\text{PPI}} = \frac{1}{N} \sum_{i=1}^N f(X_i) - \hat{\Delta}$

where $f(\cdot)$ is a regression or prediction model trained on partial labels, and $\hat{\Delta}$ estimates the mean generalization error using labeled data. Under informative labeling, with labeling probability ξᵢ, the bias-correction term becomes

$\hat{\Delta}_{HT} = \frac{1}{N} \sum_{i=1}^N \frac{R_i}{\xi_i} \left\{ f(X_i) - Y_i \right\}$

where $R_i$ indicates whether $Y_i$ is observed. The Hájek variant normalizes by the realized sum of weights instead of $N$:

$\hat{\Delta}_{Hajek} = \frac{ \sum_{i=1}^N \frac{R_i}{\xi_i} \left\{f(X_i) - Y_i\right\} }{ \sum_{i=1}^N \frac{R_i}{\xi_i} }$

With known inclusion probabilities, the HT rectifier keeps PPI design-unbiased, while the Hájek rectifier, being a ratio estimator, is approximately unbiased with a bias that vanishes as the labeled sample grows. When the probabilities are consistently estimated rather than known, both versions remain consistent, even under highly non-uniform and covariate-dependent sampling.
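The two rectifiers and the resulting PPI estimate can be sketched in a few lines of plain Python. This is a minimal illustration of the formulas above, not the authors' implementation; the function names and the four-unit toy data are assumptions made for the example.

```python
def rectifier_ht(preds, labels, observed, xi):
    """Horvitz-Thompson rectifier: (1/N) * sum_i (R_i / xi_i) * (f(X_i) - Y_i)."""
    total = 0.0
    for f_x, y, r, p in zip(preds, labels, observed, xi):
        if r:  # Y_i is only touched where it was actually observed
            total += (f_x - y) / p
    return total / len(preds)


def rectifier_hajek(preds, labels, observed, xi):
    """Hajek rectifier: same weighted sum, divided by the realized weight total."""
    num, den = 0.0, 0.0
    for f_x, y, r, p in zip(preds, labels, observed, xi):
        if r:
            num += (f_x - y) / p
            den += 1.0 / p
    return num / den


def ppi_mean(preds, rectifier):
    """PPI estimate: mean prediction over all N units minus the rectifier."""
    return sum(preds) / len(preds) - rectifier


# Hypothetical toy population of N = 4 units, two of them labeled.
preds = [1.0, 2.0, 3.0, 4.0]      # f(X_i) for every unit
labels = [1.5, None, 2.5, None]   # Y_i, missing where R_i = 0
observed = [1, 0, 1, 0]           # R_i
xi = [0.5, 0.5, 0.25, 0.25]       # labeling probabilities, assumed known

delta_ht = rectifier_ht(preds, labels, observed, xi)
delta_hajek = rectifier_hajek(preds, labels, observed, xi)
theta_ht = ppi_mean(preds, delta_ht)
theta_hajek = ppi_mean(preds, delta_hajek)
```

In this toy case the weighted error sum is 1.0, so the HT rectifier divides it by N = 4 while the Hájek rectifier divides it by the weight total 6; the two PPI estimates differ accordingly.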

2. Methodological Unification with Survey Sampling

The integration of IPW into PPI unifies classic design-based inference (Horvitz–Thompson and Hájek estimators) with modern prediction-driven approaches. In the presence of labeling mechanisms where the selection probability varies with X, the design-based correction using IPW is essential to restoring unbiasedness. The methodology employs either observed probabilities, if available, or estimated probabilities (e.g., via logistic regression, if the sampling model is known up to covariates). Both the HT and Hájek versions extend naturally to the PPI setting, directly paralleling their roles in finite-population survey sampling.
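When the labeling probabilities are not recorded, the logistic-regression route mentioned above might look like the following sketch. It fits P(R = 1 | X) with a single covariate by Newton-Raphson using only the standard library. The helper names, the one-dimensional covariate, and the simulated labeling mechanism (intercept -1, slope 1) are all assumptions for illustration; in practice a statistics library would normally be used.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic_1d(x, r, n_iter=25):
    """Fit P(R=1 | x) = sigmoid(a + b*x) by Newton-Raphson on the log-likelihood."""
    a, b = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = 0.0            # gradient w.r.t. (a, b)
        h00 = h01 = h11 = 0.0    # negative Hessian entries
        for xv, rv in zip(x, r):
            p = sigmoid(a + b * xv)
            w = p * (1.0 - p)
            g0 += rv - p
            g1 += (rv - p) * xv
            h00 += w
            h01 += w * xv
            h11 += w * xv * xv
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * step = gradient in closed form.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


# Simulated labeling indicators from an assumed mechanism.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(5000)]
r = [1 if rng.random() < sigmoid(-1.0 + xv) else 0 for xv in x]

a_hat, b_hat = fit_logistic_1d(x, r)
xi_hat = [sigmoid(a_hat + b_hat * xv) for xv in x]  # plug-in propensities
```

The fitted values `xi_hat` would then take the place of ξᵢ in either rectifier. The sketch assumes the logistic form is correctly specified, which is the condition the text attaches to estimated probabilities.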

This framework can be summarized as:

Estimator	Correction for Label Bias	Normalization
Horvitz–Thompson	$\frac{1}{N} \sum_{i} \frac{R_i}{\xi_i} (f(X_i)-Y_i)$	Denominator: $N$
Hájek	$\frac{\sum_{i} \frac{R_i}{\xi_i} (f(X_i)-Y_i)}{\sum_{i} \frac{R_i}{\xi_i}}$	Denominator: sum of weights

The normalization choice affects both variance and small-sample bias properties, in line with the classic distinctions between Horvitz–Thompson and Hãjek.
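One way to see how the two rows relate is to write the realized weight total as $\hat{N} = \sum_{i} \frac{R_i}{\xi_i}$ (a symbol introduced here for convenience, not taken from the source). Dividing the same weighted error sum by $\hat{N}$ instead of $N$ gives

$\hat{\Delta}_{Hajek} = \frac{N}{\hat{N}} \, \hat{\Delta}_{HT}$

With known $\xi_i$, $E[\hat{N}] = N$, so the factor $N/\hat{N}$ fluctuates around 1. The two rectifiers coincide whenever the realized weights happen to sum to $N$ and drift apart as the weights become more variable, which is where the variance and small-sample bias differences arise.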

3. Simulation Study Results

Simulations in the paper are conducted under realistic scenarios where labeling propensities ξᵢ depend on X, resulting in informative (non-MCAR) missingness. The key findings are:

Bias: The naïve PPI estimator (using the unweighted rectifier) is biased when label selection depends on X. Both the IPW-Horvitz–Thompson and IPW-Hájek corrections essentially eliminate this bias when inclusion probabilities are correctly specified or consistently estimated.
Variance: PPI estimators (both unweighted and IPW-corrected) have narrower 95% confidence intervals than fully design-based estimators when the labeled subset is small ($p_{\text{lab}}$ as low as 1–2%). Between the two IPW forms, the Hájek form typically achieves slightly tighter intervals, reflecting improved variance when sample sizes or effective weights are moderate.
Coverage: Both IPW-corrected estimators attain nominal confidence interval coverage, matching the performance seen when inclusion probabilities are known.
Estimation of Probabilities: When only estimated labeling probabilities are available (fit via correctly specified models), the mean squared error, coverage, and interval width of the IPW-corrected PPI estimator closely mirror the idealized case, supporting feasibility for real-world use.
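The bias finding can be reproduced qualitatively with a small simulation. The sketch below is not the paper's simulation design: the data-generating process, the deliberately miscalibrated predictor f(X) = X, and the labeling rule ξ = 0.02 + 0.3·X are all assumptions chosen so that prediction errors and labeling probabilities both depend on X.

```python
import random


def simulate(n_total=20000, seed=0):
    """Return the estimation error of naive, HT, and Hajek PPI on one synthetic population."""
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n_total)]
    y = [2.0 * xv + rng.gauss(0.0, 0.5) for xv in x]
    preds = list(x)                          # miscalibrated predictor f(X) = X
    prob = [0.02 + 0.3 * xv for xv in x]     # labeling probability rises with X
    obs = [rng.random() < p for p in prob]   # R_i

    theta_true = sum(y) / n_total            # finite-population mean
    mean_pred = sum(preds) / n_total

    # (prediction error, labeling probability) for labeled units only
    resid = [(f - yv, p) for f, yv, p, r in zip(preds, y, prob, obs) if r]

    delta_naive = sum(e for e, _ in resid) / len(resid)   # unweighted rectifier
    weighted_sum = sum(e / p for e, p in resid)
    delta_ht = weighted_sum / n_total
    delta_hajek = weighted_sum / sum(1.0 / p for _, p in resid)

    return {
        "naive": mean_pred - delta_naive - theta_true,
        "ht": mean_pred - delta_ht - theta_true,
        "hajek": mean_pred - delta_hajek - theta_true,
    }


errors = simulate()
for name, err in errors.items():
    print(f"{name:>6}: error = {err:+.4f}")
```

Under these assumptions the unweighted rectifier over-represents units with large X, where the predictor errs most, so the naive estimate is shifted away from the truth, while both weighted versions land close to it. A single run shows only one draw; coverage and interval-width comparisons like those reported above would need repeated draws and a variance estimator, which this sketch omits.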

4. Practical Implications and Recommendations

The use of IPW within the PPI framework allows for robust inference in domains with broad, heterogeneous data-collection rules—arising in survey sampling, medical research, remote sensing, and many big data contexts. When units are labeled non-uniformly (covariate-dependent or stratified label selection), classic model-free PPI cannot guarantee unbiasedness or correct coverage. By integrating IPW-based bias correction:

Researchers can validly combine model-based predictions over large unlabeled sets with efficient design-based corrections from small and potentially informative labeled samples.
Practitioners need only estimate inclusion probabilities (propensities) given observed features to enable IPW correction.

Key recommendations include:

Use the Horvitz–Thompson or Hájek IPW rectifiers for PPI when label probabilities are non-uniform.
Prefer the Hájek normalization for variance reduction, except in settings with a very small effective sample size, where small-sample bias may be of concern.
Estimate labeling probabilities by flexible methods (e.g., nonparametric or semi-parametric models) when the true mechanism is unknown.
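The recommendations refer to effective sample size without naming a diagnostic. One commonly used choice from survey sampling is Kish's effective sample size, $(\sum w_i)^2 / \sum w_i^2$ with $w_i = 1/\xi_i$ over labeled units. Using it here is an assumption of this sketch, not something the source prescribes, and the function name is hypothetical.

```python
def kish_effective_sample_size(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    s1 = sum(weights)
    s2 = sum(w * w for w in weights)
    return s1 * s1 / s2


# Equal weights: the effective size equals the number of labeled units.
ess_equal = kish_effective_sample_size([2.0, 2.0, 2.0, 2.0])

# One dominant weight (a unit with a very small labeling probability)
# pulls the effective size far below the nominal count of 4.
ess_skewed = kish_effective_sample_size([1.0, 1.0, 1.0, 10.0])
```

A value far below the labeled count signals that a few units with small ξᵢ dominate the rectifier, which is the situation in which the choice between the two normalizations deserves the most care.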

5. Limitations, Future Directions, and Open Problems

There are several caveats and directions for further research:

Model Assumptions: Correct specification of propensity models is essential for unbiasedness—if the inclusion probabilities are misspecified, IPW-corrected PPI estimators may retain bias.
Missing Not at Random (MNAR): The approach requires a Missing at Random (MAR) assumption, meaning the labeling indicator R is independent of the outcome Y given X. If the labels are MNAR, additional modeling would be required.
Bias–Variance Tradeoff: Hájek normalization introduces non-negligible bias in finite samples with rare or highly non-uniform observations, which may require cautious application in extreme settings. Adaptive, data-driven strategies to select between rectifier forms could be an area for methodological development.
Extensions: Extensions to more complex estimands, such as quantiles or functionals beyond the mean, require further investigation. An important future direction is the development of cross-fitting and binning/smoothing strategies for estimating propensities in high dimensions.
Complex Designs: Cases with time-dependent or hierarchical label mechanisms have not been fully addressed; adapting the current framework to complex design settings remains an open question.

Conclusion

By uniting prediction-powered inference with robust, design-based IPW corrections (Horvitz–Thompson and Hájek), the methodology supports unbiased and efficient estimation of population-level functionals under informative, covariate-dependent labeling mechanisms that satisfy the MAR assumption. IPW-corrected PPI estimators match the nominal coverage and variance properties seen when probabilities are known, and simulation evidence confirms empirical validity even when propensities must be estimated. This framework provides a rigorous, practical pathway for high-precision, partially supervised inference in heterogeneous, big data environments characterized by non-uniform sampling and complex label missingness (Datta et al., 13 Aug 2025).

References (1)

Prediction-Powered Inference with Inverse Probability Weighting (2025)
