Double Robustness in Semiparametric Methods
- Double robustness is a property ensuring that an estimator remains consistent if at least one of the two nuisance functions, such as an outcome regression or a propensity score, is correctly specified.
- It underpins efficiency in semiparametric models by leveraging influence-function orthogonality and convexity conditions to mitigate bias from model misspecification.
- Applications include causal inference, missing data analysis, survival analysis, and high-dimensional estimation, demonstrating its practical utility across econometrics and related fields.
Double robustness is a key structural property in modern semiparametric estimation, with applications in causal inference, missing data, high-dimensional settings, and econometrics. It underpins the robustness and efficiency of a broad class of estimators by ensuring consistency whenever at least one of two candidate nuisance functions is correctly specified or consistently estimated. In many models it is closely tied to influence-function orthogonality and to information-geometric properties of the statistical model.
1. Definition and Formalism
Let $O = (Y, X)$ be the observed data, with $Y$ the outcome and $X$ a vector of covariates. The target parameter, denoted $\theta_0$, is typically a functional of the observed law, such as an average treatment effect or a mean under missing data. In the prototypical setup, there exist two nuisance functions: $\mu$ (often an outcome regression) and $\pi$ (often a propensity score or exposure mechanism).
Definition. An estimator (or, more precisely, an estimating function $\varphi(O; \theta, \mu, \pi)$) is doubly robust if for any law $P$, with true values $\theta_0$, $\mu_0$, $\pi_0$ under $P$, and any choice of $\mu$ or $\pi$,
$$E_P[\varphi(O; \theta_0, \mu, \pi_0)] = E_P[\varphi(O; \theta_0, \mu_0, \pi)] = 0,$$
so that an estimator based on $\varphi$ is consistent for $\theta_0$ if either $\mu$ or $\pi$ is correctly specified, regardless of the other (Ying, 2024).
Canonical instances include the Augmented Inverse Probability Weighted (AIPW) estimator for treatment effects, calibration estimators for compliers in IV models, and locally projected estimators in dynamic econometrics.
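The AIPW construction can be illustrated with a small simulation. The sketch below is not taken from any of the cited papers; it assumes a toy missing-data design (binary covariate `x`, response indicator `a`, outcome `y` with true mean 2), and all function names are hypothetical. Stratum means stand in for a correctly specified nuisance model, while pooled means that ignore `x` stand in for a misspecified one.

```python
import random


def mean(values):
    return sum(values) / len(values)


def simulate(n, seed):
    """Toy data (an assumption for illustration): the true mean of y is 1 + 2 * 0.5 = 2."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        # Response probability depends on x, so the complete cases are not representative.
        a = 1 if rng.random() < (0.8 if x == 1 else 0.2) else 0
        y = 1.0 + 2.0 * x + rng.gauss(0.0, 1.0)  # only used below when a == 1
        data.append((x, a, y))
    return data


def fit_outcome(data, use_x):
    """Mean of y among responders: per stratum of x (correct) or pooled (misspecified)."""
    if use_x:
        return {x: mean([y for xi, a, y in data if a == 1 and xi == x]) for x in (0, 1)}
    pooled = mean([y for _, a, y in data if a == 1])
    return {0: pooled, 1: pooled}


def fit_propensity(data, use_x):
    """Response probability: per stratum of x (correct) or pooled (misspecified)."""
    if use_x:
        return {x: mean([a for xi, a, _ in data if xi == x]) for x in (0, 1)}
    pooled = mean([a for _, a, _ in data])
    return {0: pooled, 1: pooled}


def aipw(data, mu, pi):
    """AIPW estimate of E[y]: outcome prediction plus inverse-weighted residual."""
    return mean([mu[x] + a * (y - mu[x]) / pi[x] for x, a, y in data])


data = simulate(20000, seed=1)
for mu_ok in (True, False):
    for pi_ok in (True, False):
        est = aipw(data, fit_outcome(data, mu_ok), fit_propensity(data, pi_ok))
        print(f"outcome correct={mu_ok}, propensity correct={pi_ok}: {est:.3f}")
```

In this design the three runs with at least one correct nuisance model should land near the true value 2, while the run with both models misspecified collapses to the responder mean, whose large-sample limit is about 2.6.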
2. Semiparametric Theory and Influence-Function Perspective
In semiparametric models, double robustness often arises from influence-function orthogonality. A canonical result is that the influence curve for a pathwise differentiable functional is orthogonal to the nuisance tangent space, ensuring insensitivity to infinitesimal perturbations of these nuisance components.
Under convexity of the relevant “contour sets” (sets of models with fixed target/nuisance values), the influence function is itself doubly robust “for free” (Ying, 2024). This means estimators constructed using the canonical influence function enjoy double robustness without further adjustment in a large class of models, such as partially linear regression, missing data with MAR, and standard causal inference scenarios.
3. Asymptotic Theory and Rate Double Robustness
The classical result for estimators built with plug-in nuisance fits $\hat\mu$, $\hat\pi$ is that $\hat\theta$ is asymptotically normal and root-$n$ consistent so long as the product of the nuisance estimation rates is $o_P(n^{-1/2})$ (Shu et al., 2018, Sandqvist, 2024). Under this "rate double robustness", the condition
$$\|\hat\mu - \mu_0\| \cdot \|\hat\pi - \pi_0\| = o_P(n^{-1/2})$$
implies that the second-order remainder vanishes and the empirical process term dominates. Here $\mu_0$ and $\pi_0$ denote the true nuisance functions and $\|\cdot\|$ is the $L_2$ norm. For Z-estimation with orthogonal moment equations (Neyman orthogonality), the required rate on the nuisance functions can be relaxed: as shown in (Lok, 2024), if the nuisance parameters are estimated at rate $n^{1/4}$ or faster, the plug-in sandwich variance estimator for $\hat\theta$ is consistent and the limiting law of $\hat\theta$ is unaffected by the nuisance uncertainty.
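A short worked derivation may help show where the product condition comes from. It is a sketch under assumptions not stated in the source: the target is a mean $\theta_0 = E[Y]$ under missingness at random, $A$ is a response indicator, $\pi_0(X) = P(A = 1 \mid X)$, $\mu_0(X) = E[Y \mid A = 1, X]$, and $\varphi$ is the uncentered AIPW score evaluated at arbitrary fixed $\mu$ and $\pi$ with $\pi(X) \ge \varepsilon > 0$.

```latex
\begin{aligned}
\varphi(O;\mu,\pi) &= \mu(X) + \frac{A\,\{Y-\mu(X)\}}{\pi(X)}, \\
E[\varphi(O;\mu,\pi)] - \theta_0
  &= E\!\left[\mu(X) - \mu_0(X)
     + \frac{\pi_0(X)\,\{\mu_0(X)-\mu(X)\}}{\pi(X)}\right] \\
  &= E\!\left[\frac{\{\mu(X)-\mu_0(X)\}\,\{\pi(X)-\pi_0(X)\}}{\pi(X)}\right], \\
\bigl|E[\varphi(O;\mu,\pi)] - \theta_0\bigr|
  &\le \varepsilon^{-1}\,\|\mu-\mu_0\|_{L_2}\,\|\pi-\pi_0\|_{L_2}.
\end{aligned}
```

The second line uses $E[AY \mid X] = \pi_0(X)\,\mu_0(X)$ and $\theta_0 = E[\mu_0(X)]$, and the last line is the Cauchy-Schwarz inequality. The bias is exactly zero when either $\mu = \mu_0$ or $\pi = \pi_0$, and it is of second order otherwise, which is why a product rate of $o_P(n^{-1/2})$ suffices.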
4. Classes of Double Robust Estimators
Point Estimation Examples
| Model/Target | DR Estimator | Required Correct Models |
|---|---|---|
| ATT/ATE (causal inference) | AIPW/TMLE | Propensity score or outcome model |
| Complier average characteristics | DR moment w/ $\kappa$ weight | Weight or regression function |
| Location/scale w/ MAR (missing data) | AIPW + robustification | PS or outcome regression |
| Survival w/ censoring | DRCUT pseudo-outcomes | Censoring or regression hazard |
| Local Projections (IRF, time series) | Direct LP estimator | PL regression or "shock" model |
Structural Property: For these, the estimator is consistent (and typically regular asymptotically linear) if either of the two models/nuisance estimators is correctly specified (or estimated sufficiently well), not necessarily both (Shu et al., 2018, Singh et al., 2019, Cantoni et al., 2018, Sandqvist, 2024, Olea et al., 2024).
Sequential Double Robustness (SDR)
In longitudinal data (e.g., longitudinal G-computation), sequential double robustness (SDR) holds when consistency is guaranteed provided that, at each time point $t$ of a multistage process, either the outcome regression or the treatment model at time $t$ is correctly specified. This generalizes standard DR by allowing the correctly specified model to differ across time points (Luedtke et al., 2017).
5. Robustness, Limitations, and Fragility
Double Robustness vs. Double Fragility
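While DR estimators offer protection under partial misspecification, when both nuisance models are incorrect the estimator's error can be amplified, with bias of the order of the product of the two nuisance errors. This phenomenon is termed “double fragility” (Testa et al., 26 Sep 2025):
$$\mathrm{Bias}(\hat\theta) = O\big(\|\hat\mu - \mu_0\| \cdot \|\hat\pi - \pi_0\|\big),$$
where $\hat\mu$ and $\hat\pi$ are the fitted outcome regression and propensity score, and $\mu_0$ and $\pi_0$ are their true counterparts. This product can dominate the estimator's error if both working models are poorly fitted.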
Adaptive Correction Clipping (ACC) methods have been proposed to address this, ensuring point estimates cannot be worse than the worse of the individual outcome regression or IPW estimators, thereby achieving what the authors term "double safety" (Testa et al., 26 Sep 2025).
Variance Estimation
Classical variance estimators such as the influence-function (IF) plug-in estimator are not doubly robust: IF-based variance estimation is valid only if both working models are correct. Empirical sandwich estimators and nonparametric bootstrap methods instead rely on the unbiasedness of the stacked estimating equations, and therefore retain double robustness for variance estimation (Shook-Sa et al., 2024).
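The bootstrap route can be sketched in a few lines. The example below is an illustration only, not the procedure of Shook-Sa et al.: it assumes a toy missing-data design with true mean 2, uses hypothetical function names, and deliberately pairs a misspecified (pooled) outcome model with a correctly specified (stratified) propensity model. Both nuisance fits are recomputed inside every bootstrap replicate.

```python
import random


def mean(values):
    return sum(values) / len(values)


def simulate(n, rng):
    """Toy data (an assumption for illustration): the true mean of y is 2."""
    data = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        a = 1 if rng.random() < (0.8 if x == 1 else 0.2) else 0
        y = 1.0 + 2.0 * x + rng.gauss(0.0, 1.0)  # only used when a == 1
        data.append((x, a, y))
    return data


def dr_estimate(sample):
    """AIPW mean with a pooled (misspecified) outcome model and a stratified propensity."""
    pooled_mu = mean([y for _, a, y in sample if a == 1])
    pi = {x: mean([a for xi, a, _ in sample if xi == x]) for x in (0, 1)}
    return mean([pooled_mu + a * (y - pooled_mu) / pi[x] for x, a, y in sample])


def bootstrap_se(data, estimator, n_boot, rng):
    """Nonparametric bootstrap standard error: resample rows, refit, re-estimate."""
    n = len(data)
    reps = []
    for _ in range(n_boot):
        resample = [data[rng.randrange(n)] for _ in range(n)]
        reps.append(estimator(resample))  # nuisance models are refit inside estimator
    center = mean(reps)
    return (sum((r - center) ** 2 for r in reps) / (n_boot - 1)) ** 0.5


rng = random.Random(0)
data = simulate(2000, rng)
theta_hat = dr_estimate(data)
se_hat = bootstrap_se(data, dr_estimate, n_boot=200, rng=rng)
print(f"estimate = {theta_hat:.3f}, bootstrap SE = {se_hat:.3f}")
```

Refitting the nuisance models within each replicate is the design choice that matters here: it lets the replicate-to-replicate spread reflect nuisance estimation error, which a plug-in formula that treats the fitted models as known would miss.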
6. Extensions and Practical Examples
High-dimensional and Machine Learning Nuisance Estimation
Sample splitting and cross-fitting enable the use of complex machine learning models for nuisance components while preserving DR properties, provided the product-rate conditions are met (Sandqvist, 2024, Shu et al., 2018, Liu et al., 2023). Even in high-dimensional settings, plug-in DML-type estimators retain root-$n$ inference under appropriate sparsity and rate assumptions.
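A minimal cross-fitting loop might look like the following. This is a sketch under stated assumptions rather than any cited paper's algorithm: the data come from a toy missing-data design with true mean 2, the function names are hypothetical, and simple stratum means stand in for the machine learning learners that would be used in practice.

```python
import random


def mean(values):
    return sum(values) / len(values)


def simulate(n, seed):
    """Toy data (an assumption for illustration): the true mean of y is 2."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        a = 1 if rng.random() < (0.8 if x == 1 else 0.2) else 0
        y = 1.0 + 2.0 * x + rng.gauss(0.0, 1.0)  # only used when a == 1
        data.append((x, a, y))
    return data


def fit_nuisances(train):
    """Placeholder learners: stratum means for the outcome and the response probability."""
    mu = {x: mean([y for xi, a, y in train if a == 1 and xi == x]) for x in (0, 1)}
    pi = {x: mean([a for xi, a, _ in train if xi == x]) for x in (0, 1)}
    return mu, pi


def cross_fit_aipw(data, n_folds, seed):
    """Fit nuisances on the other folds, score the held-out fold, then average all scores."""
    rng = random.Random(seed)
    idx = list(range(len(data)))
    rng.shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [data[i] for i in idx if i not in held_out]
        mu, pi = fit_nuisances(train)  # never sees the held-out rows
        for i in fold:
            x, a, y = data[i]
            scores.append(mu[x] + a * (y - mu[x]) / pi[x])
    return mean(scores)


data = simulate(10000, seed=2)
theta_cf = cross_fit_aipw(data, n_folds=5, seed=3)
print(f"cross-fitted AIPW estimate = {theta_cf:.3f}")
```

Each observation is scored with nuisance fits that did not use it, which is what removes the own-observation overfitting bias that flexible learners would otherwise introduce.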
External Controls and Attaching Data Sources
Naive incorporation of external control samples into doubly robust ATT estimators can paradoxically degrade efficiency under single-model misspecification. A “double-safe” estimator optimally combines the standard (trial-only) and external-control estimators to ensure no efficiency loss relative to the best available approach in each scenario, while preserving DR (Dai et al., 24 Sep 2025).
Model Geometry and Parameterization
Information geometry and semiparametric theory provide necessary and sufficient conditions for the existence of DR model structures (Ying, 2024). Variation-independence and convexity (or m-flatness) of the “contour sets” are central: in models where contours are not convex (e.g., certain odds ratio models under canonical parameterization), true DR estimators do not exist.
7. Applications and Illustrations
Representative applications illustrating DR include:
- Causal inference for ATE/ATT in the presence of confounding (Shu et al., 2018, Dai et al., 24 Sep 2025).
- Inference for complier populations in instrumental-variable analysis (Singh et al., 2019).
- Average outcome/location/scale estimation with missing data and MAR (Cantoni et al., 2018).
- Rate double robustness of DML and flexible ML-driven approaches (Sandqvist, 2024, Liu et al., 2023).
- DR inference for conditional means under coarsening at random censoring (Sandqvist, 2024).
- Double-robust hypothesis testing in over-identified GMM (Kleibergen et al., 2021).
- LP-based inference in time-series settings with unmodeled serial correlation (Olea et al., 2024).
References
- "A Geometric Perspective on Double Robustness by Semiparametric Theory and Information Geometry" (Ying, 2024)
- "Improved Estimation of Average Treatment Effects on the Treated: Local Efficiency, Double Robustness, and Beyond" (Shu et al., 2018)
- "Double Robustness of Local Projections and Some Unpleasant VARithmetic" (Olea et al., 2024)
- "Sequential Double Robustness in Right-Censored Longitudinal Models" (Luedtke et al., 2017)
- "Doubly robust inference with censoring unbiased transformations" (Sandqvist, 2024)
- "Double Robust Variance Estimation with Parametric Working Models" (Shook-Sa et al., 2024)
- "Double Robustness for Complier Parameters and a Semiparametric Test for Complier Characteristics" (Singh et al., 2019)
- "Robust semiparametric inference with missing data" (Cantoni et al., 2018)
- "Rescuing double robustness: safe estimation under complete misspecification" (Testa et al., 26 Sep 2025)
- "Incorporating External Controls for Estimating the Average Treatment Effect on the Treated with High-Dimensional Data: Retaining Double Robustness and Ensuring Double Safety" (Dai et al., 24 Sep 2025)
- "Demystified: double robustness with nuisance parameters estimated at rate n-to-the-1/4" (Lok, 2024)
- "Double robust inference for continuous updating GMM" (Kleibergen et al., 2021)