
Double Robustness in Semiparametric Methods

Updated 30 March 2026
  • Double robustness is a property ensuring that an estimator remains consistent if at least one of the two nuisance functions, such as an outcome regression or a propensity score, is correctly specified.
  • It underpins efficiency in semiparametric models by leveraging influence-function orthogonality and convexity conditions to mitigate bias from model misspecification.
  • Applications include causal inference, missing data analysis, survival analysis, and high-dimensional estimation, demonstrating its practical utility across econometrics and related fields.

Double robustness is a key structural property in modern semiparametric estimation, with applications in causal inference, missing data, high-dimensional settings, and econometrics. It underpins the robustness and efficiency of a broad class of estimators by ensuring consistency if at least one of two candidate nuisance functions is correctly specified or consistently estimated. In many models it is closely tied to influence-function orthogonality and to the information geometry of the statistical model.

1. Definition and Formalism

Let $(Y, X)$ be observed data, with $Y$ the outcome and $X$ a vector of covariates. The target parameter, denoted $\theta$, is typically a functional of the observed law, such as an average treatment effect or a mean under missing data. In the prototypical setup, there exist two nuisance functions: $\gamma_1$ (often an outcome regression) and $\gamma_2$ (often a propensity score or exposure mechanism).

Definition. An estimator (or, more precisely, an estimating function $\phi$) is doubly robust if, for any law $p$ and any (possibly incorrect) choice of $\gamma_1$ or $\gamma_2$,

$$E_p[\phi(X; \theta(p), \gamma_1(p), \gamma_2)] = 0 \quad \text{and} \quad E_p[\phi(X; \theta(p), \gamma_1, \gamma_2(p))] = 0,$$

so that an estimator based on $\phi$ is consistent for $\theta(p)$ if either $\gamma_1$ or $\gamma_2$ is correctly specified, regardless of the other (Ying, 2024).

Canonical instances include the Augmented Inverse Probability Weighted (AIPW) estimator for treatment effects, calibration estimators for compliers in IV models, and locally projected estimators in dynamic econometrics.
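The two moment conditions in the definition can be checked numerically for the AIPW score. The following is a minimal sketch, not taken from the cited papers: the data-generating process, the deliberately wrong nuisance functions `mu_wrong` and `pi_wrong`, and the helper `aipw_ate` are all illustrative assumptions. True nuisance functions are plugged in directly rather than estimated, so the example isolates the structure of the estimating function.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Simulated data; every functional form below is an illustrative assumption.
x = rng.normal(size=n)
pi_true = 1.0 / (1.0 + np.exp(-0.8 * x))   # true propensity P(A = 1 | X)
a = rng.binomial(1, pi_true)


def mu_true(arm):
    """True outcome regression E[Y | A = arm, X]; the true ATE is 2.0."""
    return 2.0 * arm + x + 0.5 * x**2


y = mu_true(a) + rng.normal(size=n)


def aipw_ate(y, a, mu1, mu0, pi):
    """Sample mean of the AIPW score for the average treatment effect."""
    score = (mu1 - mu0
             + a * (y - mu1) / pi
             - (1 - a) * (y - mu0) / (1 - pi))
    return score.mean()


# Deliberately wrong nuisance functions (hypothetical misspecifications).
mu_wrong = np.zeros(n)       # ignores treatment and covariates
pi_wrong = np.full(n, 0.5)   # ignores covariates

results = {
    "both_correct": aipw_ate(y, a, mu_true(1), mu_true(0), pi_true),
    "outcome_only": aipw_ate(y, a, mu_true(1), mu_true(0), pi_wrong),
    "propensity_only": aipw_ate(y, a, mu_wrong, mu_wrong, pi_true),
    "both_wrong": aipw_ate(y, a, mu_wrong, mu_wrong, pi_wrong),
}
for name, value in results.items():
    print(f"{name:16s} {value:.3f}")
```

Under these assumptions the first three estimates should land close to the true effect of 2.0, while the last one, with both nuisance functions wrong, should be visibly biased.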

2. Semiparametric Theory and Influence-Function Perspective

In semiparametric models, double robustness often arises from influence-function orthogonality. A canonical result is that the influence curve for a pathwise differentiable functional $\theta$ is orthogonal to the nuisance tangent space, ensuring insensitivity to infinitesimal perturbations of these nuisance components.

Under convexity of the relevant “contour sets” (sets of models with fixed target/nuisance values), the influence function is itself doubly robust “for free” (Ying, 2024). This means estimators constructed using the canonical influence function enjoy double robustness without further adjustment in a large class of models, such as partially linear regression, missing data with MAR, and standard causal inference scenarios.

3. Asymptotic Theory and Rate Double Robustness

The classical result for estimators $\widehat\theta$ built with plug-in nuisance fits $\widehat\gamma_1$, $\widehat\gamma_2$ is that $\widehat\theta$ is root-$n$ consistent and asymptotically normal so long as the product of the $L_2$ estimation errors is $o_P(n^{-1/2})$ (Shu et al., 2018, Sandqvist, 2024). Under this "rate double robustness" condition,

$$\|\widehat\gamma_1 - \gamma_1\| \cdot \|\widehat\gamma_2 - \gamma_2\| = o_P(n^{-1/2}),$$

the second-order remainder vanishes at the $\sqrt{n}$ scale and the empirical process term dominates. For Z-estimation with orthogonal moment equations (Neyman orthogonality), the required rate on the nuisance functions can be relaxed: as shown in (Lok, 2024), if one nuisance estimator is consistent, with error $o_P(1)$, and the other converges at rate $n^{-1/4}$, the plug-in sandwich variance estimator for $\widehat\theta$ is consistent and the limiting law of $\sqrt{n}(\widehat\theta - \theta)$ is unaffected by the nuisance uncertainty.
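As a worked illustration of the product-rate condition, suppose the two nuisance estimators converge in $L_2$ at polynomial rates. The exponents $a$ and $b$ are notation introduced here for illustration and do not come from the cited papers.

```latex
\|\widehat\gamma_1 - \gamma_1\| = O_P(n^{-a}), \qquad
\|\widehat\gamma_2 - \gamma_2\| = O_P(n^{-b}), \qquad a, b \ge 0
\;\Longrightarrow\;
\|\widehat\gamma_1 - \gamma_1\| \cdot \|\widehat\gamma_2 - \gamma_2\| = O_P\bigl(n^{-(a+b)}\bigr),
\quad \text{which is } o_P(n^{-1/2}) \text{ whenever } a + b > \tfrac{1}{2}.
% Examples: a = b = 0.3 gives n^{-0.6}; a = 0.45, b = 0.1 gives n^{-0.55}.
% Both satisfy the condition although neither estimator attains the n^{-1/2} rate.
```

The condition therefore allows a slow rate for one nuisance function to be traded against a faster rate for the other.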

4. Classes of Double Robust Estimators

Point Estimation Examples

| Model/Target | DR Estimator | Models of which one must be correct |
|---|---|---|
| ATT/ATE (causal inference) | AIPW/TMLE | Propensity score or outcome model |
| Complier average characteristics | DR moment with $\kappa$ weight | Weight or regression function |
| Location/scale with MAR (missing data) | AIPW + robustification | Propensity score or outcome regression |
| Survival with censoring | DRCUT pseudo-outcomes | Censoring or regression hazard |
| Local projections (IRF, time series) | Direct LP estimator | PL regression or "shock" model |

Structural Property: For these, the estimator is consistent (and typically regular asymptotically linear) if either of the two models/nuisance estimators is correctly specified (or estimated sufficiently well), not necessarily both (Shu et al., 2018, Singh et al., 2019, Cantoni et al., 2018, Sandqvist, 2024, Olea et al., 2024).

Sequential Double Robustness (SDR)

In longitudinal settings (e.g., longitudinal G-computation), sequential double robustness (SDR) holds when consistency is guaranteed provided that, at each time point $t$ of a multistage process, either the outcome regression or the treatment model at time $t$ is correctly specified. This generalizes standard DR by allowing the correctly specified model to differ across time points (Luedtke et al., 2017).

5. Robustness, Limitations, and Fragility

Double Robustness vs. Double Fragility

While DR estimators offer protection under partial misspecification, when both nuisance models are incorrect the errors can compound: the bias is of the order of the product of the two nuisance errors. This phenomenon is termed “double fragility” (Testa et al., 2025). With $\widehat\mu$ and $\widehat\pi$ denoting the fitted outcome regression and propensity score, and $\mu$ and $\pi$ their true counterparts, the bias term is

$$\eta = E \left[ (\widehat\mu(X) - \mu(X)) \left(1 - \frac{\pi(X)}{\widehat\pi(X)} \right) \right],$$

which can dominate the estimator's error if both working models are poorly fitted.
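The product form of this bias can be derived in a few lines. The sketch below assumes a specific setting that the text does not spell out: the target is a mean $\psi = E[Y]$ with outcomes missing at random, $R$ is a hypothetical observation indicator, and the fitted functions are treated as fixed.

```latex
% Assumed setting: \psi = E[Y], R = 1 if Y is observed, Y \perp R \mid X (MAR),
% \pi(X) = P(R = 1 \mid X), \mu(X) = E[Y \mid X]; \widehat\mu and \widehat\pi are held fixed.
\begin{align*}
E\!\left[\widehat\mu(X) + \frac{R}{\widehat\pi(X)}\bigl(Y - \widehat\mu(X)\bigr)\right] - \psi
&= E\!\left[\widehat\mu(X) - \mu(X)
   + \frac{\pi(X)}{\widehat\pi(X)}\bigl(\mu(X) - \widehat\mu(X)\bigr)\right] \\
&= E\!\left[\bigl(\widehat\mu(X) - \mu(X)\bigr)
   \left(1 - \frac{\pi(X)}{\widehat\pi(X)}\right)\right] = \eta .
\end{align*}
```

The first equality conditions on $X$ and uses $E[R \mid X] = \pi(X)$ together with MAR. The term $\eta$ is zero when either $\widehat\mu = \mu$ or $\widehat\pi = \pi$, and by the Cauchy-Schwarz inequality it is bounded by the product of the two $L_2$ errors, so it is only small when at least one of them is.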

Adaptive Correction Clipping (ACC) methods have been proposed to address this, ensuring that point estimates cannot be worse than the worse of the individual outcome regression and IPW estimators, thereby achieving what the authors term "double safety" (Testa et al., 2025).

Variance Estimation

Classical variance estimators such as the influence-function (IF) plug-in estimator are not doubly robust: IF-based variance estimation is valid only if both working models are correct. Empirical sandwich estimators and nonparametric bootstrap methods, by instead leveraging the unbiasedness of stacked estimating equations, retain double robustness for variance estimation (Shook-Sa et al., 2024).
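A nonparametric bootstrap that refits both working models on every resample can be sketched as follows. This is an illustration under assumptions, not the procedure of Shook-Sa et al.: the simulated design, the helper names `simulate` and `aipw_refit`, and the choice of a saturated propensity model with a deliberately misspecified outcome model are all hypothetical.

```python
import numpy as np


def simulate(n, rng):
    """Toy data with a binary covariate; the true ATE is 2.0 (assumed design)."""
    x = rng.binomial(1, 0.5, size=n)
    a = rng.binomial(1, 0.3 + 0.4 * x)
    y = 1.0 + 2.0 * a + 1.5 * x + rng.normal(size=n)
    return y, a, x


def aipw_refit(y, a, x):
    """Fit both working models on the given sample, then return the AIPW estimate."""
    # Propensity model: treated fraction within each level of x (correctly specified).
    p_by_cell = np.array([a[x == 0].mean(), a[x == 1].mean()])
    pi = p_by_cell[x]
    # Outcome model: arm-specific means that ignore x (deliberately misspecified).
    mu1, mu0 = y[a == 1].mean(), y[a == 0].mean()
    score = mu1 - mu0 + a * (y - mu1) / pi - (1 - a) * (y - mu0) / (1 - pi)
    return score.mean()


rng = np.random.default_rng(1)
y, a, x = simulate(2000, rng)
estimate = aipw_refit(y, a, x)

# Resample rows with replacement and repeat the whole fitting pipeline each time.
n_boot = 500
boot = np.empty(n_boot)
for b in range(n_boot):
    idx = rng.integers(0, len(y), size=len(y))
    boot[b] = aipw_refit(y[idx], a[idx], x[idx])

boot_se = boot.std(ddof=1)
print(f"estimate = {estimate:.3f}, bootstrap SE = {boot_se:.3f}")
```

Because every resample repeats the nuisance fitting step, the bootstrap spread reflects the uncertainty from estimating the working models as well as the target, without relying on both models being correct.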

6. Extensions and Practical Examples

High-dimensional and Machine Learning Nuisance Estimation

Sample splitting and cross-fitting enable the use of complex machine learning models for nuisance components while preserving DR properties, provided the product-rate conditions are met (Sandqvist, 2024, Shu et al., 2018, Liu et al., 2023). Even in high-$d$ settings, plug-in DML-type estimators retain root-$n$ inference under appropriate sparsity and rate assumptions.
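Cross-fitting can be sketched in a few lines: each fold's nuisance functions are fitted on the other folds and evaluated only on the held-out fold. The example below is a minimal illustration under assumptions. It uses plain logistic and linear regressions in NumPy as stand-ins for machine learning learners, and the helper names (`fit_logistic`, `cross_fit_aipw`), the clipping bounds, and the simulated data are all hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(design, a, n_iter=25):
    """Logistic regression by Newton's method, with a tiny ridge for stability."""
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = sigmoid(design @ beta)
        grad = design.T @ (a - p)
        hess = (design * (p * (1 - p))[:, None]).T @ design
        beta += np.linalg.solve(hess + 1e-8 * np.eye(len(beta)), grad)
    return beta


def cross_fit_aipw(y, a, x, n_folds=5, seed=0):
    """Cross-fitted AIPW estimate of the ATE and its influence-function SE."""
    n = len(y)
    folds = np.random.default_rng(seed).permutation(n) % n_folds
    design = np.column_stack([np.ones(n), x])
    scores = np.empty(n)
    for k in range(n_folds):
        train, test = folds != k, folds == k
        # Nuisance fits use the training folds only.
        beta_ps = fit_logistic(design[train], a[train])
        fit1, fit0 = train & (a == 1), train & (a == 0)
        b1 = np.linalg.lstsq(design[fit1], y[fit1], rcond=None)[0]
        b0 = np.linalg.lstsq(design[fit0], y[fit0], rcond=None)[0]
        # Predictions and scores are computed on the held-out fold only.
        pi = np.clip(sigmoid(design[test] @ beta_ps), 0.01, 0.99)
        mu1, mu0 = design[test] @ b1, design[test] @ b0
        scores[test] = (mu1 - mu0
                        + a[test] * (y[test] - mu1) / pi
                        - (1 - a[test]) * (y[test] - mu0) / (1 - pi))
    return scores.mean(), scores.std(ddof=1) / np.sqrt(n)


rng = np.random.default_rng(2)
n = 4000
x = rng.normal(size=n)
a = rng.binomial(1, sigmoid(0.5 * x))
y = 1.0 + 2.0 * a + x + rng.normal(size=n)   # true ATE is 2.0 by construction

estimate, std_err = cross_fit_aipw(y, a, x)
print(f"cross-fitted AIPW = {estimate:.3f} (SE {std_err:.3f})")
```

Keeping the fitting and evaluation samples disjoint is what removes the own-observation overfitting bias, so flexible learners can be substituted for the two regressions here without changing the rest of the procedure.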

External Controls and Combining Data Sources

Naive incorporation of external control samples into doubly robust ATT estimators can paradoxically degrade efficiency under single-model misspecification. A “double-safe” estimator optimally combines the standard (trial-only) and external-control estimators to ensure no efficiency loss relative to the best available approach in each scenario, while preserving DR (Dai et al., 2025).

Model Geometry and Parameterization

Information geometry and semiparametric theory provide necessary and sufficient conditions for the existence of DR model structures (Ying, 2024). Variation-independence and convexity (or m-flatness) of the “contour sets” are central: in models where contours are not convex (e.g., certain odds ratio models under canonical parameterization), true DR estimators do not exist.

7. Applications and Illustrations

Representative applications illustrating DR include:

References

  • "A Geometric Perspective on Double Robustness by Semiparametric Theory and Information Geometry" (Ying, 2024)
  • "Improved Estimation of Average Treatment Effects on the Treated: Local Efficiency, Double Robustness, and Beyond" (Shu et al., 2018)
  • "Double Robustness of Local Projections and Some Unpleasant VARithmetic" (Olea et al., 2024)
  • "Sequential Double Robustness in Right-Censored Longitudinal Models" (Luedtke et al., 2017)
  • "Doubly robust inference with censoring unbiased transformations" (Sandqvist, 2024)
  • "Double Robust Variance Estimation with Parametric Working Models" (Shook-Sa et al., 2024)
  • "Double Robustness for Complier Parameters and a Semiparametric Test for Complier Characteristics" (Singh et al., 2019)
  • "Robust semiparametric inference with missing data" (Cantoni et al., 2018)
  • "Rescuing double robustness: safe estimation under complete misspecification" (Testa et al., 2025)
  • "Incorporating External Controls for Estimating the Average Treatment Effect on the Treated with High-Dimensional Data: Retaining Double Robustness and Ensuring Double Safety" (Dai et al., 2025)
  • "Demystified: double robustness with nuisance parameters estimated at rate n-to-the-1/4" (Lok, 2024)
  • "Double robust inference for continuous updating GMM" (Kleibergen et al., 2021)
