Rate Doubly Robust Estimator
- Rate doubly robust estimation is a semiparametric method that remains valid if one nuisance model converges at an $o_p(n^{-1/4})$ rate.
- It employs drift correction through Gaussianization and cross-fitting to harness adaptive machine learning while achieving √n consistency.
- Applications in survival analysis with informative censoring demonstrate reduced bias and valid confidence intervals compared to classical methods.
A rate doubly robust estimator is a statistical procedure that targets semiparametric parameters in the presence of nuisance functions, constructed so that it achieves root-n ($\sqrt{n}$) consistency and asymptotic normality under substantially weaker requirements on nuisance estimation than classical doubly robust estimators. In particular, for survival analysis with informative censoring and possibly non-random treatment assignment, such an estimator guarantees valid inference if at least one of two nuisance elements—either the outcome model or the model for treatment and censoring mechanisms—is estimated at the $o_p(n^{-1/4})$ rate. The core role of rate doubly robust estimation is to exploit flexible, data-adaptive regression or machine learning methods (e.g., random forests, super learner ensembles) without requiring both nuisance estimators to converge at a parametric rate, thus permitting valid inference when employing modern “black-box” models (Díaz, 2017).
1. Construction and Theoretical Foundations
The estimator is defined by constructing an estimating equation based on the efficient influence function (EIF) of the target parameter, such as the marginal survival probability $\theta = P(T > t)$. For survival outcomes under right-censoring and possibly non-random treatment, a standard discrete-time representation of the EIF is

$$D^*(O) = -\sum_{m=1}^{t} \frac{\mathbb{1}\{A = 1\}\, Y(m)}{g(W)\, G(m \mid W)}\, \frac{S(t \mid 1, W)}{S(m \mid 1, W)}\, \big\{\mathrm{d}N(m) - h(m \mid 1, W)\big\} + S(t \mid 1, W) - \theta,$$

where $\mathrm{d}N(m)$ is the observed event increment at time $m$ and:
- $h(m \mid a, w)$ is the hazard regression,
- $g(w) = P(A = 1 \mid W = w)$ is the propensity model,
- $G(m \mid w)$ and the censoring hazard $h_c(m \mid w)$ model the censoring process,
- $Y(m)$ is the at-risk indicator at time $m$,
- $S(m \mid a, w)$ is the conditional survival function.
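As a concrete illustration, the sketch below evaluates this EIF and the corresponding one-step estimator on a discrete time grid, assuming all nuisance quantities ($g$, $h$, $G$, $S$) have already been fitted and evaluated as arrays. The function name and array layout are illustrative assumptions, not from the paper.

```python
import numpy as np

def eif_survival(t, A, obs_time, event, g, h, G, S):
    """One-step estimate of theta = E[S(t | A=1, W)] from the EIF above.

    A        : (n,) treatment indicators
    obs_time : (n,) observed time (min of event and censoring times)
    event    : (n,) 1 if obs_time is an event, 0 if censored
    g        : (n,)   fitted propensity P(A=1 | W)
    h        : (n, t) fitted hazard h(m | 1, W), columns m = 1..t
    G        : (n, t) fitted censoring survival G(m | W)
    S        : (n, t) fitted survival S(m | 1, W); S[:, -1] = S(t | 1, W)
    """
    St = S[:, -1]
    theta_plugin = St.mean()
    contrib = np.zeros(len(A))
    for m in range(1, t + 1):
        at_risk = (obs_time >= m).astype(float)          # Y(m)
        dN = ((obs_time == m) & (event == 1)).astype(float)
        w = (A == 1) * at_risk / (g * G[:, m - 1])       # inverse weights
        contrib -= w * (St / S[:, m - 1]) * (dN - h[:, m - 1])
    eif = contrib + St - theta_plugin                    # centered EIF values
    return theta_plugin + eif.mean(), eif
```

The returned per-subject `eif` values are exactly what the variance formula in Section 3 is computed from.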
Classical doubly robust estimators are consistent if at least one nuisance model is correctly specified. However, classical asymptotic normality and valid inference follow only when both models converge at the $n^{-1/2}$ (parametric) rate. The rate doubly robust estimator instead guarantees that if one nuisance estimator achieves the $o_p(n^{-1/4})$ rate, the estimator remains $\sqrt{n}$-consistent and asymptotically normal (i.e., $\sqrt{n}(\hat\theta_n - \theta) \rightsquigarrow N(0, \sigma^2)$).
2. Methodological Innovations: Gaussianizing the Drift and Cross-Fitting
Rate doubly robust estimation achieves $\sqrt{n}$-consistency by neutralizing the so-called “drift term” that arises in the asymptotic expansion when flexible data-adaptive methods are used for nuisance functions. The estimator is constructed so that the drift is $o_p(n^{-1/2})$ even when one nuisance function is inconsistent, provided the other achieves the $o_p(n^{-1/4})$ rate.
Gaussianizing the Drift
- The drift decomposes into a sum of auxiliary score terms (Theorem 1). These extra components are “Gaussianized” (made asymptotically linear) by targeting the estimator: a set of linear score equations, parameterized via logistic tilting models, is solved so that the empirical drift is exactly zero.
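The targeting step can be sketched for a single score equation: a logistic tilting model $p_\varepsilon = \operatorname{expit}\{\operatorname{logit}(p) + \varepsilon H\}$ is fitted so that the empirical score $\tfrac{1}{n}\sum_i H_i (Y_i - p_{\varepsilon,i})$ equals zero. The Newton solver and variable names below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def tilt(p_init, H, Y, iters=50):
    """One logistic tilting (targeting) step: find eps so that the empirical
    score mean(H * (Y - p_eps)) is zero, where
    p_eps = expit(logit(p_init) + eps * H).
    Assumes H is not identically zero and p_init lies strictly in (0, 1)."""
    off = np.log(p_init / (1.0 - p_init))        # logit offset
    eps = 0.0
    for _ in range(iters):                       # Newton steps on the score
        p = 1.0 / (1.0 + np.exp(-(off + eps * H)))
        score = np.mean(H * (Y - p))
        fisher = np.mean(H**2 * p * (1.0 - p))
        eps += score / fisher
    return 1.0 / (1.0 + np.exp(-(off + eps * H))), eps
```

After the update, the drift component associated with the score $H$ is zero in the empirical measure, which is the sense in which it has been “Gaussianized.”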
Cross-Fitting
- Used to sidestep entropy (Donsker class) requirements that can be limiting for high-complexity or high-dimensional learners. The data are split into folds; nuisance estimators are trained on all data except the current fold and evaluated on the hold-out fold. This decouples the estimation of nuisance parameters from the evaluation, enabling the use of machine learning algorithms without sacrificing asymptotic normality.
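A minimal sketch of cross-fitting, with a pluggable `fit` callable standing in for any machine learning nuisance learner (the function names and toy learner are illustrative):

```python
import numpy as np

def cross_fit(X, Y, fit, K=5, seed=0):
    """K-fold cross-fitting: each unit's nuisance prediction comes from a
    model trained on the other K-1 folds, decoupling fitting from
    evaluation. `fit(X_train, Y_train)` must return a predict(X) callable."""
    rng = np.random.default_rng(seed)
    folds = rng.permutation(len(Y)) % K          # random fold assignment
    out = np.empty(len(Y))
    for k in range(K):
        held_out = folds == k
        model = fit(X[~held_out], Y[~held_out])  # train off the k-th fold
        out[held_out] = model(X[held_out])       # predict on the k-th fold
    return out

# toy learner (training-set mean) standing in for any ML regression
mean_fit = lambda Xtr, Ytr: (lambda Xte: np.full(len(Xte), Ytr.mean()))
```

Because no unit's prediction depends on its own data, Donsker-class (entropy) conditions on the learner can be dropped.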
3. Asymptotic Variance and Inference
After drift correction, the targeted minimum loss-based estimator (TMLE) admits an asymptotically linear representation

$$\hat\theta_n - \theta = \frac{1}{n}\sum_{i=1}^{n} \phi(O_i) + o_p(n^{-1/2}),$$

with asymptotic variance $\sigma^2 = \mathrm{Var}\{\phi(O)\}$. If both nuisance models are correctly specified, the extra score terms vanish, $\phi$ reduces to the standard efficient influence function, and the estimator attains semiparametric efficiency.
This variance formula enables doubly robust confidence intervals and hypothesis tests, e.g., the Wald interval $\hat\theta_n \pm z_{1-\alpha/2}\,\hat\sigma_n/\sqrt{n}$, which are valid under the weak rate conditions central to the methodology.
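For example, the EIF-based standard error and 95% Wald interval can be computed directly from the estimated influence values (a sketch; the 95% level is hard-coded, and the function name is an assumption):

```python
import numpy as np

def wald_ci_95(theta_hat, eif_values):
    """EIF-based standard error and 95% Wald confidence interval:
    sigma^2 is estimated by the sample variance of the influence values."""
    n = len(eif_values)
    se = np.sqrt(np.var(eif_values, ddof=1) / n)
    z = 1.959963984540054                    # Phi^{-1}(0.975), hard-coded
    return theta_hat - z * se, theta_hat + z * se, se
```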
4. Performance in Simulations
Simulations compare estimator performance across various regimes:
- Both Nuisances Consistent: Minimal bias, low variance, and MSE decreasing at the $n^{-1}$ rate.
- Only One Consistent: Drift-corrected estimator exhibits significantly reduced bias and valid coverage, outperforming traditional DR estimators which show substantial finite-sample bias.
- Small Samples: Estimated standard errors based on the asymptotic formula accurately reflect true variability, with confidence interval coverage approaching nominal levels (e.g., 95%).
Crucially, even when only one nuisance estimator is consistent, the estimator provides valid inference and robust coverage.
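This behavior can be demonstrated with a simplified cross-sectional analogue: the classical AIPW (one-step) estimator of a treated mean stays nearly unbiased under a deliberately inconsistent outcome model, as long as the propensity model is correct. The data-generating process below is invented for illustration and is not from the paper or the trial:

```python
import numpy as np

# toy cross-sectional data: true propensity 0.5, so E[Y(1)] = E[W] + 1 = 1
rng = np.random.default_rng(1)
n = 20000
W = rng.normal(size=n)
A = rng.binomial(1, 0.5, size=n)
Y = W + A + rng.normal(size=n)

def aipw_treated_mean(Y, A, g_hat, m_hat):
    """Classical AIPW / one-step estimator of E[Y(1)]: consistent if either
    g_hat (propensity) or m_hat (outcome regression) is consistent."""
    return np.mean(m_hat + A * (Y - m_hat) / g_hat)

m_bad = np.zeros(n)        # deliberately inconsistent outcome model
est = aipw_treated_mean(Y, A, np.full(n, 0.5), m_bad)
# est stays close to the truth (1.0) despite the broken outcome model,
# because the propensity model is correct
```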
5. Application to Clinical Trial Data
The estimator is applied in the North Central Cancer Treatment Group N9831 phase III trial analyzing trastuzumab (Herceptin) for HER2+ breast cancer:
- Data: 1,390 patients, 16 years follow-up, right-censoring due to dropout and staggered entry.
- Nuisance estimation: Super Learner ensembles (random forests, gradient boosting, logistic regression) with cross-fitting for g and h.
- Results: At 12 years, the doubly robust estimate of the survival difference (treated minus untreated; SE $0.036$) is markedly higher than the naïve Kaplan–Meier estimate, indicating robust bias reduction via proper adjustment for informative censoring.
- This demonstrates substantial bias in uncorrected methods and underscores the practical advantage of rate DR estimators in complex, censored data with possible unmeasured confounding.
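A discrete super learner (select the candidate with the lowest cross-validated risk, then refit it on the full data) gives a flavor of the nuisance-estimation strategy; the real analysis used a weighted ensemble of random forests, gradient boosting, and logistic regression, and the candidates below are toy stand-ins:

```python
import numpy as np

def discrete_super_learner(X, Y, candidates, K=5, seed=0):
    """Discrete super learner: estimate each candidate's K-fold
    cross-validated squared-error risk, then refit the winner on all data.
    `candidates` are fit(X, Y) callables returning predict(X) callables."""
    rng = np.random.default_rng(seed)
    folds = rng.permutation(len(Y)) % K
    risks = []
    for fit in candidates:
        preds = np.empty(len(Y))
        for k in range(K):
            held_out = folds == k
            preds[held_out] = fit(X[~held_out], Y[~held_out])(X[held_out])
        risks.append(np.mean((Y - preds) ** 2))
    best = candidates[int(np.argmin(risks))]
    return best(X, Y), np.array(risks)

# toy candidates standing in for random forests / boosting / logistic models
mean_learner = lambda Xtr, Ytr: (lambda Xte: np.full(len(Xte), Ytr.mean()))
zero_learner = lambda Xtr, Ytr: (lambda Xte: np.zeros(len(Xte)))
```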
6. Practical Implications and Limitations
Implications:
- Enables analysts to use highly flexible machine learning tools for nuisance estimation without needing both models to be accurate at parametric rates.
- Confidence intervals and standard errors derived from the EIF-based variance formula remain valid in high-dimensional covariate settings.
- Drift correction is essential: neither conventional DR nor plug-in approaches can deliver root-n inference when nuisance estimation converges only at the $n^{-1/4}$ rate.
Limitations:
- If both nuisance estimators are inconsistent (do not converge), neither DR nor rate DR guarantees consistency.
- Computational cost increases (due to cross-fitting and the drift-correction targeting step), though scalable implementations are feasible given advances in ensemble learning and convex optimization.
- Asymptotic results hold under the stated regularity and positivity conditions; near-violations of positivity (propensity or censoring probabilities close to zero) can still invalidate inference.
- Rate DR guarantees depend on at least one nuisance function being estimated at $o_p(n^{-1/4})$ or better—underscoring the importance of robust, well-regularized machine learning algorithms and proper cross-validation/tuning.
7. Summary Table: Comparison with Classical DR Estimation
| Feature | Classical DR | Rate Doubly Robust |
|---|---|---|
| Consistency requires | Either nuisance correctly specified | Either nuisance at $o_p(n^{-1/4})$ rate |
| $\sqrt{n}$ inference requires | Both nuisances at $n^{-1/2}$ rate | One at $o_p(n^{-1/4})$ rate plus drift correction |
| Gaussianization of drift | Absent | Explicitly targeted |
| Cross-fitting | Optional | Essential |
| Nuisance estimation | Parametric / low-dimensional | Machine learning / adaptive |
| Valid Wald CI when one nuisance inconsistent | No | Yes |
By ensuring asymptotic normality and valid inference under minimal rate conditions, the rate doubly robust estimator extends double robustness from consistency to inference, opening the door to robust, high-dimensional semi- and nonparametric estimation in survival and causal inference problems (Díaz, 2017).