Multiply Robust Estimation in Multi-Stage DTRs
- The paper demonstrates that the multiply robust estimator remains consistent if, at each stage, at least one candidate nuisance model is correctly specified.
- It employs empirical likelihood and convex optimization techniques to combine propensity score and outcome regression models for enhanced inferential stability.
- Simulation studies and practical applications involving survival outcomes and missing data confirm robust performance even with complex longitudinal data and model misspecification.
Multiply Robust Estimator for Multi-Stage Dynamic Treatment Regimes
Multiply robust (MR) estimation for multi-stage dynamic treatment regimes (DTRs) is a statistical methodology designed to enhance inferential stability and efficiency in the presence of sequential decisions, complex longitudinal data, and model misspecification. The key principle is that the estimator remains consistent if, at each stage, at least one of a set of candidate nuisance models—typically including propensity scores and outcome regressions—is correctly specified, thus extending the usual “doubly robust” framework to hedge against multiple forms of misspecification.
1. Definition and Conceptual Foundations
The multiply robust estimator generalizes the doubly robust paradigm. In a multi-stage DTR with $K$ decision points, the regime comprises a sequence of stage-specific decision rules $d = (d_1, \dots, d_K)$, with patient history and intermediate outcomes informing each decision. At each stage $k$, estimation relies on nuisance functions: conditional treatment probabilities (propensity models) and outcome regressions (iterated conditional means or Q-functions).
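For a fixed regime $d=(d_1,\dots,d_K)$ and under sequential ignorability, the iterated conditional means can be written recursively (a standard sketch in generic notation, not tied to any one cited paper):
$$
m_{K+1}\equiv Y,\qquad
m_k(h_k)=E\bigl[m_{k+1}(H_{k+1})\mid H_k=h_k,\;A_k=d_k(h_k)\bigr],\qquad k=K,\dots,1,
$$
so that $E[m_1(H_1)]$ is the value of the regime $d$; the Q-function formulation replaces $d_k(h_k)$ with a maximization over treatments at each stage.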
In MR constructions, consistent estimation can require that at each stage, at least one of several fitted models is correctly specified; for example, either a propensity score model or an outcome regression. Certain formulations achieve higher-order (e.g., $2K$-fold) robustness, reflecting the number of correct-model combinations guaranteeing consistency (Rotnitzky et al., 2017).
The MR estimator’s canonical statistical representation leverages augmentation schemes, empirical likelihood, and optimal weighting to combine predictions from distinct estimators—including those derived from parametric, semiparametric, or machine-learning methods.
2. Construction and Mathematical Formulation
The construction of MR estimators in multi-stage DTRs proceeds via iterated regression and weighting principles, frequently employing empirical process theory and moment constraints to guarantee large-sample properties (Naik et al., 2016, Rotnitzky et al., 2017).
Let $A_1,\dots,A_K$ denote stage-wise treatments, $H_1,\dots,H_K$ the evolving patient histories, and $Y$ the final outcome. For each stage $k$, specify families of propensity score models $\{\pi_k^{(j)}(a_k \mid h_k)\}_j$ and outcome regression models $\{m_k^{(l)}(h_k, a_k)\}_l$. MR estimation comprises:
- Combining candidate models for each nuisance function;
- Imposing estimating equations or moment conditions that are satisfied if any one model in each family is correctly specified;
- Solving for weighting parameters via constrained optimization (often by Lagrange multipliers or convex minimization) so that estimation is “multiply hedged” (a numerical sketch follows this list).
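A minimal numerical sketch of the constrained-optimization step, for a single stage, in Python with SciPy; the function `mr_weights`, the choice of calibration residuals, and the toy data are illustrative assumptions, not the published implementations of the cited estimators.

```python
import numpy as np
from scipy.optimize import minimize

def mr_weights(g):
    """Empirical-likelihood-style weights w_i = 1 / (n_g * (1 + rho' g_i)).

    g : (n_g, r) array of calibration residuals, one column per candidate
        propensity / outcome-regression model (illustrative choice: fitted
        values minus their full-sample means, so that the weighted constraint
        sum_i w_i g_i = 0 calibrates each candidate model's mean).
    """
    n_g, r = g.shape

    def neg_log_el(rho):
        s = 1.0 + g @ rho
        if np.any(s <= 1e-8):              # keep all weights strictly positive
            return np.inf
        return -np.sum(np.log(s))          # convex dual of the EL problem

    res = minimize(neg_log_el, x0=np.zeros(r), method="Nelder-Mead")
    w = 1.0 / (n_g * (1.0 + g @ res.x))
    return w / w.sum()                      # renormalize against solver error

# Toy usage: calibrate weights to two candidate propensity models among treated units.
rng = np.random.default_rng(0)
x = rng.normal(size=(500, 2))
pi1 = 1.0 / (1.0 + np.exp(-x[:, 0]))        # candidate propensity model 1
pi2 = 1.0 / (1.0 + np.exp(-x.sum(axis=1)))  # candidate propensity model 2
a = rng.binomial(1, pi1)                    # observed treatment
g = np.column_stack([pi1 - pi1.mean(), pi2 - pi2.mean()])[a == 1]
w = mr_weights(g)
```

The dual objective is convex, so positivity and uniqueness of the weights are obtained by construction whenever an interior minimum exists.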
A representative MR estimator for the mean of the g-functional in longitudinal (multi-stage) settings is given in (Rotnitzky et al., 2017). Writing $O$ for the observed data and $\hat{\pi}_k$, $\hat{m}_k$ for the estimated nuisance functions (possibly cross-fitted), the estimator averages a functional of $O$ and the fitted nuisances constructed such that it is CAN (consistent and asymptotically normal) under multiple union-of-correct-model conditions.
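For orientation, a standard sequential AIPW-type representation of such an estimator for a fixed regime $d=(d_1,\dots,d_K)$, in the notation above, is the following generic sketch under sequential ignorability; it is a textbook form, not the exact expression from the cited paper:
$$
\hat{\psi}_{\mathrm{MR}}
= \frac{1}{n}\sum_{i=1}^{n}\left[\hat{m}_1(H_{1i})
+ \sum_{k=1}^{K}
\frac{\prod_{j=1}^{k}\mathbf{1}\{A_{ji}=d_j(H_{ji})\}}
{\prod_{j=1}^{k}\hat{\pi}_j(A_{ji}\mid H_{ji})}
\left\{\hat{m}_{k+1}(H_{k+1,i})-\hat{m}_k(H_{ki})\right\}\right],
\qquad \hat{m}_{K+1}(H_{K+1,i})\equiv Y_i .
$$
Here $\hat{m}_k(H_k)$ denotes the iterated regression evaluated at $A_k=d_k(H_k)$, as in the recursion of Section 1. This representation is doubly robust stage by stage; MR constructions replace each $\hat{\pi}_k$ and $\hat{m}_k$ with calibrated combinations of the candidate models.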
For pointwise causal effects with multivalued treatment, (Naik et al., 2016) develops MR weights via empirical-likelihood calibration, in which a vector of constraint functions encapsulates deviations between fitted and empirical means from each candidate propensity and regression model.
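A canonical empirical-likelihood form for such weights (written generically here; the cited construction differs in its details) assigns to unit $i$ in the relevant treatment group of size $m$
$$
\hat{w}_i=\frac{1}{m}\cdot\frac{1}{1+\hat{\rho}^{\top}\hat{g}(X_i)},
\qquad
\sum_{i}\hat{w}_i\,\hat{g}(X_i)=0,\qquad \sum_{i}\hat{w}_i=1,
$$
where $\hat{g}$ stacks the calibration residuals from the candidate models and $\hat{\rho}$ is the Lagrange multiplier obtained from the convex dual problem (cf. the numerical sketch above).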
3. Theoretical Guarantees: Robustness and Efficiency
The multiply robust property assures that the estimator is consistent if, for each stage, at least one model in either the propensity or regression family is correctly specified, even when some models are misspecified. In parametric settings, the resulting estimators have been shown to be CAN, and to attain efficiency provided all nuisance models converge sufficiently fast (Rotnitzky et al., 2017, Naik et al., 2016).
Semiparametric efficiency is achievable when all candidate models are correctly specified, as the estimator then achieves the asymptotic variance bound corresponding to the efficient influence function (EIF). Bias formulas for MR and DR estimators are derived, showing that, under many laws, the rate at which the bias of the MR estimator converges to zero matches or exceeds the DR estimator’s rate (Rotnitzky et al., 2017).
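A useful informal heuristic for these rate statements (stated here as a heuristic, not a theorem from the cited papers) is that the bias of DR-type estimators in multi-stage settings is controlled by stage-wise products of nuisance errors,
$$
\bigl|\operatorname{Bias}(\hat{\psi})\bigr|\;\lesssim\;\sum_{k=1}^{K}\bigl\|\hat{\pi}_k-\pi_k\bigr\|\,\bigl\|\hat{m}_k-m_k\bigr\|,
$$
so that root-$n$ inference survives whenever the products, rather than the individual errors, vanish faster than $n^{-1/2}$.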
When the estimation involves machine learning nuisance functions, cross-fitting and sample splitting circumvent Donsker-class restrictions, allowing for the use of highly adaptive and complex ML models without loss of asymptotic linearity.
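A minimal cross-fitting loop for a single stage is sketched below, assuming `data` is a pandas DataFrame and that `fit_nuisance` (trains the propensity and outcome models) and `aipw_score` (evaluates the per-unit influence-function contribution) are user-supplied placeholders rather than functions from any cited package.

```python
import numpy as np
from sklearn.model_selection import KFold

def cross_fit_estimate(data, fit_nuisance, aipw_score, n_folds=5, seed=0):
    """Cross-fitted estimate: nuisance models are trained on the other folds
    and influence-function scores are evaluated on the held-out fold."""
    n = len(data)
    scores = np.empty(n)
    folds = KFold(n_splits=n_folds, shuffle=True, random_state=seed)
    for train_idx, eval_idx in folds.split(np.arange(n)):
        models = fit_nuisance(data.iloc[train_idx])          # flexible ML fits
        scores[eval_idx] = aipw_score(data.iloc[eval_idx], models)
    psi_hat = scores.mean()                                  # cross-fitted point estimate
    se_hat = scores.std(ddof=1) / np.sqrt(n)                 # influence-function SE
    return psi_hat, se_hat
```

Because every unit is scored exactly once on data its nuisance fits never saw, asymptotic linearity is preserved even with arbitrarily complex learners inside `fit_nuisance`.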
4. Application to Complex Settings: Survival, Missing Data, Neural Networks
MR estimation has been extended to settings with survival/censored data, missing covariates/outcomes, and neural networks.
- For survival outcomes with dependent censoring, the approach described in (Cho et al., 2020) utilizes backward recursion and generalized random survival forests to enable MR properties under nonparametric survival modeling. The estimator supports flexible numbers of treatment stages and arms and achieves polynomial rates of convergence with uniform consistency and valid inference for survival probabilities and truncated mean survival.
- With missing data, (Suen et al., 2021) implements multiply robust estimating functions combining inverse probability weighting and regression adjustment for every missing-data pattern, providing union-of-correct-model consistency even as covariate patterns change across stages.
- MR estimation circumvents hyperparameter tuning in deep learning by integrating all first-step neural network predictions rather than selecting a single architecture, maintaining $\sqrt{n}$-consistency and asymptotic normality if one candidate model is consistent (Rostami et al., 2023).
5. Practical Considerations and Implementation
MR estimators are advantageous in practice when there is a high risk of misspecification due to complex data structures (multi-stage decisions, high-dimensional covariates, time-varying confounders). Simulation studies across multiple works demonstrate that MR estimators maintain low bias and valid coverage when competing DR, IPW, or outcome regression estimators fail due to model misspecification or excessive variability.
For computational implementation, MR estimators may require solving a convex optimization problem involving empirical likelihood constraints (Naik et al., 2016), iteratively updating candidate model predictions with cross-fitting, or backward recursion in dynamic regimes (Cho et al., 2020). R packages (e.g., dtrSurv (Cho et al., 2020)) and algorithms for off-policy learning have been developed for practical deployment.
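As a schematic of the backward-recursion step, here is a generic regression-based sketch (plain g-computation, not the algorithm implemented in dtrSurv or any other cited package); `fit_regressor` stands in for an arbitrary supervised learner.

```python
import numpy as np

def backward_recursion_value(stages, y, fit_regressor):
    """Estimate the value of a fixed regime d by iterated regressions.

    stages : list over k = 1..K of dicts with keys 'H' (history matrix),
             'A' (observed treatments), 'd' (regime-recommended treatments).
    y      : final outcome vector.
    fit_regressor : callable (X, target) -> fitted model with .predict(X).
    """
    pseudo = np.asarray(y, dtype=float)            # m_{K+1} = Y
    for stage in reversed(stages):                 # k = K, ..., 1
        H, A, d = stage["H"], stage["A"], stage["d"]
        model = fit_regressor(np.column_stack([H, A]), pseudo)
        pseudo = model.predict(np.column_stack([H, d]))   # m_k(H_k) under the regime
    return pseudo.mean()                           # estimate of E[m_1(H_1)]
```

An MR or DR variant would additionally augment each pseudo-outcome with inverse-probability-weighted residual terms, as in the display of Section 2.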
When applied to electronic health records and large-scale observational data (e.g., MIMIC-III (Park et al., 8 Oct 2025)), MR estimation has delivered robust learning of optimal sequential treatment strategies under irregular missingness and truncation by death, achieving greater fidelity to observed outcomes in principal strata.
6. Extensions: Principal Stratification and Selection Bias
Recent advances incorporate principal stratification to address truncation by death, a situation where potential outcomes are ill-defined for patients who do not survive (Park et al., 8 Oct 2025). The principal stratification-based MR estimator targets the “always-survivor” value function, exploiting observed data and auxiliary survival models.
Estimation is based on efficient influence functions expressed as ratios of corrected functionals, with consistency retained under many combinations of correctly specified models (e.g., outcome regression, survival probabilities, propensity scores).
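Schematically (without reproducing the cited paper’s exact influence functions), such ratio estimators take the form
$$
\hat{\theta}_{\mathrm{MR}}=\frac{n^{-1}\sum_{i=1}^{n}\hat{\varphi}_{\mathrm{num}}(O_i)}{n^{-1}\sum_{i=1}^{n}\hat{\varphi}_{\mathrm{den}}(O_i)},
$$
where the numerator functional targets the expected outcome among always-survivors, the denominator targets the always-survivor probability, and each $\hat{\varphi}$ is an EIF-based correction of a plug-in estimate, so the ratio inherits multiple robustness from its components.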
Related methodological extensions offer triply robust estimators for average treatment effects using linked data, simultaneously guarding against unmeasured confounding and selection bias by combining three identification strategies in the EIF and estimating equation (Luo et al., 2023).
7. Limitations and Future Directions
MR estimation requires careful model specification and estimation of multiple nuisance functions. Numerical instability may arise when empirical likelihood or moment-constraint equations have multiple roots; specialized convex minimization algorithms are sometimes needed to guarantee positivity and uniqueness (Naik et al., 2016). While robust, MR estimators may suffer reduced efficiency, and lose their consistency guarantee, if all candidate models are misspecified. The computational burden increases with the number of candidate models and stages.
Future research directions include refinement of bias characterizations in high-dimensional and non-linear settings, further adaptation to time-to-event and continuous treatments, and diagnostic tools for sensitivity analysis against widespread model misspecification. Extensions leveraging modern machine learning algorithms, regularization, and cross-fitting remain active areas for methodological innovation.
Multiply robust estimation provides a principled framework to achieve stable, consistent, and efficient inference for multi-stage dynamic treatment regimes, especially in the presence of model uncertainty, missing data, and sequential decision-making. Theoretical developments and empirical validations across several works underscore its value for personalized medicine, causal inference, and reinforcement learning in complex longitudinal studies (Naik et al., 2016, Rotnitzky et al., 2017, Cho et al., 2020, Suen et al., 2021, Rostami et al., 2023, Luo et al., 2023, Park et al., 8 Oct 2025).