Principal Stratification Method

Updated 10 October 2025
  • Principal stratification is a causal framework that defines latent subpopulations based on joint potential intermediate outcomes under different treatment conditions.
  • The method integrates propensity score modeling with an EM algorithm to estimate causal effects amid issues like noncompliance and partial take-up.
  • It employs sensitivity analysis to assess the robustness of causal estimates against violations of unconfoundedness and other key assumptions.

Principal stratification is a causal inference framework that formally addresses scenarios with post-treatment (intermediate) variables that lie between treatment and outcome in the data-generating process. At its core, principal stratification defines subpopulations—called principal strata—based on the joint potential values that an intermediate variable would take under each possible treatment condition. Causal effects are then defined and estimated within these latent principal strata, which are unaffected by treatment assignment. This approach provides a rigorous mechanism for isolating direct or mechanism-specific effects and for dealing with problems such as noncompliance, truncation by death, surrogate endpoints, or intermediate events.

1. Conceptual Foundations and Definitions

The principal stratification framework (Mercatanti et al., 2015) operates under the Rubin Causal Model and is constructed to handle scenarios where an intermediate post-treatment variable $D$ may itself be affected by treatment $Z$, and this post-treatment variable is hypothesized to drive the actual change in the primary outcome $Y$. A principal stratum for subject $i$ is defined by the pair $S_i = (D_i(0), D_i(1))$, which represents the value of the intermediate variable $D$ under both treatment assignments ($Z=0$ and $Z=1$). With binary $D$, the four possible strata are never-users $(0,0)$, compliers $(0,1)$, defiers $(1,0)$, and always-users $(1,1)$.

Key assumptions frequently invoked include:

  • Monotonicity (e.g., $D_i(0) = 0$): one cannot use the intermediate if not treated (e.g., one cannot use a debit card without possessing one).
  • Principal Ignorability/Unconfoundedness: $\{Y_i(1), Y_i(0), D_i(1), D_i(0)\} \perp Z_i \mid X_i$; treatment assignment is ignorable after adjustment for covariates $X$.

Typical causal estimands are defined within specific strata:

  • CATE (Complier Average Treatment Effect): $E[Y_i(1) - Y_i(0) \mid S_i = c]$.
  • CATT (Average Treatment Effect for Treated Compliers): $E[Y_i(1) - Y_i(0) \mid S_i = c, Z_i = 1]$.

This stratification correctly targets, for example, the effect of possessing/using a debit card on cash holdings only for those households who would use the card if possessed (compliers), avoiding misleading conclusions driven by never-users.
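
To make these estimands concrete, the following is a minimal simulation sketch in Python; the sample size, effect magnitudes, and variable names are illustrative assumptions, not values from the paper. With the full set of potential outcomes available (as only a simulation permits), the complier-specific contrast is simply an average over the latent complier stratum.

```python
import numpy as np

# Illustrative simulation (numbers are assumptions, not from the paper).
# Under monotonicity D_i(0) = 0, units are either compliers (D(1) = 1)
# or never-users (D(1) = 0).
rng = np.random.default_rng(0)
n = 100_000
complier = rng.random(n) < 0.6            # latent stratum indicator S_i = c
y0 = rng.normal(10.0, 2.0, n)             # potential outcome under Z = 0
effect = np.where(complier, -3.0, 0.0)    # treatment moves Y only for compliers
y1 = y0 + effect + rng.normal(0.0, 1.0, n)

# Complier average effect: E[Y(1) - Y(0) | S = c]
print((y1 - y0)[complier].mean())         # ~ -3.0
```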

2. Model-Based Estimation with Propensity Score Adjustment

Estimation in observational settings requires adjusting for confounding due to imbalanced baseline covariates. This is accomplished by introducing the propensity score $e(X_i) = P(Z_i = 1 \mid X_i)$, typically estimated via logistic regression (Mercatanti et al., 2015). The modeling process has three layers:

  • Propensity Score Model: Logistic regression for $e(X)$.
  • Principal Strata Model: $S_i$ classification via logistic regression as a function of $e(X)$. For example,

$\operatorname{logit} P(S_i = \text{never-user} \mid X_i = x) = \alpha_0 + \alpha\, e(x)$

  • Potential Outcome Model: Linear regression for $Y_i(z)$, differentiated by principal stratum,

$Y_i(z) = 1_{\{S_i = c\}}\,[\beta_{c0} + z\,\theta_c + \beta_{c1}\, e(x)] + 1_{\{S_i = n\}}\,[\beta_{n0} + \beta_{n1}\, e(x)] + \epsilon_i$

where $\theta_c$ is the causal effect among compliers.
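
As a concrete illustration, here is a minimal sketch of the three layers in Python. The function names, the normal-error assumption in the density, and the scikit-learn/SciPy choices are ours, not prescribed by the paper.

```python
from scipy.special import expit
from scipy.stats import norm
from sklearn.linear_model import LogisticRegression

def estimate_propensity(X, z):
    """Layer 1: e(X) = P(Z = 1 | X) via logistic regression."""
    return LogisticRegression(max_iter=1000).fit(X, z).predict_proba(X)[:, 1]

def p_never_user(e, alpha0, alpha):
    """Layer 2: P(S = never-user | e(X)); the complier probability is the complement."""
    return expit(alpha0 + alpha * e)

def outcome_mean(z, e, stratum, b_c0, theta_c, b_c1, b_n0, b_n1):
    """Layer 3: stratum-specific mean of Y(z), linear in e(X)."""
    if stratum == "c":
        return b_c0 + z * theta_c + b_c1 * e
    return b_n0 + b_n1 * e        # never-users: no treatment-effect term

def outcome_density(y, z, e, stratum, params):
    """Normal density of Y given the stratum; normality is a sketch assumption.
    params = (b_c0, theta_c, b_c1, b_n0, b_n1, sigma)."""
    mu = outcome_mean(z, e, stratum, *params[:-1])
    return norm.pdf(y, loc=mu, scale=params[-1])
```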

Maximum likelihood estimation is performed via the EM algorithm, treating the latent stratum as missing data. The approach uses the estimated propensity score as a single regressor in both the stratum and outcome models, reducing the mis-specification and overfitting risks that can arise with high-dimensional or imbalanced covariates.
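
Below is a minimal sketch of the E-step, reusing `p_never_user` and `outcome_density` from the previous sketch; the M-step, which re-fits the weighted stratum and outcome models, is indicated only in a comment.

```python
import numpy as np

def e_step(y, z, d, e, alpha0, alpha, params):
    """Posterior stratum probabilities under monotonicity D(0) = 0.
    Treated units reveal their stratum through the observed D; untreated
    units (all of whom have D = 0) are a mixture of compliers and never-users."""
    pc = 1.0 - p_never_user(e, alpha0, alpha)    # prior P(S = c | e(X))
    f_c = outcome_density(y, z, e, "c", params)  # outcome likelihood if complier
    f_n = outcome_density(y, z, e, "n", params)  # outcome likelihood if never-user
    posterior = pc * f_c / (pc * f_c + (1.0 - pc) * f_n)
    # When Z = 1 the stratum is observed: D = 1 => complier, D = 0 => never-user.
    w = np.where(z == 1, d.astype(float), posterior)
    # M-step (not spelled out): weighted logistic regression of w on e(X) for
    # (alpha0, alpha) and weighted least squares for the outcome parameters;
    # iterate until the observed-data log-likelihood converges.
    return w
```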

3. Sensitivity Analysis to Violations of Unconfoundedness

Because unconfoundedness is untestable in observational studies, the model is extended for sensitivity analysis to capture:

  • S-confounding: Bias in the latent strata composition across treatments. Modeled by allowing stratum probabilities to depend on the observed treatment, adding a parameter $\xi$:

$\operatorname{logit} P(S_i = n \mid Z_i = z, X_i = x) = \alpha_0 + \alpha\, e(x) + \xi z$

Here, $\exp(\xi)$ is an odds ratio for stratum membership between treated and untreated groups after adjusting for $X$.

  • Y-confounding: Bias in potential outcomes within strata across treatments. Modeled by introducing sensitivity parameters $\eta_c$ and $\eta_n$ in the outcome regression:

$Y_i(z_1) = 1_{\{S_i = c\}}\,[\beta_{c0} + z_1\, \theta_c + z_2\, \eta_c + \beta_{c1}\, e(x)] + 1_{\{S_i = n\}}\,[\beta_{n0} + z_2\, \eta_n + \beta_{n1}\, e(x)] + \epsilon_i$

where $z_1$ indexes the potential outcome and $z_2$ is the observed treatment assignment. $\theta_c$ and $\eta_c$ cannot both be identified from the data; thus, typically, $\eta_c$ is fixed (e.g., at 0) and $\eta_n$ is varied across a plausible range (e.g., $-400$ to $400$, in thousands of Italian Lira).

By re-estimating causal effects over a sensitivity grid for the confounding parameters, the robustness of CATT and related estimands to violations of unconfoundedness can be assessed.
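
The grid search itself is mechanical. Below is a sketch in which `fit_model` is a hypothetical wrapper (not an API from any library) around the EM routine that holds the sensitivity parameters fixed and returns the point estimate of $\theta_c$; the grid bounds echo the ranges mentioned above.

```python
import itertools
import numpy as np

def sensitivity_grid(fit_model, y, z, d, X,
                     xi_grid=np.linspace(-1.0, 1.0, 5),
                     eta_n_grid=np.linspace(-400.0, 400.0, 5)):
    """Re-estimate the complier effect over a grid of sensitivity parameters.
    `fit_model` is a hypothetical wrapper assumed to re-run the EM fit with
    (xi, eta_n) held fixed and return theta_c_hat."""
    results = {
        (xi, eta_n): fit_model(y, z, d, X, xi=xi, eta_n=eta_n)
        for xi, eta_n in itertools.product(xi_grid, eta_n_grid)
    }
    # Report the range of theta_c across the grid as the robustness summary.
    return results, (min(results.values()), max(results.values()))
```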

4. Implementation Workflow and Algorithmic Features

The implementation involves:

  • Estimate the propensity score $e(X)$ via logistic regression of $Z$ on the pre-treatment covariates $X$.
  • Estimate principal stratum probabilities as a function of $e(X)$ using a second logistic regression. Under monotonicity, the observed $D$ reveals stratum membership for treated units ($Z=1$), while untreated units ($Z=0$, for whom $D=0$ regardless of stratum) have latent membership handled via the EM algorithm.
  • Fit the outcome regression, in the baseline form of Section 2 or the sensitivity-extended form of Section 3, incorporating $e(X)$, (latent) stratum membership, and treatment.
  • Sensitivity analysis: Repeat principal effect estimation across a grid of sensitivity parameters ($\xi$, $\eta_n$), reporting the range of estimated causal effects.
  • Model fit and inference: Standard errors and confidence intervals are estimated via likelihood-based inference, propagating uncertainty due to the latent strata through the information matrix estimated alongside the EM algorithm; see the sketch after this list.
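
For the inference step, here is a generic sketch of likelihood-based standard errors via a numerically differentiated observed information matrix. This is a simple finite-difference alternative to EM-specific formulas; it assumes a function `nll` computing the observed-data negative log-likelihood (with the latent strata marginalized out, the quantity the EM algorithm maximizes) is available.

```python
import numpy as np

def observed_info_se(nll, theta_hat, h=1e-5):
    """Standard errors from the observed information matrix: the Hessian of
    the negative log-likelihood at the MLE, approximated by central finite
    differences, is inverted to approximate the parameter covariance."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    k = theta_hat.size
    H = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            def f(di, dj):
                t = theta_hat.copy()
                t[i] += di
                t[j] += dj
                return nll(t)
            H[i, j] = (f(h, h) - f(h, -h) - f(-h, h) + f(-h, -h)) / (4.0 * h * h)
    return np.sqrt(np.diag(np.linalg.inv(H)))   # SEs from inverse information
```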

5. Empirical Findings and Practical Impact

Applying this methodology to household survey data from Italy demonstrates that principal stratification enables valid causal statements about the effect of debit card possession/use on cash holdings, specifically for the subpopulation of compliers (those who would use a debit card if they possessed one). Key findings (Mercatanti et al., 2015):

  • Substantial Reduction in Cash Holdings: The model finds a 70–80% reduction in household cash inventories among users, a statistically robust effect (as evaluated using the sensitivity analysis).
  • Robustness: The results are largely insensitive to plausible violations of the unconfoundedness assumption; even moderate to large values of the sensitivity parameters do not attenuate the estimated negative effect.

This layered estimation and robustness-checking approach provides a template for empirical analysis in any setting with a salient post-treatment variable, noncompliance, or partial take-up.

6. Methodological Significance and Limitations

Principal stratification-based methods, as demonstrated in this analysis, provide several benefits:

  • Sharpening causal questions to meaningful subpopulations (principal strata) rather than ambiguous “average” effects in mixed populations.
  • Systematic handling of post-treatment variables, including non-use or partial compliance.
  • Rigorous adjustment for pre-exposure covariate imbalance via the propensity score, reducing model dependence and the risk of omitted variable bias.
  • Comprehensive sensitivity analysis strategies that explicitly quantify the impact of unmeasured confounding at both the stratum assignment and outcome level.

Limitations include:

  • Reliance on untestable identification assumptions (unconfoundedness, monotonicity) that, if strongly violated, may induce bias beyond the demonstrated sensitivity range.
  • The necessity to correctly specify the parametric forms of both the stratum and outcome models; model mis-specification remains a practical threat, although the propensity-score-based approach mitigates this risk.
  • The approach yields effect estimates local to principal strata (i.e., compliers), which may comprise a non-representative subset of the population.

7. Broader Applicability

The principal stratification framework is applicable whenever a key mediating, compliance, or usage variable stands between randomized/observed exposure and the outcome of interest, and especially where that variable is only observed post-treatment. The workflow and sensitivity tools demonstrated here set a rigorous standard for future economic, epidemiological, and social science studies aiming to disentangle direct causal mechanisms in the presence of intermediate post-treatment variables.
