Stacked DiD Event Study Methodology

Updated 2 June 2026

A stacked DiD event study is a methodological framework that constructs cohort-specific 2×2 DiD contrasts to estimate dynamic treatment effects in panel data.
It realigns calendar dates into event time, enabling clear comparisons of pre- and post-treatment outcomes across units with staggered treatment timing.
Extensions such as normalization, covariate balancing, and optimal weighting improve bias control and facilitate robust causal inference.

A stacked Difference-in-Differences (DiD) event study is a methodological framework for estimating intertemporal treatment effects in panel data settings where treatment timing is staggered, treatment may be non-binary or non-absorbing, and outcomes may exhibit dynamic or lagged effects. It addresses limitations of conventional two-way fixed-effects (TWFE) and local projection estimators by constructing cohort-specific 2×2 DiD contrasts and aggregating these with transparent weighting, thereby yielding estimators with robust causal interpretation under clear parallel-trends assumptions.

1. Model Setup and Potential Outcomes

The stacked DiD event study requires panel data $\{(Y_{i,t}, D_{i,t}): i=1,\ldots,N;\; t=1,\ldots,T\}$, where $D_{i,t}$ is a possibly multi-level or non-absorbing treatment. The potential outcome framework defines $Y_{i,t}(d_{1},\ldots,d_{T})$ as the outcome for unit $i$ under treatment path $(d_{1},\ldots,d_{T})$, with observed $Y_{i,t}=Y_{i,t}(D_{i,1},\ldots,D_{i,T})$. Treatment need not be binary or permanent; $D_{i,t}$ can vary in $\mathbb{R}^+$.

The no-anticipation assumption states that $Y_{i,t}(d_{1},\ldots,d_{T}) = Y_{i,t}(d_{1},\ldots,d_{t})$, i.e., potential outcomes at time $t$ depend only on treatment up to that period. The key identifying assumption is parallel trends for the status-quo outcome: for any two units $i, j$ with the same baseline dose $D_{i,1} = D_{j,1}$,

$$E\left[Y_{i,t}(D_{i,1},\ldots,D_{i,1}) - Y_{i,t-1}(D_{i,1},\ldots,D_{i,1})\right] = E\left[Y_{j,t}(D_{j,1},\ldots,D_{j,1}) - Y_{j,t-1}(D_{j,1},\ldots,D_{j,1})\right]$$

for all $t \ge 2$, where the $(D_{i,1},\ldots,D_{i,1})$-path denotes the status-quo (no-switch) path.

2. Event Time, Cohort Structure, and Data Stacking

Let $F_i = \min\{t \ge 2 : D_{i,t} \neq D_{i,1}\}$, the first period when unit $i$'s treatment changes. Event time for unit $i$ is defined by $\ell = t - F_i + 1$, with $\ell = 1$ indicating the first post-switch period. Each value of $F_i$ defines a cohort; stacking converts each cohort's calendar dates to event time indexed by $\ell$.

Stacking constructs a pooled dataset for estimation by realigning the time axis of observations based on treatment adoption events rather than calendar time. This enables the comparison of pre- and post-treatment outcomes across units with different treatment timings, supporting dynamic effect estimation.
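The realignment can be illustrated with a minimal sketch. It assumes that a unit's cohort is the first calendar period in which its treatment differs from its period-1 value, and that event time equals calendar time minus that period plus one, so that event time 1 is the first post-switch period. The toy panel and the function names `first_switch` and `stack_event_time` are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy panel: unit -> treatment path; list position t-1 holds period t.
treatment: Dict[str, List[float]] = {
    "a": [0, 0, 1, 1, 1],  # first switch in period 3
    "b": [0, 0, 0, 0, 1],  # first switch in period 5
    "c": [0, 0, 0, 0, 0],  # never switches
}


def first_switch(path: List[float]) -> Optional[int]:
    """First calendar period (1-indexed) whose treatment differs from the
    period-1 baseline, or None if the unit never switches."""
    for t, d in enumerate(path, start=1):
        if d != path[0]:
            return t
    return None


def stack_event_time(panel: Dict[str, List[float]]) -> List[Tuple[str, int, int, int]]:
    """Rows of (unit, cohort, calendar_period, event_time) for switchers.

    Assumed convention: event_time = t - cohort + 1, so event_time == 1
    is the first post-switch period and values <= 0 are pre-switch periods.
    """
    rows = []
    for unit, path in panel.items():
        cohort = first_switch(path)
        if cohort is None:
            continue  # never-switchers have no event time; they act only as controls
        for t in range(1, len(path) + 1):
            rows.append((unit, cohort, t, t - cohort + 1))
    return rows


stacked = stack_event_time(treatment)
for row in stacked:
    print(row)
```

After stacking, units "a" and "b" share a common event-time axis even though they switch in different calendar periods, which is what allows their pre- and post-switch outcomes to be compared directly.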

3. Stacked DiD Event-Study Estimator Construction

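For each unit $i$ with first-switch period $F_i$ and each event time $\ell \ge 1$, define:

$t_{i,\ell} = F_i - 1 + \ell$, $S_i = \operatorname{sign}(D_{i,F_i} - D_{i,1})$
$\Delta Y_{i,\ell} = Y_{i,t_{i,\ell}} - Y_{i,F_i-1}$
The control pool $\mathcal{C}_{i,\ell}$ consists of all $j$ with $D_{j,1} = D_{i,1}$ and $F_j > t_{i,\ell}$ (units that never switch are always eligible).

The unit-level DiD is:

$$\mathrm{DID}_{i,\ell} = \Delta Y_{i,\ell} - \frac{1}{|\mathcal{C}_{i,\ell}|} \sum_{j \in \mathcal{C}_{i,\ell}} \left(Y_{j,t_{i,\ell}} - Y_{j,F_i-1}\right)$$

Aggregating over all eligible units at event time $\ell$:

$$\mathrm{DID}_{\ell} = \frac{1}{N_\ell} \sum_{i:\, t_{i,\ell} \le T,\; \mathcal{C}_{i,\ell} \neq \emptyset} S_i \, \mathrm{DID}_{i,\ell}$$

where $N_\ell$ is the number of eligible units. Under the maintained assumptions, $\mathrm{DID}_{\ell}$ identifies the average effect of $\ell$ periods of exposure to a non-baseline treatment.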
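The construction in this section can be sketched in a few lines. The sketch assumes that each switcher's long difference runs from the period just before its first switch to the period `ell - 1` periods after the switch, that controls share the switcher's period-1 treatment and have not yet switched by the outcome period, and that contrasts are signed by the direction of the first switch before averaging. The toy data and the names `unit_did` and `aggregate_did` are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical toy panel; list position t-1 holds calendar period t = 1..4.
D: Dict[str, List[float]] = {
    "a": [0, 0, 1, 1], "b": [0, 0, 0, 1],
    "c": [0, 0, 0, 0], "d": [0, 0, 0, 0],
}
Y: Dict[str, List[float]] = {
    "a": [1.0, 2.0, 5.0, 7.0], "b": [0.0, 1.0, 2.0, 6.0],
    "c": [2.0, 3.0, 4.0, 5.0], "d": [1.0, 2.0, 3.0, 4.0],
}


def first_switch(path: List[float]) -> Optional[int]:
    """First period (1-indexed) whose treatment differs from period 1, else None."""
    for t, d in enumerate(path, start=1):
        if d != path[0]:
            return t
    return None


def unit_did(i: str, ell: int) -> Optional[float]:
    """Unit-level contrast at event time ell, or None if it cannot be formed."""
    f = first_switch(D[i])
    if f is None or f - 1 + ell > len(D[i]):
        return None
    pre, post = f - 1, f - 1 + ell  # reference period and outcome period
    controls = []
    for j in D:
        fj = first_switch(D[j])
        # Same baseline treatment and not yet switched by the outcome period.
        if D[j][0] == D[i][0] and (fj is None or fj > post):
            controls.append(j)
    if not controls:
        return None
    own = Y[i][post - 1] - Y[i][pre - 1]
    ctrl = sum(Y[j][post - 1] - Y[j][pre - 1] for j in controls) / len(controls)
    return own - ctrl


def aggregate_did(ell: int) -> Optional[float]:
    """Average of signed unit-level contrasts over all eligible switchers."""
    terms = []
    for i in D:
        did = unit_did(i, ell)
        if did is not None:
            f = first_switch(D[i])
            sign = 1.0 if D[i][f - 1] > D[i][0] else -1.0
            terms.append(sign * did)
    return sum(terms) / len(terms) if terms else None


print(aggregate_did(1), aggregate_did(2))  # 2.5 3.0
```

In this toy panel, unit "b" serves as a control for unit "a" at event time 1 (it has not yet switched in period 3) but drops out of the control pool at event time 2, which is why the number of contributing contrasts shrinks at longer horizons.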

4. Extensions: Normalization, Weighting, and Covariate Balancing

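When treatment is non-binary, normalization is required for interpretable effect scaling. With $F_i$ denoting unit $i$'s first-switch period, the cumulative treatment increment over the first $\ell$ post-switch periods is

$$\delta^{D}_{i,\ell} = \sum_{k=0}^{\ell-1} \left(D_{i,F_i+k} - D_{i,1}\right)$$

The normalized effect at $\ell$ divides the unit-level contrast $\mathrm{DID}_{i,\ell}$ by this increment:

$$\mathrm{DID}^{n}_{i,\ell} = \frac{\mathrm{DID}_{i,\ell}}{\delta^{D}_{i,\ell}}$$

Aggregated as

$$\mathrm{DID}^{n}_{\ell} = \frac{\mathrm{DID}_{\ell}}{\bar{\delta}^{D}_{\ell}}$$

where $\mathrm{DID}_{\ell}$ is the aggregated contrast at event time $\ell$ and $\bar{\delta}^{D}_{\ell} = \frac{1}{N_\ell} \sum_{i} |\delta^{D}_{i,\ell}|$, averaged over the $N_\ell$ eligible units, is the mean absolute increment.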
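A small numerical sketch may help fix ideas. It assumes the normalization divides each contrast by the cumulative change in dose relative to the period-1 baseline, and that the aggregate divides the signed average contrast by the mean absolute increment. The treatment paths, first-switch periods `F`, and unit-level contrasts `did_ell1` below are made-up inputs, and the function names are hypothetical.

```python
from typing import Dict, List

# Hypothetical inputs: treatment paths (period t at index t-1), each unit's
# first-switch period, and unit-level DiD contrasts at event time ell = 1.
D: Dict[str, List[float]] = {"a": [1.0, 1.0, 3.0, 4.0], "b": [2.0, 2.0, 1.0, 1.0]}
F: Dict[str, int] = {"a": 3, "b": 3}
did_ell1: Dict[str, float] = {"a": 5.0, "b": -3.0}


def cumulative_increment(path: List[float], f: int, ell: int) -> float:
    """Sum over the first ell post-switch periods of (dose - baseline dose)."""
    return sum(path[f - 1 + k] - path[0] for k in range(ell))


def normalized_unit(i: str, did: float, ell: int) -> float:
    """Unit-level contrast per unit of cumulative dose change."""
    return did / cumulative_increment(D[i], F[i], ell)


def normalized_aggregate(dids: Dict[str, float], ell: int) -> float:
    """Signed mean contrast divided by the mean absolute increment."""
    signed = []
    abs_incs = []
    for i, did in dids.items():
        sign = 1.0 if D[i][F[i] - 1] > D[i][0] else -1.0  # direction of first switch
        signed.append(sign * did)
        abs_incs.append(abs(cumulative_increment(D[i], F[i], ell)))
    return (sum(signed) / len(signed)) / (sum(abs_incs) / len(abs_incs))


print(normalized_unit("a", did_ell1["a"], 1))  # 2.5
print(normalized_unit("b", did_ell1["b"], 1))  # 3.0
print(normalized_aggregate(did_ell1, 1))
```

In this example the aggregate equals the average of the unit-level normalized effects (2.5 and 3.0) weighted by their absolute increments (2 and 1), that is 8/3, so units with larger dose changes receive proportionally more weight.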

Recent refinements address covariate imbalance between treated and controls within sub-experiments. The Covariate-Balanced Weighted Stacked DiD (CBWSDID) estimator implements initial matching or weighting (e.g., propensity scores, entropy balancing, or nearest-neighbor matching) within each cohort-event-time stratum, followed by corrective reweighting across sub-experiments to ensure that the final estimator targets the aggregate ATT under covariate-conditional parallel trends (Ustyuzhanin, 2 Apr 2026).

5. Comparison to Other Event-Study Estimators

TWFE event-study estimators, based on group and time fixed effects plus event-time dummies, can produce biased or misleading estimates in the presence of heterogeneous treatment effects, as their weights on dynamic causal effects can be negative or non-convex (Chaisemartin et al., 2020). Panel-data local projections (regressions of $Y_{i,t+\ell}$ on $D_{i,t}$ plus FEs) similarly mix effects across event times and can also yield contaminated or attenuated coefficients.

Stacked DiD event studies maintain clear causal interpretation by contrasting each post-treatment period with only not-yet-treated synthetic controls, ensuring identification robustness. They also permit explicit organization of placebo checks (using pre-event leads) to test for parallel pre-trends.
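A placebo check of this kind can be sketched as follows. The sketch assumes one possible design: for each switcher, the outcome change between two pre-switch periods is compared with the same change among units that share its baseline treatment and have not switched within the mirrored post-switch horizon. The toy data and the names `placebo_did` and `mean_placebo` are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical toy panel; list position t-1 holds calendar period t = 1..4.
D: Dict[str, List[float]] = {
    "a": [0, 0, 1, 1], "b": [0, 0, 0, 1],
    "c": [0, 0, 0, 0], "d": [0, 0, 0, 0],
}
Y: Dict[str, List[float]] = {
    "a": [1.0, 2.0, 5.0, 7.0], "b": [0.0, 1.0, 2.0, 6.0],
    "c": [2.0, 3.0, 4.0, 5.0], "d": [1.0, 2.0, 3.0, 4.0],
}


def first_switch(path: List[float]) -> Optional[int]:
    """First period (1-indexed) whose treatment differs from period 1, else None."""
    for t, d in enumerate(path, start=1):
        if d != path[0]:
            return t
    return None


def placebo_did(i: str, ell: int) -> Optional[float]:
    """Pre-switch contrast: change from period F-1 back to period F-1-ell,
    net of the same change among not-yet-switched controls."""
    f = first_switch(D[i])
    if f is None:
        return None
    ref, back, horizon = f - 1, f - 1 - ell, f - 1 + ell
    if back < 1 or horizon > len(D[i]):
        return None  # window falls outside the observed panel
    ctrl = []
    for j in D:
        fj = first_switch(D[j])
        if D[j][0] == D[i][0] and (fj is None or fj > horizon):
            ctrl.append(Y[j][back - 1] - Y[j][ref - 1])
    if not ctrl:
        return None
    return (Y[i][back - 1] - Y[i][ref - 1]) - sum(ctrl) / len(ctrl)


def mean_placebo(ell: int) -> Optional[float]:
    """Average placebo contrast over switchers for which it can be formed."""
    vals = [v for v in (placebo_did(i, ell) for i in D) if v is not None]
    return sum(vals) / len(vals) if vals else None


print(mean_placebo(1))  # 0.0
```

Here every unit follows the same pre-switch trend, so the placebo contrast is zero; a value far from zero relative to its standard error would cast doubt on parallel trends.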

Efficient extensions utilize semiparametric influence functions and optimal weighting for variance minimization, leveraging overidentifying moment restrictions and achieving the semiparametric efficiency bound under regularity conditions (Chen et al., 21 Jun 2025). Imputation-based estimators further recast the event-study as estimating the difference between actual and counterfactual outcomes predicted from untreated samples with unit and time fixed effects (Borusyak et al., 2021).
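The imputation idea can be illustrated with a minimal sketch, which is not the cited authors' implementation. It assumes a binary absorbing treatment and an additive model with unit and period effects, fits those effects on untreated observations only by alternating averages (a simple backfitting scheme chosen here for brevity), and averages the gap between observed and imputed outcomes over treated observations. The data and the name `impute_att` are hypothetical.

```python
from typing import List

# Hypothetical noise-free panel: Y = unit effect + period effect + 2 * D.
unit_fx = [1.0, 2.0, 3.0, 4.0]
time_fx = [0.0, 1.0, 2.0, 3.0]
D: List[List[int]] = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]]
Y: List[List[float]] = [
    [unit_fx[i] + time_fx[t] + 2.0 * D[i][t] for t in range(4)] for i in range(4)
]


def impute_att(Y: List[List[float]], D: List[List[int]], n_iter: int = 200) -> float:
    """Average treated-minus-imputed outcome under an additive two-way model.

    Requires every unit and every period to have at least one untreated
    observation; otherwise the corresponding effect is not estimable.
    """
    n, T = len(Y), len(Y[0])
    alpha = [0.0] * n  # unit effects
    lam = [0.0] * T    # period effects
    for _ in range(n_iter):
        # Update each unit effect using that unit's untreated periods only.
        for i in range(n):
            obs = [Y[i][t] - lam[t] for t in range(T) if D[i][t] == 0]
            alpha[i] = sum(obs) / len(obs)
        # Update each period effect using the units untreated in that period.
        for t in range(T):
            obs = [Y[i][t] - alpha[i] for i in range(n) if D[i][t] == 0]
            lam[t] = sum(obs) / len(obs)
    # Gap between observed and imputed untreated outcome on treated cells.
    gaps = [
        Y[i][t] - alpha[i] - lam[t]
        for i in range(n) for t in range(T) if D[i][t] != 0
    ]
    return sum(gaps) / len(gaps)


print(round(impute_att(Y, D), 6))  # 2.0
```

Because the toy data contain no noise and a constant effect of 2, the imputed counterfactuals are recovered almost exactly; with real data the same gaps would be averaged by event time to trace out dynamic effects.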

6. Implementation, Inference, and Diagnostic Procedures

A typical implementation comprises:

Calculation of cohort indicators ($F_i$) and event times ($\ell$), and construction of treated and control samples in the event window.
Estimation of cohort-event-time DiD contrasts, with optional normalization or covariate balancing.
Aggregation with appropriately derived weights, possibly including CBWSDID corrective weights.
Inference via cluster-robust or bootstrap standard errors. Analytic variance estimators based on influence functions and multiplier bootstrapping for simultaneous confidence bands are now standard (Uhr et al., 29 Apr 2026).
Placebo tests/exclusionary pre-trend regressions, often fitting robust OLS models on untreated samples and, if required, leave-one-out variance estimation for conservative coverage (Borusyak et al., 2021).
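The inference step can be sketched with a unit-level (cluster) bootstrap, which is only one of the options listed above and not the analytic or multiplier procedure of the cited work. The sketch assumes a binary treatment that only switches on, so no sign adjustment is needed, resamples whole units with replacement, and recomputes the event-time contrast on each draw. The synthetic data generator, the deterministic `noise` function, and the names `did_ell` and `cluster_bootstrap_se` are hypothetical.

```python
import random
from statistics import stdev
from typing import Dict, List, Optional

N_UNITS, T = 12, 4


def noise(u: int, t: int) -> float:
    """Small deterministic perturbation in [-0.2, 0.2] standing in for shocks."""
    return ((7 * u + 3 * t) % 5 - 2) * 0.1


# Hypothetical panel: units 0-5 switch on in period 3 with a true effect of 2.
D: Dict[int, List[int]] = {
    u: ([0, 0, 1, 1] if u < 6 else [0, 0, 0, 0]) for u in range(N_UNITS)
}
Y: Dict[int, List[float]] = {
    u: [0.5 * u + t + 2.0 * D[u][t - 1] + noise(u, t) for t in range(1, T + 1)]
    for u in range(N_UNITS)
}


def first_switch(path: List[int]) -> Optional[int]:
    """First period (1-indexed) whose treatment differs from period 1, else None."""
    for t, d in enumerate(path, start=1):
        if d != path[0]:
            return t
    return None


def did_ell(sample: List[int], ell: int) -> Optional[float]:
    """Event-time contrast on a list of unit ids (repeats allowed)."""
    effects = []
    for i in sample:
        f = first_switch(D[i])
        if f is None or f - 1 + ell > T:
            continue
        pre, post = f - 1, f - 1 + ell
        ctrl = [
            Y[j][post - 1] - Y[j][pre - 1]
            for j in sample
            if D[j][0] == D[i][0]
            and (first_switch(D[j]) is None or first_switch(D[j]) > post)
        ]
        if ctrl:
            effects.append(Y[i][post - 1] - Y[i][pre - 1] - sum(ctrl) / len(ctrl))
    return sum(effects) / len(effects) if effects else None


def cluster_bootstrap_se(ell: int, n_boot: int = 200, seed: int = 0) -> float:
    """Standard deviation of the contrast across unit-level resamples."""
    rng = random.Random(seed)
    units = sorted(D)
    draws = []
    for _ in range(n_boot):
        sample = [rng.choice(units) for _ in units]  # resample whole units
        est = did_ell(sample, ell)
        if est is not None:  # skip draws with no switchers or no controls
            draws.append(est)
    return stdev(draws)


point = did_ell(sorted(D), 1)
se = cluster_bootstrap_se(1)
print(point, se)
```

Resampling entire units keeps each unit's full time series together, which preserves within-unit serial correlation in the bootstrap distribution. Simultaneous bands across several event times would need a joint procedure rather than this per-horizon standard error.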

Empirical applications frequently compare stacked DiD estimators to conventional TWFE, weighted or matched panel matching, and local-projection methods, typically demonstrating reduced bias, recovery of dynamic ATT, and credible inference even under deviations from strong parallel trends.

7. Applications, Extensions, and Simulation Evidence

Stacked DiD event-study methods have been applied across policy evaluation, labor, macro, and environmental economics. For repeated treatment settings, extensions allow for absorbing and non-absorbing (switching on/off) episodes, under finite-memory assumptions and episode-specific parallel trends (Ustyuzhanin, 2 Apr 2026). Simulations consistently reveal that design-based covariate adjustment, coupled with corrective weighting, can eliminate pre-trend bias and restore uniform coverage for long-run effects.

Available software (e.g., cbwsdid in R) streamlines implementation of these estimators and supports design diagnostics. Empirical results confirm substantial improvements in bias, precision, and pre-trend control relative to unadjusted or TWFE approaches, especially when parallel trends are only plausible after covariate adjustment or matching (Ustyuzhanin, 2 Apr 2026).