Compliance-Weighted LATE (CWLATE)

Updated 8 February 2026

CWLATE is a method that weights subgroup treatment effects based on empirical or modeled compliance intensity to better capture robust causal effects.
It enhances identification by focusing on regions in the data where the instrument strongly influences treatment take-up, using flexible techniques like ML for compliance estimation.
The estimator employs weighted IV and GMM approaches, delivering improved precision in environments with heterogeneous and weak compliance.

The compliance-weighted local average treatment effect (CWLATE) generalizes the conventional LATE by weighting individual or subgroup-specific causal effects according to empirical or model-derived compliance intensities. CWLATE thus targets an estimand that (i) prioritizes strata, covariate regions, or time periods where the instrument has maximal exogenous impact on treatment take-up; (ii) achieves stronger identification or inferential robustness; and (iii) aligns the definition of the average causal effect with the informativeness of the underlying data, even in settings with a weak or heterogeneous instrumental variable (IV) first stage.

1. Definitional and Theoretical Foundations

The canonical IV/LATE model considers observed outcome $Y$, binary treatment $D\in\{0,1\}$, and exogenous instrument $Z$. Let $Y_0,Y_1$ denote potential outcomes and $D(0),D(1)$ potential treatment under alternate instrument realizations. The “complier” population comprises units with $D(1)>D(0)$ under the standard monotonicity (no-defiers) restriction.

CWLATE for a covariate vector $X$ or latent selection index $U$ is a weighted mean of compliers' treatment effects, with the weight function typically proportional to the strength of instrumental compliance.

In MTE models, $U\sim Uniform[0,1]$ is a latent selection index, and the marginal treatment effect $MTE(u)=E[Y_1-Y_0|U=u]$; for two instrument values inducing first-stage probabilities $p_0<p_1$, the relevant CWLATE is

$CWLATE(z_{low},z_{high}) = \int_{p_0}^{p_1} MTE(u) du / (p_1 - p_0)$

or equivalently, using compliance weights $w(u)=1/(p_1-p_0) \cdot \mathbf{1}_{[p_0,p_1]}(u)$, $CWLATE = \int_0^1 w(u) MTE(u) du$ (Ren, 29 Dec 2025).
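As a minimal numerical sketch of the integral above, the following averages a user-supplied MTE curve over $[p_0,p_1]$. The function name `cwlate_from_mte` and the linear MTE curve are illustrative assumptions, not taken from the cited work.

```python
import numpy as np

def cwlate_from_mte(mte, p0, p1, n_grid=10_001):
    """Average mte(u) over [p0, p1], i.e. integrate with weight 1/(p1 - p0)."""
    if not 0.0 <= p0 < p1 <= 1.0:
        raise ValueError("need 0 <= p0 < p1 <= 1")
    u = np.linspace(p0, p1, n_grid)
    vals = mte(u)
    # Trapezoidal rule written out by hand to stay NumPy-version independent.
    integral = np.sum((vals[1:] + vals[:-1]) * np.diff(u)) / 2.0
    return integral / (p1 - p0)

# Hypothetical MTE curve: effects decline linearly in the selection index u.
def linear_mte(u):
    return 2.0 - 1.5 * u

value = cwlate_from_mte(linear_mte, p0=0.2, p1=0.6)
print(round(value, 6))  # average of a linear curve = its value at the midpoint u = 0.4
```

Because the weight is uniform on $[p_0,p_1]$, a linear MTE yields its midpoint value; any units with $u$ outside that interval contribute nothing.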

In covariate-dependent scenarios, CWLATE generalizes to

$\tau_{CWLATE} = \frac{E[\alpha(X)\, (Y_1-Y_0)]}{E[\alpha(X)]}$

where $\alpha(X)=P[D(1)>D(0)|X]$ is the local compliance probability (Coussens et al., 2021, Argañaraz et al., 2024).

In fuzzy RDD with covariates, cell-specific LATEs $\beta(w_j) = \delta_Y(w_j)/\delta_X(w_j)$ are aggregated with squared-first-stage weights, i.e.

$\beta_{CW} = \frac{\sum_{j=1}^m \pi_j \delta_X(w_j)^2 \beta(w_j)}{\sum_j \pi_j \delta_X(w_j)^2} = \frac{E[\delta_X(W)^2 \beta(W)]}{E[\delta_X(W)^2]}$

where $\delta_X(w_j)$ quantifies first-stage jumps at the cutoff for covariate cell $w_j$ (Caetano et al., 1 Feb 2026).
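The aggregation formula can be illustrated with a small sketch. The cell shares and jump sizes below are made-up numbers, and `squared_first_stage_aggregate` is a hypothetical helper name; in practice the jumps would come from local polynomial fits at the cutoff.

```python
import numpy as np

def squared_first_stage_aggregate(pi, delta_x, delta_y):
    """Aggregate cell LATEs delta_y/delta_x with weights pi * delta_x**2."""
    pi = np.asarray(pi, dtype=float)
    delta_x = np.asarray(delta_x, dtype=float)
    delta_y = np.asarray(delta_y, dtype=float)
    beta = delta_y / delta_x          # cell-specific LATEs
    weights = pi * delta_x ** 2       # squared-first-stage weights
    return np.sum(weights * beta) / np.sum(weights)

# Illustrative (made-up) cells: shares, first-stage jumps, reduced-form jumps.
pi = [0.5, 0.3, 0.2]
delta_x = [0.8, 0.4, 0.1]
delta_y = [1.6, 0.4, 0.3]

beta_cw = squared_first_stage_aggregate(pi, delta_x, delta_y)
# Algebraically identical form that never divides by a small first-stage jump.
beta_alt = np.dot(pi, np.multiply(delta_x, delta_y)) / np.dot(pi, np.square(delta_x))
print(beta_cw, beta_alt)
```

Since $\delta_X(w_j)^2\beta(w_j)=\delta_X(w_j)\delta_Y(w_j)$, the second form avoids dividing by a near-zero jump, which shows why cells with weak first stages do not destabilize the aggregate.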

Across these formulations, CWLATE weights are generally nonnegative and concentrate on subpopulations where the IV is most informative.

2. Identification Assumptions and Compliance Weighting

CWLATE identification requires standard IV conditions—exogeneity of $Z$ , exclusion, and monotonicity—with additional conditions tailored to each context:

Overlapping Support: The compliance-weighted region (e.g., $u \in [p_0,p_1]$) must be estimable from variation in $Z$. In discrete IV settings, unique weights are only supported where $p(z)$ changes (Ren, 29 Dec 2025).
Weak/Strong Monotonicity: Under strong monotonicity, compliance weights (e.g., $\alpha(X)$) are nonnegative, ensuring convex combinations of local effects (Słoczyński, 2020). Under weak monotonicity, some linear IV formulations may induce negative weights, complicating interpretation, unless fully interacted or squared compliance weights are used (Słoczyński, 2020, Caetano et al., 1 Feb 2026).
Minimal Relevance: CWLATE remains well-defined and point-identified as long as the instrument induces nontrivial local compliance (i.e., $Var(E[D|Z,X]|X)>0$) (Argañaraz et al., 2024). This is a weaker requirement than the full-rank condition needed by classical Wald-style estimators.
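
For a binary instrument, the minimal relevance condition can be unpacked with a short calculation. This is a sketch: the symbols $p(X)=P(Z=1|X)$ and $\delta(X)=E[D|Z=1,X]-E[D|Z=0,X]$ are notation introduced here for illustration. Writing $E[D|Z,X]=E[D|Z=0,X]+Z\,\delta(X)$ and taking the variance over $Z$ given $X$ yields

$Var(E[D|Z,X]|X) = \delta(X)^2\, p(X)\,(1-p(X))$

so the condition holds at a given $X$ exactly when the first-stage jump $\delta(X)$ is nonzero and $0<p(X)<1$. Under instrument exogeneity and monotonicity, $\delta(X)$ equals the compliance probability $P[D(1)>D(0)|X]$, so the condition only asks for some compliers at that $X$, however few.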

A practical implication is that CWLATE can be constructed and interpreted even under weak instrument conditions or in the presence of considerable first-stage heterogeneity, as it naturally down-weights nearly noncompliant strata (Ren, 29 Dec 2025, Oprescu et al., 2024).

3. Estimation Methodologies and Inference

CWLATE estimation requires two key components:

Estimation of Compliance Scores: Estimation of $\alpha(X)$ or the relevant compliance function is crucial. These may be fit parametrically (e.g., logistic regression, cell binning) or with flexible ML tools (causal forests, treatment effect learners) using cross-fitted samples for valid inference (Coussens et al., 2021, Argañaraz et al., 2024).
Weighted IV or GMM Estimation: The main estimator takes the form of a weighted Wald or two-stage least squares, with compliance probabilities serving as weights. For instance,

$\hat\tau_{CWLATE} = \frac{\sum_i w(X_i) (Z_i-p) Y_i}{\sum_i w(X_i) (Z_i-p) D_i}$

with $w(X_i)=\hat\alpha(X_i)$ for each observation, where $p$ is taken here to denote the instrument assignment probability $P(Z_i=1)$; centering $Z_i$ at $p$ in both sums makes the estimator robust to intercept shifts (Coussens et al., 2021).
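
The weighted Wald ratio can be sketched on simulated data as follows. Everything about the design is an assumption made for illustration: two covariate cells, one-sided noncompliance, a known assignment probability `p`, and compliance scores estimated by cell-level first-stage differences on the full sample (no cross-fitting).

```python
import numpy as np

def weighted_wald(y, d, z, w, p):
    """Ratio of weighted, instrument-centered sums for outcome and treatment."""
    zc = z - p
    return np.sum(w * zc * y) / np.sum(w * zc * d)

rng = np.random.default_rng(0)
n, p = 50_000, 0.5
x = rng.integers(0, 2, size=n)               # binary covariate cell
alpha = np.where(x == 1, 0.8, 0.1)           # true compliance probability per cell
tau = np.where(x == 1, 2.0, 1.0)             # true effect per cell
z = rng.binomial(1, p, size=n)               # randomized instrument
complier = rng.random(n) < alpha
d = (z * complier).astype(float)             # one-sided noncompliance
y = 0.5 * x + tau * d + rng.normal(size=n)

# Plug-in compliance scores: first-stage difference within each cell.
alpha_hat = np.empty(n)
for cell in (0, 1):
    m = x == cell
    alpha_hat[m] = d[m & (z == 1)].mean() - d[m & (z == 0)].mean()

tau_hat = weighted_wald(y, d, z, alpha_hat, p)
# In this simulated design the ratio converges to E[w*alpha*tau] / E[w*alpha];
# with w = alpha and equal cell shares this is:
target = (0.8**2 * 2.0 + 0.1**2 * 1.0) / (0.8**2 + 0.1**2)
print(tau_hat, target)
```

The low-compliance cell receives almost no weight, so its noisy cell-level Wald ratio barely moves the estimate.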

Inference for CWLATE features both frequentist and robust approaches:

Asymptotic Normality: Under appropriate regularity, $\sqrt{n}$-asymptotics and heteroskedasticity-robust errors (using the plug-in or influence function) are valid even when weights are estimated nonparametrically (Coussens et al., 2021, Słoczyński et al., 2022).
Weak IV Robustness: In cases where compliance is weak or the instrument barely affects $D$ in some regions, classical Wald intervals may under- or overcover. Modified conditional Wald or robust MLC tests (with AR-type statistics and orthogonalization against nuisance components) achieve uniformly valid confidence intervals for all linear functionals of MTE, including CWLATE (Ren, 29 Dec 2025).
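
To illustrate the idea behind AR-type inference, the sketch below inverts a generic Anderson-Rubin-style test of the moment $E[w(X)(Z-p)(Y-\tau_0 D)]=0$ over a grid of candidate values. It is not the conditional or orthogonalized procedure of the cited work; the function name, the grid, the unit weights, and the simulated data are all assumptions.

```python
import numpy as np

def ar_confidence_set(y, d, z, w, p, grid, crit=3.841):
    """Grid values tau0 not rejected at the 5% level (chi-square(1) critical value)."""
    zw = w * (z - p)
    kept = []
    for tau0 in grid:
        g = zw * (y - tau0 * d)                      # moment contributions at tau0
        stat = len(g) * g.mean() ** 2 / g.var(ddof=1)
        if stat <= crit:
            kept.append(tau0)
    return np.array(kept)

rng = np.random.default_rng(1)
n, p = 5_000, 0.5
z = rng.binomial(1, p, size=n)
d = (z * (rng.random(n) < 0.3)).astype(float)        # 30% compliers, one-sided
y = 1.0 * d + rng.normal(size=n)                     # true effect of 1.0
w = np.ones(n)                                       # unit weights for simplicity

grid = np.arange(-2.0, 4.0, 0.01)
ci = ar_confidence_set(y, d, z, w, p, grid)
tau_hat = np.sum(w * (z - p) * y) / np.sum(w * (z - p) * d)
print(ci.min(), tau_hat, ci.max())
```

The test statistic never divides by the first stage, which is why sets built this way stay valid when compliance is weak; with a very weak instrument the retained set can become wide or run to the edges of the grid.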

For designs such as fuzzy RDD, the estimator stacks cell-weighted local polynomial regressions and applies robust bias correction (RBC) for local linear estimators (Caetano et al., 1 Feb 2026).

4. Extensions: Heterogeneity, Machine Learning, and Data Combination

CWLATE natively facilitates treatment effect heterogeneity analysis. The estimator can be expressed as a compliance-weighted aggregation of covariate-conditional LATEs or CATEs:

$\tau_{CWLATE} = \int w(x) \tau_{LATE}(x) d\mu(x)$

where $w(x)$ is proportional to $Var(E[D|Z,x]|x)$ or other compliance metrics, and $\mu(x)$ is the covariate measure (Argañaraz et al., 2024). In high-dimensional designs, coarsening $X$ may be advisable for estimator stability (Caetano et al., 1 Feb 2026).

Machine learning debiasing further improves robustness by constructing orthogonal moment conditions that are insensitive to first-stage estimation error. For example, the Compliance Machine Learning estimator (CML) yields locally robust CWLATE estimates under minimal relevance and remains interpretable as a convex average over conditional LATEs (Argañaraz et al., 2024). Cross-fitting and orthogonalization are required for inferential validity in these ML-augmented settings.
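
The cross-fitting step can be sketched as below. Cell-level first-stage differences stand in for an ML learner purely to keep the example dependency-free; `cross_fit_compliance` and the simulated design are hypothetical and do not reproduce the CML estimator.

```python
import numpy as np

def cross_fit_compliance(x_cell, d, z, n_folds=5, seed=0):
    """Out-of-fold compliance scores: each unit's score uses only other folds."""
    rng = np.random.default_rng(seed)
    folds = rng.permutation(len(d)) % n_folds        # balanced random fold labels
    alpha_hat = np.full(len(d), np.nan)
    for k in range(n_folds):
        train, held_out = folds != k, folds == k
        for c in np.unique(x_cell):
            tr = train & (x_cell == c)
            # First-stage difference estimated on the training folds only.
            jump = d[tr & (z == 1)].mean() - d[tr & (z == 0)].mean()
            alpha_hat[held_out & (x_cell == c)] = jump
    return alpha_hat

rng = np.random.default_rng(2)
n = 20_000
x_cell = rng.integers(0, 2, size=n)
alpha = np.where(x_cell == 1, 0.8, 0.2)
z = rng.binomial(1, 0.5, size=n)
d = (z * (rng.random(n) < alpha)).astype(float)

alpha_hat = cross_fit_compliance(x_cell, d, z)
print(alpha_hat[x_cell == 1].mean(), alpha_hat[x_cell == 0].mean())
```

Because no unit's own treatment status enters its score, the estimated weights are not mechanically correlated with that unit's outcome noise, which is the property cross-fitting is meant to deliver.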

Combination of observational and IV datasets is possible via compliance-weighted shrinkage estimators:

Observational CATEs are first estimated (with possible confounding bias).
IV-based corrections are applied using compliance-weighted smoothing or two-stage pseudo-outcome regression, reducing variance in weak or zero-compliance regions (Oprescu et al., 2024, Shinoda et al., 2021).

Weighted least squares (WLS) formulations solve for regression coefficients minimizing compliance-weighted prediction error, with compliance probabilities in the numerator of the weight function to ensure numerical stability (Shinoda et al., 2021).
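
A minimal sketch of a compliance-weighted least squares fit follows, assuming the weights are the estimated compliance probabilities themselves. The helper name `compliance_weighted_ls` and the noise-free toy data are illustrative; the exact pseudo-outcome used by the cited authors is not reproduced here.

```python
import numpy as np

def compliance_weighted_ls(features, target, alpha_hat):
    """Minimize sum_i alpha_hat_i * (target_i - features_i @ coef)**2."""
    sw = np.sqrt(np.clip(alpha_hat, 0.0, None))      # square-root weights
    coef, *_ = np.linalg.lstsq(features * sw[:, None], target * sw, rcond=None)
    return coef

# Toy data: the target is exactly linear, and half the sample has zero compliance.
x = np.linspace(-1.0, 1.0, 50)
features = np.column_stack([np.ones_like(x), x])
target = 1.0 + 2.0 * x
alpha_hat = np.where(x > 0, 0.8, 0.0)

coef = compliance_weighted_ls(features, target, alpha_hat)
print(coef)
```

Placing the compliance probability in the numerator means zero-compliance rows simply drop out of the fit, whereas inverse-compliance weights would blow up on exactly those rows.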

5. Special Cases: Design Variants and Dynamic Treatments

Fuzzy Regression Discontinuity Designs (RDD): In classical fuzzy RDD with covariates, CWLATE is the unique WLATE maximizing first-stage strength, focusing estimation on strata with maximal discontinuity in treatment probability at the cutoff. This property yields improved estimator precision, especially in the presence of heterogeneous compliance (Caetano et al., 1 Feb 2026).
Dynamic Treatments with Static IV: When treatments are assigned over time with a single static instrument, period-specific IV estimands mix dynamic treatment effects for various latent groups, often with negative weights due to compliance type switching. CWLATE can be constructed as a convex combination of dynamic LATEs for first-period compliers, using compliance rates at each horizon as weights, ensuring nonnegative and interpretable aggregation (Ferman et al., 2023).

6. Practical Implementation and Empirical Performance

Critical implementation steps across settings include:

Estimation of compliance functions $\alpha(X)$ via cross-fitted ML or stratification.
Computation of sample weights for each unit or cell.
Fitting weighted IV, GMM, or local regression models using these weights.
Robust standard error calculation via sandwich, influence function, or empirical bootstrap methods (Coussens et al., 2021, Słoczyński et al., 2022, Ren, 29 Dec 2025).
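
The last step can be sketched for the weighted Wald ratio with a delta-method (influence-function) standard error and a nonparametric bootstrap. Both treat the weights as fixed, which is a simplifying assumption; the function names and simulated design are hypothetical.

```python
import numpy as np

def wald_with_se(y, d, z, w, p):
    """Weighted Wald ratio and its influence-function standard error."""
    a = w * (z - p) * y
    b = w * (z - p) * d
    tau = a.mean() / b.mean()
    psi = (a - tau * b) / b.mean()                   # influence function of the ratio
    return tau, psi.std(ddof=1) / np.sqrt(len(y))

def bootstrap_se(y, d, z, w, p, n_boot=200, seed=0):
    """Standard deviation of the ratio across resamples of whole observations."""
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = np.empty(n_boot)
    for r in range(n_boot):
        idx = rng.integers(0, n, size=n)
        draws[r] = wald_with_se(y[idx], d[idx], z[idx], w[idx], p)[0]
    return draws.std(ddof=1)

rng = np.random.default_rng(3)
n, p = 4_000, 0.5
x = rng.integers(0, 2, size=n)
alpha = np.where(x == 1, 0.7, 0.3)                   # compliance weights, taken as known
z = rng.binomial(1, p, size=n)
d = (z * (rng.random(n) < alpha)).astype(float)
y = 1.5 * d + rng.normal(size=n)                     # constant effect of 1.5

tau_hat, se_if = wald_with_se(y, d, z, alpha, p)
se_boot = bootstrap_se(y, d, z, alpha, p)
print(tau_hat, se_if, se_boot)
```

When the weights are themselves estimated, resampling should repeat the weight-estimation step inside each bootstrap draw, or the influence function should include the corresponding correction term.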

Software exemplars such as the Stata package kappalate operationalize CWLATE and normalized IPW variants with analytic error formulas and covariate balancing options (Słoczyński et al., 2022).

Empirical studies demonstrate that CWLATE estimators offer lower mean squared error and more stable inferences than unweighted alternatives, particularly when compliance is highly variable across observable strata (Coussens et al., 2021, Caetano et al., 1 Feb 2026). Applications include recidivism (assigned prosecutor instrument), Medicaid expansion, and cash transfer programs (Ren, 29 Dec 2025, Argañaraz et al., 2024, Caetano et al., 1 Feb 2026).

7. Interpretational Considerations and Limitations

CWLATE should be interpreted as the average treatment effect among the population (or strata) where treatment take-up is most impacted by the instrument. When monotonicity fails or the IV is weak in some regions, the estimand re-weights toward informative cells, but may not coincide with the unconditional ATE or conventional LATE estimand (Słoczyński, 2020, Ren, 29 Dec 2025). Misspecification in first-stage modeling can induce negative or ill-behaved weights in some linear IV approaches; preference should be given to methods with squared or normalized weights (e.g., fully interacted IV, squared first-stage weighting) that guarantee nonnegativity and interpretability.

In summary, CWLATE defines and identifies a class of estimands and estimators that prioritize causal identification where the IV is most effective, yielding robust, precise effect estimates in the presence of compliance heterogeneity, weak instruments, and complex treatment effect structure (Ren, 29 Dec 2025, Coussens et al., 2021, Caetano et al., 1 Feb 2026, Argañaraz et al., 2024).