Difference-in-Differences with Time-varying Continuous Treatments using Double/Debiased Machine Learning (2410.21105v1)

Published 28 Oct 2024 in econ.EM and stat.ML

Abstract: We propose a difference-in-differences (DiD) method for a time-varying continuous treatment and multiple time periods. Our framework assesses the average treatment effect on the treated (ATET) when comparing two non-zero treatment doses. The identification is based on a conditional parallel trend assumption imposed on the mean potential outcome under the lower dose, given observed covariates and past treatment histories. We employ kernel-based ATET estimators for repeated cross-sections and panel data adopting the double/debiased machine learning framework to control for covariates and past treatment histories in a data-adaptive manner. We also demonstrate the asymptotic normality of our estimation approach under specific regularity conditions. In a simulation study, we find a compelling finite sample performance of undersmoothed versions of our estimators in setups with several thousand observations.

Summary

  • The paper introduces an innovative DiD extension that estimates causal effects for non-zero, time-varying treatments.
  • It employs a semiparametric, kernel-based estimator with double/debiased machine learning to flexibly control high-dimensional covariates and treatment history.
  • Simulation studies and asymptotic analysis confirm the estimator's reliable performance, enabling robust policy evaluation in complex settings.

An Essay on Difference-in-Differences with Time-Varying Continuous Treatments Using Double/Debiased Machine Learning

The research paper introduces an innovative extension of the difference-in-differences (DiD) methodology tailored for assessing the causal effects of time-varying continuous treatments across multiple time periods. This extension is particularly critical in contexts where treatment intensities change over time, such as varying vaccination rates across regions, and when strictly zero-dose control groups are not feasible. The paper presents a kernel-based estimator that is enhanced with double/debiased machine learning (DML) to efficiently control for a large set of covariates and past treatment histories. The paper's robust simulation results and rigorous theoretical foundations highlight the method's potential for accurate treatment effect estimation in complex empirical settings.

Key Contributions

A notable feature of the paper is a framework that identifies the average treatment effect on the treated (ATET) between two non-zero treatment doses, extending the conventional binary-treatment DiD setup to real-world situations where treatments take continuous values. Identification rests on a conditional parallel trend assumption imposed on the mean potential outcome under the lower dose, conditional on observed covariates and past treatment histories. The authors acknowledge that this assumption is more stringent than the parallel trend assumptions used for binary or zero-dose comparisons, reflecting the additional structure required to compare two non-zero doses.
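To fix ideas, the target parameter and identifying assumption can be sketched in stylized notation; the symbols and conditioning sets below are illustrative choices, not the paper's exact formulation:

```latex
% Stylized sketch: ATET comparing a dose d to a lower dose d', with a
% conditional parallel trend imposed on the potential outcome under d'.
% Notation is illustrative and may differ from the paper's.
\theta_t(d, d') \;=\; \mathbb{E}\bigl[\,Y_t(d) - Y_t(d') \,\bigm|\, D_t = d\,\bigr],
\qquad d > d' > 0,
\\[4pt]
\mathbb{E}\bigl[\,Y_t(d') - Y_{t-1}(d') \,\bigm|\, D_t = d,\, X,\, \bar{D}_{t-1}\,\bigr]
\;=\;
\mathbb{E}\bigl[\,Y_t(d') - Y_{t-1}(d') \,\bigm|\, D_t = d',\, X,\, \bar{D}_{t-1}\,\bigr],
```

where Y_t(d) denotes the potential outcome under dose d, X the observed covariates, and \bar{D}_{t-1} the past treatment history. The second display is the conditional parallel trend: units currently at the higher dose would, under the lower dose, have followed the same average trend as units actually at the lower dose, given X and \bar{D}_{t-1}.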

The paper introduces semiparametric kernel-based estimators for both repeated cross-sections, where different subjects are observed in each period, and panel data, where the same subjects are tracked over time. Double/debiased machine learning (DML), in the spirit of Chernozhukov et al., is used to estimate the nuisance functions, which allows high-dimensional covariates and past treatment histories to be controlled for in a data-adaptive manner. The DML approach relies on cross-fitting to prevent overfitting bias and on orthogonalized scores to reduce sensitivity to estimation error in the nuisance functions; a minimal sketch of the cross-fitting idea is given below.
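As a rough illustration of the cross-fitting idea only, the sketch below fits a kernel-weighted nuisance regression on training folds and evaluates a simple dose-contrast score on the held-out fold. It is not the paper's estimator: the regression-adjustment score, function names, and tuning choices are illustrative assumptions, and the paper's doubly robust construction is more involved.

```python
# Schematic DML-style cross-fitting for a continuous-dose DiD contrast.
# Illustrative only: the score below is a simplified regression adjustment,
# not the paper's orthogonal score.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

def gaussian_kernel(u, h):
    # Standard Gaussian kernel with bandwidth h.
    return np.exp(-0.5 * (u / h) ** 2) / (np.sqrt(2.0 * np.pi) * h)

def crossfit_dose_contrast(X, D, dY, d_hi, d_lo, h, n_folds=5):
    """Cross-fitted, kernel-weighted contrast between doses d_hi and d_lo.

    X: covariates / past treatment history (n, p); D: continuous dose (n,);
    dY: outcome change between the two periods (n,). Illustrative sketch only.
    """
    scores = np.zeros(len(dY))
    kf = KFold(n_splits=n_folds, shuffle=True, random_state=0)
    for train, test in kf.split(X):
        # Nuisance regression: outcome trend near the lower dose, learned on
        # the training folds only, kernel-weighted around d_lo.
        w_lo = gaussian_kernel(D[train] - d_lo, h)
        mu_lo = RandomForestRegressor(n_estimators=200, random_state=0)
        mu_lo.fit(X[train], dY[train], sample_weight=w_lo)
        # Held-out evaluation: observed trend near d_hi minus the predicted
        # counterfactual trend under the lower dose.
        w_hi = gaussian_kernel(D[test] - d_hi, h)
        scores[test] = w_hi * (dY[test] - mu_lo.predict(X[test]))
    # Normalize by the kernel mass at the higher dose.
    return scores.sum() / gaussian_kernel(D - d_hi, h).sum()
```

The essential design point is that each observation's score uses only nuisance functions estimated without that observation, which is what removes the own-observation overfitting bias that a plain plug-in machine learning estimator would suffer from.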

Asymptotic Properties and Simulation

The authors establish the asymptotic properties of the proposed estimators, showing asymptotic normality under certain regularity conditions. These results guarantee that, as the sample size grows, the estimators' distribution approaches a normal limit, which is what justifies standard confidence intervals and hypothesis tests. The paper also reports simulation studies demonstrating compelling finite-sample performance of undersmoothed versions of the estimators in setups with several thousand observations.
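Schematically, undersmoothing means choosing the bandwidth to shrink faster than the mean-squared-error-optimal rate so that the kernel smoothing bias becomes asymptotically negligible relative to the sampling noise. A generic form of such a limit statement (the paper's exact conditions and rates may differ) is:

```latex
% Generic asymptotic-normality statement for a kernel-based estimator with
% an undersmoothed bandwidth; constants, rates, and conditions are schematic.
\sqrt{n h}\,\bigl(\hat{\theta} - \theta\bigr)
\;\xrightarrow{\;d\;}\;
\mathcal{N}(0, V),
\qquad
h \to 0,\quad n h \to \infty,\quad \sqrt{n h}\, h^{2} \to 0 .
```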

Implications and Future Directions

The implications of this research are extensive, offering a robust methodological tool to assess causal effects in scenarios with continuous treatments and time variation. In practical terms, the research provides a reliable framework for policymakers and analysts to evaluate varying intensities of interventions, such as public health measures or economic policies, beyond binary classifications.

Theoretically, the paper advances the literature on causal inference by bridging gaps between established statistical methodologies and modern machine learning techniques. This contribution opens avenues for future research to address settings involving more complex treatment dimensions, nonlinearities, or interactions with other treatment modalities.

Overall, the paper represents a significant methodological advance for empirical researchers handling continuous, time-varying treatments, and it sets the stage for further work on causal inference under these conditions. The combination of traditional econometric principles with modern machine learning techniques marks a substantial step forward for the field. Future research could extend these methods to accommodate more nuanced temporal dynamics or incorporate richer machine learning architectures for the nuisance functions to further improve predictive accuracy and causal insights.