Causal Models for Longitudinal and Panel Data: A Survey (2311.15458v3)

Published 26 Nov 2023 in econ.EM

Abstract: In this survey we discuss the recent causal panel data literature. This recent literature has focused on credibly estimating causal effects of binary interventions in settings with longitudinal data, emphasizing practical advice for empirical researchers. It pays particular attention to heterogeneity in the causal effects, often in situations where few units are treated and with particular structures on the assignment pattern. The literature has extended earlier work on difference-in-differences or two-way-fixed-effect estimators. It has more generally incorporated factor models or interactive fixed effects. It has also developed novel methods using synthetic control approaches.

Summary

The paper presents key innovations that extend traditional DID and TWFE methods to accommodate dynamic and heterogeneous treatment effects.
It highlights synthetic control methods as robust alternatives when standard parallel trends assumptions do not hold.
The survey discusses matrix completion and factor models that capture complex temporal dependencies in panel data analysis.

Causal Models for Longitudinal and Panel Data: An In-Depth Overview

The paper "Causal Models for Longitudinal and Panel Data: A Survey" by Dmitry Arkhangelsky and Guido Imbens reviews causal inference methods for longitudinal and panel data. It focuses on estimating the causal effects of binary interventions, and on recent methods that accommodate treatment effect heterogeneity and move beyond the traditional difference-in-differences (DID) and two-way fixed effects (TWFE) estimators.
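
For orientation, the two baseline estimators named above are usually written as follows. This is a sketch in standard textbook notation, which is an assumption here and may differ from the notation used in the survey: $Y_{it}$ is the outcome for unit $i$ in period $t$, $W_{it}$ is a binary treatment indicator, $\alpha_i$ and $\beta_t$ are unit and time fixed effects, and $\tau$ is the treatment effect.

```latex
% TWFE regression with a constant treatment effect \tau
Y_{it} = \alpha_i + \beta_t + \tau W_{it} + \varepsilon_{it}

% Two-group, two-period DID estimator (averages by group and period)
\hat{\tau}^{\mathrm{DID}}
  = \left( \bar{Y}_{\mathrm{treated},\mathrm{post}} - \bar{Y}_{\mathrm{treated},\mathrm{pre}} \right)
  - \left( \bar{Y}_{\mathrm{control},\mathrm{post}} - \bar{Y}_{\mathrm{control},\mathrm{pre}} \right)
```

In the two-group, two-period case with a balanced panel, the least-squares estimate of $\tau$ from the first equation coincides with the second expression.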

Summary and Key Contributions

The survey covers both traditional and modern approaches to causal estimation. The authors note that much of the recent literature departs from classic TWFE models by incorporating factor models, synthetic control methods, and related techniques. The survey also examines the assumptions behind existing methods and the criticisms that have been raised against them.

Extension of Traditional Methods: Traditional DID and TWFE methods, while widely used, have been criticized for relying on constant treatment effects and parallel trends. The paper reviews recent extensions that better account for dynamic and heterogeneous effects, particularly under staggered adoption, where units adopt the treatment at different times.
Synthetic Control Approaches: A focal point of the paper is synthetic control methods, which construct a weighted combination of control units to approximate the treated unit's counterfactual outcome. These methods are emphasized as an alternative to TWFE when few units are treated or when parallel trends is unlikely to hold.
Matrix Completion and Factor Models: The paper discusses matrix completion techniques, which use low-rank approximations and factor models to impute the missing counterfactual outcomes. These methods relax the TWFE framework by allowing richer patterns of correlation across units and time periods.
Nonlinear and Hybrid Methods: The authors explore nonlinear extensions as well as hybrid approaches such as Synthetic Difference-in-Differences (SDID) and Augmented Synthetic Control. The hybrid estimators combine synthetic control weighting with regression-based adjustment, which adds flexibility in modeling outcomes while remaining robust to treatment effect heterogeneity.
Design-Based Approaches: Design-based estimators treat the assignment mechanism, rather than sampling, as the source of uncertainty. The authors highlight growing interest in these approaches, particularly in settings where a unit's probability of receiving treatment changes over time.
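
The contrast between equal-weight DID and synthetic control weighting described above can be sketched on a toy panel. This is a minimal illustration, not the survey's method: all numbers are invented, each unit follows its own linear trend so that parallel trends fails by construction, and a brute-force grid search over the weight simplex stands in for the constrained optimization normally used to fit synthetic control weights. Names such as `did_estimate`, `sc_weights`, and `simplex_grid` are hypothetical helpers.

```python
# Toy comparison of difference-in-differences and synthetic control.
# All numbers are made up for illustration; none come from the survey.
from statistics import mean

TRUE_EFFECT = 2.0
PERIODS = list(range(6))      # t = 0..5
FIRST_TREATED_PERIOD = 4      # periods 0-3 are pre-treatment

# Hypothetical controls: (unit intercept, unit-specific trend slope).
CONTROL_PARAMS = [(1.0, 0.5), (2.0, 1.0), (4.0, 3.0)]
# Assumption: the treated unit is this convex combination of the controls.
TRUE_WEIGHTS = [0.5, 0.3, 0.2]


def untreated_path(intercept, slope):
    """Outcome without treatment: intercept + slope * t."""
    return [intercept + slope * t for t in PERIODS]


controls = [untreated_path(a, g) for a, g in CONTROL_PARAMS]
treated_intercept = sum(w * a for w, (a, _) in zip(TRUE_WEIGHTS, CONTROL_PARAMS))
treated_slope = sum(w * g for w, (_, g) in zip(TRUE_WEIGHTS, CONTROL_PARAMS))
treated = [
    y + (TRUE_EFFECT if t >= FIRST_TREATED_PERIOD else 0.0)
    for t, y in zip(PERIODS, untreated_path(treated_intercept, treated_slope))
]

pre = [t for t in PERIODS if t < FIRST_TREATED_PERIOD]
post = [t for t in PERIODS if t >= FIRST_TREATED_PERIOD]


def change(path):
    """Average post-period outcome minus average pre-period outcome."""
    return mean(path[t] for t in post) - mean(path[t] for t in pre)


def did_estimate():
    """DID that weights all control units equally."""
    return change(treated) - mean(change(c) for c in controls)


def simplex_grid(n_units, steps):
    """Yield integer tuples of length n_units that sum to `steps`."""
    if n_units == 1:
        yield (steps,)
        return
    for k in range(steps + 1):
        for rest in simplex_grid(n_units - 1, steps - k):
            yield (k,) + rest


def synthetic(weights, t):
    """Weighted average of control outcomes in period t."""
    return sum(w * c[t] for w, c in zip(weights, controls))


def sc_weights(steps=20):
    """Grid search for nonnegative, sum-to-one weights with the best pre-period fit."""
    def pre_fit(weights):
        return sum((treated[t] - synthetic(weights, t)) ** 2 for t in pre)

    candidates = (
        [c / steps for c in counts]
        for counts in simplex_grid(len(controls), steps)
    )
    return min(candidates, key=pre_fit)


def sc_estimate(weights):
    """Average post-period gap between the treated unit and its synthetic control."""
    return mean(treated[t] - synthetic(weights, t) for t in post)


weights = sc_weights()
print("DID estimate:", round(did_estimate(), 3))
print("SC weights:", weights)
print("SC estimate:", round(sc_estimate(weights), 3))
```

Under these assumptions the treated unit is an exact convex combination of the controls, so the synthetic control recovers the effect of 2.0, while equal-weight DID is biased (about 0.95 here) because the treated unit's trend differs from the average control trend. With noisy data or many control units, a quadratic programming solver would replace the grid search.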

Theoretical and Practical Implications

On the theoretical side, the survey stresses that the assumptions underlying each model matter as much as how the model handles treatment effect heterogeneity. In settings with staggered adoption, or where an estimator implicitly assumes away dynamic effects, the authors argue that methods should be chosen for how well they fit the characteristics of the data and the research objective.

Practically, the survey underscores the importance of selecting an estimator that suits the data configuration (panel, repeated cross-section, etc.), the unit-treatment-time dynamics, and the intended causal estimand. By contrasting methods under staggered adoption and dynamic treatment effect heterogeneity, it guides researchers to adopt more flexible models that reflect the complexity of real-world data.

Areas for Future Research

The paper closes by pointing toward several research directions. These include further work on methods that account for dynamic treatment effects and better strategies for combining experimental and observational data. Validation techniques and continuous treatment settings are also identified as areas that warrant deeper study.

Conclusion

Arkhangelsky and Imbens offer a detailed and critical account of contemporary panel data methods. The survey is a useful reference for researchers who want to understand these approaches and apply them to longitudinal and panel data.