A Survey on Causal Inference (2002.02770v1)

Published 5 Feb 2020 in stat.ME, cs.AI, cs.LG, and stat.ML

Abstract: Causal inference has been a critical research topic across many domains, such as statistics, computer science, education, public policy and economics, for decades. Nowadays, estimating causal effects from observational data has become an appealing research direction owing to the large amount of available data and low budget requirement, compared with randomized controlled trials. Aided by the rapid development of machine learning, various causal effect estimation methods for observational data have sprung up. In this survey, we provide a comprehensive review of causal inference methods under the potential outcome framework, one of the well-known causal inference frameworks. The methods are divided into two categories depending on whether they require all three assumptions of the potential outcome framework or not. For each category, both the traditional statistical methods and the recent machine learning enhanced methods are discussed and compared. The plausible applications of these methods are also presented, including the applications in advertising, recommendation, medicine and so on. Moreover, the commonly used benchmark datasets as well as the open-source codes are also summarized, which helps researchers and practitioners explore, evaluate and apply the causal inference methods.

Citations (447)

Summary

The paper provides a comprehensive review of causal inference methods to estimate treatment effects from observational data when randomized trials are impractical.
It categorizes techniques into re-weighting, stratification and matching, tree-based algorithms, representation learning, and advanced multi-task approaches to mitigate selection bias.
The survey underscores the practical significance of these methods in fields like healthcare, advertising, and policy-making, paving the way for future advancements.

Overview of "A Survey on Causal Inference"

The paper "A Survey on Causal Inference," authored by Liuyi Yao, Zhixuan Chu, Sheng Li, Yaliang Li, Jing Gao, and Aidong Zhang, provides a thorough exploration of causal inference methodologies, a topic that forms the bedrock of decision-making in fields ranging from statistics and computer science to public policy and economics. As randomized controlled trials (RCTs) often become impractical due to cost and ethical concerns, estimating causal effects from observational data has gained significance. This paper explores this domain, reviewing the potential outcome framework and categorizing causal inference methods based on the fulfillment of its core assumptions.

Categorization of Methods

The paper organizes causal inference strategies into those that are contingent on all assumptions of the potential outcome framework (SUTVA, ignorability, and positivity), and those that modify these assumptions. For methods reliant on these assumptions, the review spans traditional statistics and machine learning enhancements across several methodological classes:

Re-weighting Methods: These include the inverse propensity weighting (IPW), doubly robust estimators, and extensions like covariate balancing propensity scores, all aimed at mitigating selection bias through sample re-weighting.
Stratification and Matching: These involve subdividing or matching data to approximate an RCT setting within the observational data.
Tree-Based Algorithms: Decision trees and ensemble methods, such as Bayesian Additive Regression Trees, which capture heterogeneities in treatment effects.
Representation Learning: Techniques to transform covariates into balanced spaces that mitigate selection biases and allow for improved causal inference.
Multi-task and Meta-Learning Approaches: These either learn components shared across treatment groups alongside group-specific ones, or wrap standard supervised learners in a meta-algorithm designed to reduce bias in the estimated treatment effect.
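
To make the re-weighting idea concrete, here is a minimal sketch of an inverse propensity weighting estimate on simulated data. It is an illustration, not code from the survey: the data-generating process, the function names (simulate, naive_difference, ipw_ate), and the true effect of 2.0 are all assumptions chosen for the example. The propensity score is taken as known here; in practice it would usually have to be estimated, for example with a logistic regression.

```python
import random

def simulate(n, seed=0):
    """Toy observational data (hypothetical): x drives both treatment and outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random()                      # single confounder in [0, 1)
        e = 0.2 + 0.6 * x                     # true propensity P(W=1 | x), assumed known
        w = 1 if rng.random() < e else 0      # treatment assignment depends on x
        y = 2.0 * w + 3.0 * x + rng.gauss(0.0, 1.0)  # true treatment effect is 2.0
        rows.append((w, y, e))
    return rows

def naive_difference(rows):
    """Difference in mean outcomes between treated and control units (confounded)."""
    treated = [y for w, y, _ in rows if w == 1]
    control = [y for w, y, _ in rows if w == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)

def ipw_ate(rows):
    """IPW estimate of the average treatment effect.

    Each unit is weighted by the inverse probability of the treatment
    it actually received, which rebalances the two groups on x.
    """
    n = len(rows)
    treated_term = sum(w * y / e for w, y, e in rows) / n
    control_term = sum((1 - w) * y / (1 - e) for w, y, e in rows) / n
    return treated_term - control_term

rows = simulate(20000)
print("naive:", round(naive_difference(rows), 3))
print("IPW:  ", round(ipw_ate(rows), 3))
```

In this setup treated units tend to have larger x, so the naive difference overstates the effect, while the weighted estimate should land close to the simulated value of 2.0. The weights are well behaved only because the propensity stays between 0.2 and 0.8, which is the positivity assumption at work.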

For scenarios where assumptions like unconfoundedness are untenable, the paper surveys advanced methodologies such as dealing with unobserved confounding through network proxies or integrating experimental with observational data.
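
For reference, the assumptions being relaxed can be stated compactly. The notation below is a common convention assumed here for illustration (binary treatment W, covariates X, potential outcomes Y(0) and Y(1)) and may differ from the symbols used in the paper.

```latex
% Assumed notation: W = binary treatment, X = covariates,
% Y(0), Y(1) = potential outcomes under control and treatment.
\begin{align*}
\text{ATE} &= \mathbb{E}\big[\,Y(1) - Y(0)\,\big] \\
\text{Ignorability:}\quad & W \perp\!\!\!\perp \big(Y(0),\, Y(1)\big) \mid X \\
\text{Positivity:}\quad & 0 < P(W = 1 \mid X = x) < 1 \quad \text{for all } x
\end{align*}
```

When a confounder is missing from X, ignorability (unconfoundedness) fails and adjusting for X alone no longer removes selection bias, which is what motivates the proxy-based and data-combination methods mentioned above.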

Practical and Theoretical Implications

The review underscores the pivotal role of causal inference in real-world applications such as advertising, medicine, and policy-making. Understanding causal effects can significantly improve personalized recommendations, optimize advertisement impacts, and guide therapeutic decisions in healthcare. The paper also draws attention to the synergy between machine learning advancements and causal inference, posing significant implications for future developments in robust AI systems.

Speculations on Future Developments

Future work in causal inference is likely to address complex, high-dimensional data with unobserved confounders, particularly using machine learning techniques such as deep learning and reinforcement learning. The paper lays a foundation for continued advancement towards more accurate and computationally efficient causal inference techniques that cater to diverse real-world challenges.

The paper ultimately serves as a critical resource for researchers in causal inference, providing them with a comprehensive landscape of methodologies and their applications. It highlights both established principles and emerging innovations, setting a trajectory for the continued evolution of causal analysis in data-driven decision-making.
