Doubly Robust Criterion for Causal Inference (2110.14525v3)

Published 27 Oct 2021 in stat.ME

Abstract: Semiparametric estimation, which includes inverse-probability-weighted and doubly robust estimation using propensity scores, is a standard tool in causal inference, and it is rapidly being extended in various directions. On the other hand, although model selection is indispensable in statistical analysis, an information criterion for selecting an appropriate regression structure in this setting has only recently begun to be developed. In this paper, based on the original definition of the Akaike information criterion (AIC; Akaike, 1973), we derive an AIC-type criterion for propensity score analysis. We define a risk function based on the Kullback-Leibler divergence as the cornerstone of the information criterion, and we treat a general causal inference model that is not necessarily linear. The causal effects to be estimated are those for general target populations, such as the average treatment effect on the treated or the average treatment effect on the untreated. Because this field attaches importance to doubly robust estimation, which remains valid when either the model of the assignment variable or the model of the outcome variable is misspecified, we make the information criterion itself doubly robust: it remains an asymptotically unbiased estimator of the risk function even when one of the two models is wrong. In simulation studies, we compare the derived criterion with an existing criterion obtained from a formal argument and confirm that the former outperforms the latter: the divergence between the structure selected by the derived criterion and the true structure is clearly smaller in all simulation settings, and the probability of selecting the true or nearly true model is clearly higher. Real data analyses confirm that variable selection using the two criteria yields significantly different results.
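The doubly robust property the abstract invokes can be illustrated with the standard augmented inverse-probability-weighted estimator. The sketch below is a generic textbook version for the average treatment effect on the treated, not the paper's proposed criterion; the logistic propensity model, linear outcome model, and all function names are illustrative assumptions.

```python
# A minimal sketch of a standard doubly robust estimator for the average
# treatment effect on the treated (ATT), one of the estimands the abstract
# mentions. This is a textbook construction, not the paper's criterion.
import numpy as np
from sklearn.linear_model import LogisticRegression, LinearRegression

def dr_att(X, T, Y):
    """Doubly robust ATT estimate: consistent if either the propensity
    model or the control-outcome model is correctly specified."""
    # Propensity score model: P(T = 1 | X), here assumed logistic.
    e = LogisticRegression().fit(X, T).predict_proba(X)[:, 1]
    # Outcome model fit on controls only: E[Y | X, T = 0], here assumed linear.
    m0 = LinearRegression().fit(X[T == 0], Y[T == 0]).predict(X)
    treated = T == 1
    w = e / (1.0 - e)  # odds weights that map controls to the treated population
    n1 = treated.sum()
    # Average treated residual minus odds-weighted average control residual.
    att = ((Y[treated] - m0[treated]).sum()
           - (w[~treated] * (Y[~treated] - m0[~treated])).sum()) / n1
    return att

# Toy usage on synthetic data with a true treatment effect of 2.0.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
T = (rng.random(1000) < 1 / (1 + np.exp(-X[:, 0]))).astype(int)
Y = X @ np.array([1.0, 0.5, -0.5]) + 2.0 * T + rng.normal(size=1000)
print(dr_att(X, T, Y))  # estimate should land near 2.0
```

Misspecifying either the propensity model or the outcome model (but not both) leaves such an estimator consistent; the paper's contribution is to carry this same robustness over to the risk estimate used for model selection.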
