When Should You Adjust Standard Errors for Clustering?

Published 9 Oct 2017 in math.ST, econ.EM, and stat.TH | (1710.02926v4)

Abstract: In empirical work it is common to estimate parameters of models and report associated standard errors that account for "clustering" of units, where clusters are defined by factors such as geography. Clustering adjustments are typically motivated by the concern that unobserved components of outcomes for units within clusters are correlated. However, this motivation does not provide guidance about questions such as: (i) Why should we adjust standard errors for clustering in some situations but not others? How can we justify the common practice of clustering in observational studies but not randomized experiments, or clustering by state but not by gender? (ii) Why is conventional clustering a potentially conservative "all-or-nothing" adjustment, and are there alternative methods that respond to data and are less conservative? (iii) In what settings does the choice of whether and how to cluster make a difference? We address these questions using a framework of sampling and design inference. We argue that clustering can be needed to address sampling issues if sampling follows a two-stage process in which, in the first stage, a subset of clusters is sampled from a population of clusters and, in the second stage, units are sampled from the sampled clusters. In that case, clustered standard errors account for the existence of clusters in the population that we do not see in the sample. Clustering can also be needed to account for design issues if treatment assignment is correlated with membership in a cluster. We propose new variance estimators to deal with intermediate settings where conventional cluster standard errors are unnecessarily conservative and robust standard errors are too small.
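To make the contrast between "robust" and "cluster" standard errors concrete, the following is a minimal sketch in Python with NumPy. It computes the conventional heteroskedasticity-robust (Eicker-Huber-White) and cluster-robust (Liang-Zeger) sandwich variances for an OLS fit, without finite-sample corrections. It does not implement the new variance estimators proposed in the paper. The function name `ols_standard_errors` and the simulated design (50 clusters of 20 units, treatment assigned at the cluster level, a shared cluster-level shock) are illustrative assumptions, not taken from the source.

```python
import numpy as np


def ols_standard_errors(y, X, clusters):
    """Return OLS coefficients with robust (EHW) and cluster (Liang-Zeger) SEs.

    Hypothetical helper for illustration; no finite-sample corrections applied.
    """
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta

    # Per-unit score contributions x_i * e_i, shape (n, k).
    scores = X * resid[:, None]

    # Robust (EHW) "meat": sum over units of the outer product of scores.
    meat_robust = scores.T @ scores

    # Cluster "meat": first sum the scores within each cluster,
    # then take outer products of the cluster-level sums.
    labels, inverse = np.unique(clusters, return_inverse=True)
    cluster_scores = np.zeros((labels.size, X.shape[1]))
    np.add.at(cluster_scores, inverse, scores)
    meat_cluster = cluster_scores.T @ cluster_scores

    se_robust = np.sqrt(np.diag(XtX_inv @ meat_robust @ XtX_inv))
    se_cluster = np.sqrt(np.diag(XtX_inv @ meat_cluster @ XtX_inv))
    return beta, se_robust, se_cluster


# Assumed simulation design: treatment varies only across clusters,
# and outcomes share a cluster-level shock.
rng = np.random.default_rng(0)
n_clusters, units_per_cluster = 50, 20
clusters = np.repeat(np.arange(n_clusters), units_per_cluster)
treated_cluster = rng.integers(0, 2, size=n_clusters)
treatment = treated_cluster[clusters].astype(float)
cluster_shock = rng.normal(size=n_clusters)[clusters]
y = 1.0 + 0.5 * treatment + cluster_shock + rng.normal(size=clusters.size)
X = np.column_stack([np.ones_like(treatment), treatment])

beta, se_robust, se_cluster = ols_standard_errors(y, X, clusters)
print("coefficients:", beta)
print("robust SE:   ", se_robust)
print("cluster SE:  ", se_cluster)
```

In this assumed design, treatment assignment is perfectly correlated with cluster membership, which is the design-based case the abstract describes, so the cluster standard error on the treatment coefficient should come out noticeably larger than the robust one. When every unit forms its own cluster, the two estimators coincide.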