Kernel Density Estimation with Berkson Error (1401.3362v4)
Abstract: Given a sample $\{X_i\}_{i=1}^n$ from $f_X$, we construct kernel density estimators for $f_Y$, the convolution of $f_X$ with a known error density $f_{\epsilon}$. This problem is known as density estimation with Berkson error and has applications in epidemiology and astronomy. Little is understood about bandwidth selection for Berkson density estimation. We compare three approaches to selecting the bandwidth both asymptotically, using large sample approximations to the MISE, and at finite samples, using simulations. Our results highlight the relationship between the structure of the error $f_{\epsilon}$ and the optimal bandwidth. In particular, the results demonstrate the importance of smoothing when the error term $f_{\epsilon}$ is concentrated near 0. We propose a data-driven bandwidth estimator and test its performance on NO$_2$ exposure data.
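
To make the construction concrete, the following is a minimal sketch of one such estimator, not the paper's implementation. It assumes a Gaussian kernel with bandwidth `h` and a Gaussian error density with known standard deviation `sigma`; under those assumptions, convolving the kernel estimate of $f_X$ with $f_{\epsilon}$ has a closed form, because each kernel bump becomes a normal density with variance `h**2 + sigma**2`. The function name `berkson_kde` and its parameters are hypothetical, chosen only for illustration.

```python
import numpy as np

def berkson_kde(y, x, h, sigma):
    """Estimate f_Y = f_X * f_eps at the points y (illustrative sketch).

    Assumptions (not taken from the paper): Gaussian kernel with
    bandwidth h >= 0 and Gaussian error with standard deviation sigma > 0.
    """
    y = np.atleast_1d(np.asarray(y, dtype=float))
    x = np.asarray(x, dtype=float)
    # Gaussian kernel convolved with Gaussian error is Gaussian,
    # with the two variances added.
    s2 = h**2 + sigma**2
    z = y[:, None] - x[None, :]          # pairwise differences, shape (len(y), n)
    dens = np.exp(-0.5 * z**2 / s2) / np.sqrt(2.0 * np.pi * s2)
    return dens.mean(axis=1)             # average over the sample

# Usage: simulated data, evaluated on a grid
rng = np.random.default_rng(0)
x = rng.normal(size=200)
grid = np.linspace(-10.0, 10.0, 4001)
f_hat = berkson_kde(grid, x, h=0.3, sigma=0.5)
```

In this Gaussian setting the estimator remains a proper density even with `h = 0`, since the error density alone smooths the empirical distribution. As `sigma` shrinks toward 0 that built-in smoothing disappears, which is consistent with the abstract's remark that the choice of bandwidth matters most when $f_{\epsilon}$ is concentrated near 0.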