A comparative simulation study of data-driven methods for estimating density level sets (1402.0361v2)
Abstract: Density level sets are mainly estimated using one of three methodologies: plug-in, excess mass, or a hybrid approach. Plug-in methods replace the unknown density with a nonparametric estimator, usually a kernel estimator, so bandwidth selection is a fundamental problem in practice. Bandwidth selectors designed specifically for level sets have recently been proposed. If, however, some a priori information about the geometry of the level set is available, excess mass algorithms can be useful. In this case, a density estimator is not necessary, and the problem of bandwidth selection is avoided. The third methodology is a hybrid of the other two. Like the excess mass method, it assumes a mild geometric restriction on the level set and, like the plug-in approach, it requires a pilot nonparametric estimator of the density. An interesting open question concerns the practical performance of these methods. In this work, existing methods are reviewed and two new hybrid algorithms are proposed. Their practical behaviour is compared through extensive simulations.
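
As an illustration of the plug-in idea described in the abstract, the following minimal sketch estimates a level set {x : f(x) >= t} by replacing f with a kernel density estimate and keeping the grid points where the estimate reaches the level t. It assumes one-dimensional data, a Gaussian kernel, and a hand-picked bandwidth; the function names, the toy sample, and the grid are hypothetical and are not taken from the paper, and this is not one of the selectors or algorithms compared in the study.

```python
import math


def gaussian_kde(sample, bandwidth):
    """Return a 1-D Gaussian kernel density estimate as a callable."""
    n = len(sample)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def f_hat(x):
        # Sum of Gaussian kernels centred at each observation.
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample
        )

    return f_hat


def plug_in_level_set(sample, bandwidth, level, grid):
    """Grid points x where the kernel estimate satisfies f_hat(x) >= level."""
    f_hat = gaussian_kde(sample, bandwidth)
    return [x for x in grid if f_hat(x) >= level]


# Hypothetical toy data: two well-separated clusters around -2 and 2.
sample = [-2.1, -1.9, -2.0, -1.8, -2.2, 1.9, 2.0, 2.1, 2.2, 1.8]
grid = [i / 10.0 for i in range(-40, 41)]

# Assumed bandwidth and level, chosen by hand for illustration only.
level_set = plug_in_level_set(sample, bandwidth=0.3, level=0.2, grid=grid)
```

With these assumed values the estimated set contains grid points near both clusters and excludes the gap around 0. Rerunning with a much larger bandwidth, such as 3.0, returns an empty set, because a Gaussian kernel estimate can never exceed 1 / (h * sqrt(2 * pi)), which is about 0.133 for h = 3.0. This sensitivity is why bandwidth selection matters for plug-in methods.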