Density Functional Estimators with k-Nearest Neighbor Bandwidths (1702.03051v1)
Abstract: Estimating expected polynomials of density functions from samples is a basic problem with numerous applications in statistics and information theory. Although kernel density estimators are widely used in practice for such functional estimation problems, practitioners are left on their own to choose an appropriate bandwidth for each application at hand. Further, kernel density estimators suffer from boundary biases, which are prevalent in real-world data with lower-dimensional structures. We propose using the fixed-k nearest neighbor distances for the bandwidth, which adaptively adjusts to local geometry. Further, we propose a novel estimator based on local likelihood density estimators that mitigates the boundary biases. Although choosing fixed-k nearest neighbor distances as bandwidths results in inconsistent estimators, we provide a simple debiasing scheme that precomputes the asymptotic bias and divides off this term. With this novel correction, we show consistency of the debiased estimator. We provide numerical experiments suggesting that it improves upon competing state-of-the-art methods.
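
To make the "precompute the asymptotic bias and divide it off" idea concrete, the following is a minimal sketch for the simplest expected polynomial, E[p(X)] (the integral of p squared), in one dimension. It is an illustration under stated assumptions, not the local likelihood estimator proposed in the paper: it uses the plain k-nearest-neighbor density estimate k / ((n - 1) * 2 * r_k), for which the fixed-k asymptotic bias is the multiplicative factor k / (k - 1) (a standard Gamma-distribution calculation). The function names `knn_distances_1d` and `quadratic_functional` are hypothetical, and only the Python standard library is used.

```python
import math
import random


def knn_distances_1d(xs, k):
    """Distance from each point to its k-th nearest neighbor among the other points (1-D)."""
    n = len(xs)
    order = sorted(range(n), key=lambda i: xs[i])
    sorted_xs = [xs[i] for i in order]
    dists = [0.0] * n
    for pos, i in enumerate(order):
        # In 1-D, the k nearest neighbors lie within k positions in sorted order.
        lo, hi = max(0, pos - k), min(n, pos + k + 1)
        gaps = sorted(
            abs(sorted_xs[j] - sorted_xs[pos]) for j in range(lo, hi) if j != pos
        )
        dists[i] = gaps[k - 1]
    return dists


def quadratic_functional(xs, k, debias=True):
    """Estimate E[p(X)] from 1-D samples with a fixed-k nearest neighbor bandwidth."""
    n = len(xs)
    if k < 2 or n <= k:
        raise ValueError("need 2 <= k < len(xs)")
    unit_ball_volume = 2.0  # length of the interval [-1, 1]
    # Plug-in estimate: average of the k-NN density estimate at each sample.
    plug_in = sum(
        k / ((n - 1) * unit_ball_volume * r) for r in knn_distances_1d(xs, k)
    ) / n
    # Assumption of this sketch: for fixed k the plug-in converges to
    # (k / (k - 1)) * E[p(X)], so the bias is known in advance and divided off.
    asymptotic_bias = k / (k - 1)
    return plug_in / asymptotic_bias if debias else plug_in


if __name__ == "__main__":
    random.seed(0)
    samples = [random.gauss(0.0, 1.0) for _ in range(4000)]
    truth = 1.0 / (2.0 * math.sqrt(math.pi))  # integral of p^2 for a standard normal
    print("truth    :", truth)
    print("plug-in  :", quadratic_functional(samples, k=5, debias=False))
    print("debiased :", quadratic_functional(samples, k=5, debias=True))
```

With k held fixed, the uncorrected plug-in value stays inflated by roughly k / (k - 1) no matter how many samples are drawn, which is the inconsistency the abstract refers to; dividing by that precomputed factor removes it. For other polynomials or for the local likelihood estimator, the correction factor differs and would have to be derived separately.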