A Dimension-Independent discriminant between distributions (1802.04497v1)
Published 13 Feb 2018 in cs.IT, math.IT, math.ST, and stat.TH
Abstract: Henze-Penrose divergence is a non-parametric divergence measure that can be used to estimate a bound on the Bayes error in a binary classification problem. In this paper, we show that a cross-match statistic based on optimal weighted matching can be used to directly estimate Henze-Penrose divergence. Unlike an earlier approach based on the Friedman-Rafsky minimal spanning tree statistic, the proposed method is dimension-independent. The new approach is evaluated using simulation and applied to real datasets to obtain Bayes error estimates.
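As a rough illustration of the idea in the abstract, the sketch below pools two samples, finds a minimum-weight perfect matching under Euclidean distance, and counts the matched pairs that contain one point from each sample (the cross-match count). Several things here are assumptions and are not taken from the paper: the function names `cross_match_count` and `hp_divergence_estimate` are hypothetical, the normalization `1 - A(m+n)/(mn)` is chosen by analogy with the Friedman-Rafsky estimator so that the value is near 0 when the two samples are well mixed, and the clamp at 0 is a convenience. The matching is found by exhaustive dynamic programming over subsets, which is only practical for roughly 16 pooled points or fewer; a blossom-algorithm implementation would be needed for realistic sample sizes.

```python
import math
from functools import lru_cache


def cross_match_count(xs, ys):
    """Count cross-sample pairs in a minimum-weight perfect matching.

    xs, ys: sequences of equal-dimension points (tuples of floats).
    Exhaustive search over subsets; intended for tiny inputs only.
    """
    points = [tuple(p) for p in xs] + [tuple(p) for p in ys]
    labels = [0] * len(xs) + [1] * len(ys)
    total = len(points)
    if total % 2:
        raise ValueError("pooled sample size must be even for a perfect matching")

    # Pairwise Euclidean distances between all pooled points.
    dist = [[math.dist(p, q) for q in points] for p in points]

    @lru_cache(maxsize=None)
    def best(mask):
        # mask has bit k set when point k is still unmatched.
        if mask == 0:
            return 0.0, ()
        i = (mask & -mask).bit_length() - 1  # lowest unmatched index
        rest = mask & ~(1 << i)
        best_cost, best_pairs = math.inf, ()
        for j in range(i + 1, total):
            if rest >> j & 1:
                cost, pairs = best(rest & ~(1 << j))
                cost += dist[i][j]
                if cost < best_cost:
                    best_cost, best_pairs = cost, pairs + ((i, j),)
        return best_cost, best_pairs

    _, pairs = best((1 << total) - 1)
    return sum(labels[i] != labels[j] for i, j in pairs)


def hp_divergence_estimate(xs, ys):
    """Assumed plug-in estimate: max(0, 1 - A * (m + n) / (m * n))."""
    m, n = len(xs), len(ys)
    a = cross_match_count(xs, ys)
    return max(0.0, 1.0 - a * (m + n) / (m * n))


# Two well-separated clusters: every point is matched within its own sample.
sample_x = [(0, 0), (0, 1), (1, 0), (1, 1)]
sample_y = [(10, 10), (10, 11), (11, 10), (11, 11)]
print(cross_match_count(sample_x, sample_y))
print(hp_divergence_estimate(sample_x, sample_y))
```

With the two well-separated clusters above, no matched pair crosses samples, so the count is 0 and the assumed estimate is at its maximum of 1. When the samples overlap heavily, most pairs cross and the estimate falls toward 0.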