Universal Consistency of the Wasserstein $k$-NN Classifier: Negative and Positive Results
Abstract: The Wasserstein distance provides a notion of dissimilarity between probability measures, which has found recent applications in learning from structured data of varying size, such as images and text documents. In this work, we study the $k$-nearest neighbor classifier ($k$-NN) of probability measures under the Wasserstein distance. We show that the $k$-NN classifier is not universally consistent on the space of measures supported in $(0,1)$. Since any Euclidean ball contains a copy of $(0,1)$, one should not expect universal consistency without some restriction on the base metric space, or on the Wasserstein space itself. To this end, via the notion of $\sigma$-finite metric dimension, we show that the $k$-NN classifier is universally consistent on spaces of measures supported in a $\sigma$-uniformly discrete set. In addition, by studying the geodesic structures of the Wasserstein spaces for $p=1$ and $p=2$, we show that the $k$-NN classifier is universally consistent on the space of measures supported on a finite set, the space of Gaussian measures, and the space of measures with densities expressed as finite wavelet series.
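To make the object of study concrete, here is a minimal illustrative sketch of $k$-NN classification of one-dimensional empirical measures under the $1$-Wasserstein distance. It uses the standard fact that for two empirical measures with the same number of atoms, $W_1$ equals the mean absolute difference of the sorted samples. All function names and the toy data are illustrative assumptions, not taken from the paper.

```python
# Illustrative sketch (not the paper's construction): k-NN over
# 1-D empirical measures with the 1-Wasserstein distance.
from collections import Counter

def w1(xs, ys):
    """W_1 between two equal-size 1-D empirical measures:
    mean absolute difference of order statistics."""
    assert len(xs) == len(ys)
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

def knn_predict(train, query, k=3):
    """train: list of (samples, label) pairs; query: list of samples.
    Returns the majority label among the k W_1-nearest training measures."""
    nearest = sorted(train, key=lambda pair: w1(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy data: class 0 measures concentrated near 0, class 1 near 1.
train = [([0.0, 0.1, 0.2], 0), ([0.05, 0.15, 0.1], 0),
         ([0.9, 1.0, 0.95], 1), ([0.85, 0.9, 1.0], 1)]
print(knn_predict(train, [0.1, 0.0, 0.2], k=3))  # → 0
```

The paper's point is that, despite the simplicity of this rule, its consistency depends delicately on the geometry of the underlying Wasserstein space.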