Minimax Estimation of Kernel Mean Embeddings (1602.04361v2)

Published 13 Feb 2016 in math.ST and stat.TH

Abstract: In this paper, we study the minimax estimation of the Bochner integral $$\mu_k(P):=\int_{\mathcal{X}} k(\cdot,x)\,dP(x),$$ also called the kernel mean embedding, based on random samples drawn i.i.d. from $P$, where $k:\mathcal{X}\times\mathcal{X}\rightarrow\mathbb{R}$ is a positive definite kernel. Various estimators $\hat{\theta}_n$ of $\mu_k(P)$ (including the empirical estimator) are studied in the literature, all of which satisfy $\bigl\| \hat{\theta}_n-\mu_k(P)\bigr\|_{\mathcal{H}_k}=O_P(n^{-1/2})$, with $\mathcal{H}_k$ being the reproducing kernel Hilbert space induced by $k$. The main contribution of the paper is in showing that the above-mentioned rate of $n^{-1/2}$ is minimax in the $\|\cdot\|_{\mathcal{H}_k}$ and $\|\cdot\|_{L^2(\mathbb{R}^d)}$ norms over the class of discrete measures and the class of measures that have an infinitely differentiable density, with $k$ being a continuous translation-invariant kernel on $\mathbb{R}^d$. The interesting aspect of this result is that the minimax rate is independent of the smoothness of the kernel and the density of $P$ (if it exists). This result has practical consequences in statistical applications, as the mean embedding has been widely employed in non-parametric hypothesis testing, density estimation, causal inference and feature selection, through its relation to energy distance (and distance covariance).
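The empirical estimator mentioned in the abstract, $\hat{\theta}_n = \frac{1}{n}\sum_{i=1}^n k(\cdot, X_i)$, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the kernel is a Gaussian kernel with bandwidth `sigma`, $P$ is the standard normal on $\mathbb{R}^d$ (chosen because $\mu_k(P)$ then has a closed form), and the function names are hypothetical.

```python
import numpy as np


def gaussian_kernel(x, y, sigma=1.0):
    """Kernel matrix with entries exp(-||x_i - y_j||^2 / (2 sigma^2))."""
    sq = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))


def empirical_embedding(sample, points, sigma=1.0):
    """Evaluate the empirical estimator (1/n) sum_i k(., X_i) at `points`."""
    return gaussian_kernel(points, sample, sigma).mean(axis=1)


def true_embedding(points, sigma=1.0):
    """Closed-form mu_k(P) for P = N(0, I_d) and the Gaussian kernel (assumed setup)."""
    d = points.shape[1]
    s2 = sigma**2
    scale = (s2 / (s2 + 1.0)) ** (d / 2.0)
    return scale * np.exp(-np.sum(points**2, axis=1) / (2.0 * (s2 + 1.0)))


def embedding_error(sample, sigma=1.0):
    """RKHS norm of (empirical embedding - true embedding) for P = N(0, I_d)."""
    d = sample.shape[1]
    s2 = sigma**2
    # ||theta_hat||^2 = (1/n^2) sum_{i,j} k(X_i, X_j)
    est_norm_sq = gaussian_kernel(sample, sample, sigma).mean()
    # <theta_hat, mu> = (1/n) sum_i mu(X_i), by the reproducing property
    cross = true_embedding(sample, sigma).mean()
    # ||mu||^2 = E k(X, X') for independent X, X' ~ N(0, I_d)
    true_norm_sq = (s2 / (s2 + 2.0)) ** (d / 2.0)
    return np.sqrt(max(est_norm_sq - 2.0 * cross + true_norm_sq, 0.0))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for n in (50, 200, 800):
        x = rng.standard_normal((n, 1))
        err = embedding_error(x)
        # If the rate is n^{-1/2}, sqrt(n) * error should stay of order one.
        print(n, err, np.sqrt(n) * err)
```

Under these assumptions, the expected squared error equals $(\mathbb{E}\,k(X,X) - \|\mu_k(P)\|_{\mathcal{H}_k}^2)/n$, so the printed values of $\sqrt{n}$ times the error should stay roughly constant as $n$ grows, consistent with the $n^{-1/2}$ rate.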
