An Optimal Uniform Concentration Inequality for Discrete Entropies on Finite Alphabets in the High-dimensional Setting (2007.04547v3)
Published 9 Jul 2020 in math.PR, cs.IT, math.IT, math.ST, and stat.TH
Abstract: We prove an exponential decay concentration inequality to bound the tail probability of the difference between the log-likelihood of discrete random variables on a finite alphabet and the negative entropy. The concentration bound we derive holds uniformly over all parameter values. The new result improves the convergence rate in an earlier result of Zhao (2020), from $(K^2\log K)/n=o(1)$ to $(\log K)^2/n=o(1)$, where $n$ is the sample size and $K$ is the size of the alphabet. We further prove that the rate $(\log K)^2/n=o(1)$ is optimal. The results are extended to misspecified log-likelihoods for grouped random variables. We give applications of the new result in information theory.
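The quantity described in the abstract can be illustrated numerically. The following is a minimal sketch, not the paper's method: it assumes i.i.d. draws from a fixed distribution `p` on `K` symbols and assumes the log-likelihood is normalized by the sample size `n`, so the deviation is $\frac{1}{n}\sum_i \log p(X_i) + H(p)$. The function names (`entropy`, `loglik_deviation`, `tail_frequency`) and the threshold `eps` are hypothetical, chosen for illustration only. The sketch estimates a tail frequency by simulation and does not reproduce the bound proved in the paper.

```python
import math
import random


def entropy(p):
    """Shannon entropy (in nats) of a probability vector p."""
    return -sum(q * math.log(q) for q in p if q > 0)


def loglik_deviation(p, n, rng):
    """Draw n i.i.d. symbols from p and return (1/n) * sum_i log p(X_i) + H(p).

    Assumption: the log-likelihood is normalized by n; the abstract does not
    state the normalization explicitly.
    """
    sample = rng.choices(range(len(p)), weights=p, k=n)
    avg_loglik = sum(math.log(p[x]) for x in sample) / n
    return avg_loglik + entropy(p)


def tail_frequency(p, n, eps, trials=200, seed=0):
    """Empirical frequency of |deviation| > eps over repeated simulations."""
    rng = random.Random(seed)
    hits = sum(abs(loglik_deviation(p, n, rng)) > eps for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    p = [0.7, 0.2, 0.1]  # illustrative distribution on K = 3 symbols
    for n in (5, 50, 2000):
        print(n, tail_frequency(p, n, eps=0.2))
```

For a uniform distribution every symbol has the same log-probability, so the deviation is zero up to floating-point error; for a non-uniform `p` the estimated tail frequency should shrink as `n` grows, which is the qualitative behavior a concentration inequality quantifies.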