On Semi-Supervised Estimation of Distributions (2305.07955v2)
Published 13 May 2023 in math.ST, cs.IT, math.IT, and stat.TH
Abstract: We study the problem of estimating the joint probability mass function (pmf) over two random variables. In particular, the estimation is based on the observation of $m$ samples containing both variables and $n$ samples missing one fixed variable. We adopt the minimax framework with $\ell_p^p$ loss functions, and we show that the composition of uni-variate minimax estimators achieves minimax risk with the optimal first-order constant for $p \ge 2$, in the regime $m = o(n)$.
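As a rough illustration of what composing uni-variate estimators could look like, the sketch below estimates the marginal pmf of the always-observed variable from all $m + n$ samples, estimates each conditional pmf of the other variable from the $m$ complete samples, and multiplies the two. This is only a sketch under stated assumptions: the abstract does not specify the uni-variate estimators, so add-constant smoothing is used here as a stand-in, and the names `add_constant_pmf`, `compose_joint_estimate`, `alpha`, `kx`, and `ky` are hypothetical. It also assumes that the missing variable is $Y$ and that both alphabets are finite and encoded as integers starting at 0.

```python
import numpy as np


def add_constant_pmf(counts, alpha):
    """Add-constant (smoothed) pmf estimate from a vector of counts.

    Stand-in for a uni-variate estimator; not taken from the paper.
    """
    counts = np.asarray(counts, dtype=float)
    return (counts + alpha) / (counts.sum() + alpha * counts.size)


def compose_joint_estimate(paired, x_only, kx, ky, alpha=1.0):
    """Estimate the joint pmf of (X, Y) as p_X(x) * p_{Y|X}(y | x).

    paired : m pairs (x, y) with both variables observed
    x_only : n observations of X alone (Y missing, by assumption)
    kx, ky : alphabet sizes of X and Y
    """
    paired = np.asarray(paired, dtype=int).reshape(-1, 2)
    x_only = np.asarray(x_only, dtype=int)

    # The marginal of X uses every observation of X: m + n samples.
    x_all = np.concatenate([paired[:, 0], x_only])
    p_x = add_constant_pmf(np.bincount(x_all, minlength=kx), alpha)

    # Each conditional pmf of Y given X = x uses only the m complete samples.
    joint_counts = np.zeros((kx, ky))
    np.add.at(joint_counts, (paired[:, 0], paired[:, 1]), 1)
    p_y_given_x = np.vstack(
        [add_constant_pmf(row, alpha) for row in joint_counts]
    )

    # Compose: joint[x, y] = p_X(x) * p_{Y|X}(y | x).
    return p_x[:, None] * p_y_given_x


if __name__ == "__main__":
    paired = [(0, 0), (0, 1), (1, 1)]   # m = 3 complete samples
    x_only = [0, 0, 1, 1, 1]            # n = 5 samples with Y missing
    print(compose_joint_estimate(paired, x_only, kx=2, ky=2))
```

With `alpha > 0`, every conditional row is a valid pmf, so the composed estimate sums to one by construction.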