Variance Competitiveness for Monotone Estimation: Tightening the Bounds (1406.6490v1)
Abstract: Random samples are extensively used to summarize massive data sets and to facilitate scalable analytics. Coordinated sampling, where samples of different data sets "share" the randomization, is a powerful method that facilitates more accurate estimation of many aggregates and similarity measures. We recently formulated the model of {\it Monotone Estimation Problems} (MEP), which captures coordinated sampling projected on a single item. MEP estimators can then be used to estimate sum aggregates, such as distances, over coordinated samples. For MEP, we are interested in estimators that are unbiased and nonnegative. We proposed {\it variance competitiveness} as a quality measure of estimators: for each data vector, we consider the minimum variance attainable on it by an unbiased and nonnegative estimator. The competitiveness of an estimator is then the maximum, over data vectors, of the ratio of the expectation of its square to the minimum attainable expectation of the square (for an unbiased estimator, minimizing the expected square is equivalent to minimizing the variance). We also presented a general construction of the L$*$ estimator, which is defined for any MEP for which a nonnegative unbiased estimator exists, and is at most 4-competitive. Our aim here is to obtain tighter bounds on the {\em universal ratio}, which we define to be the smallest competitive ratio that can be guaranteed for every MEP. We obtain an upper bound of 3.375, improving over the bound of $4$ attained by the L$*$ estimator. We also establish a lower bound of 1.44. The lower bound is obtained by constructing the {\it optimally competitive} estimator for particular MEPs. The construction is of independent interest, as it facilitates estimation with instance-optimal competitiveness.
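To make the quality measure concrete, the definitions in the abstract can be written in symbols. The display below is a sketch under assumed notation ($v$ denotes a data vector, $S(v)$ the sampled outcome, and both infima range over unbiased nonnegative estimators); it is not quoted from the paper:
$$ c(f) \;=\; \sup_{v}\; \frac{\mathrm{E}\!\left[f(S(v))^{2}\right]}{\inf_{f'} \mathrm{E}\!\left[f'(S(v))^{2}\right]}, \qquad \rho^{*} \;=\; \sup_{\mathrm{MEP}}\; \inf_{f}\; c(f). $$
In this notation, the results summarized above read: the L$*$ estimator satisfies $c(\text{L}^{*}) \le 4$ on any MEP admitting an unbiased nonnegative estimator, and the paper's bounds on the universal ratio are $1.44 \le \rho^{*} \le 3.375$.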