Multi-Objective Weighted Sampling

Published 24 Sep 2015 in cs.DB and cs.DS | (1509.07445v6)

Abstract: {\em Multi-objective samples} are powerful and versatile summaries of large data sets. For a set of keys $x\in X$ and associated values $f_x \geq 0$, a weighted sample taken with respect to $f$ allows us to approximate {\em segment-sum statistics} $\text{Sum}(f;H) = \sum_{x\in H} f_x$, for any subset $H$ of the keys, with statistically guaranteed quality that depends on the sample size and the relative weight of $H$. When estimating $\text{Sum}(g;H)$ for $g\neq f$, however, the quality guarantees are lost. A multi-objective sample with respect to a set of functions $F$ provides, for each $f\in F$, the same statistical guarantees as a dedicated weighted sample while minimizing the summary size. We analyze properties of multi-objective samples and present sampling schemes and meta-algorithms for estimation and optimization while showcasing two important application domains. The first is key-value data sets, where different functions $f\in F$ applied to the values correspond to different statistics such as moments, thresholds, capping, and sum. A multi-objective sample allows us to approximate all statistics in $F$. The second is metric spaces, where keys are points and each $f\in F$ is defined by a set of points $C$, with $f_x$ being the service cost of $x$ by $C$ and $\text{Sum}(f;X)$ modeling the centrality or clustering cost of $C$. A multi-objective sample allows us to estimate costs for each $f\in F$. In these domains, multi-objective samples are often small, are efficient to construct, and enable scalable estimation and optimization. We aim here to facilitate further applications of this powerful technique.
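To make the idea in the abstract concrete, here is a minimal Python sketch of one possible construction. It assumes independent (Poisson) sampling with probability-proportional-to-size inclusion probabilities $\min(1, k f_x / \text{Sum}(f;X))$ per objective, takes the per-key maximum over $f\in F$, and estimates $\text{Sum}(f;H)$ with inverse-probability weights. The function names, the parameter `k`, and the dictionary-based data layout are illustrative assumptions, not an API from the paper, and this is only one of several schemes that could serve this purpose.

```python
import random


def pps_probabilities(weights, k):
    """Inclusion probabilities min(1, k * f_x / Sum(f; X)) for one objective f.

    `weights` maps each key x to f_x >= 0; `k` is a hypothetical size parameter.
    """
    total = sum(weights.values())
    if total == 0:
        return {x: 0.0 for x in weights}
    return {x: min(1.0, k * w / total) for x, w in weights.items()}


def multi_objective_probabilities(objectives, k):
    """Per-key maximum of the single-objective inclusion probabilities."""
    per_objective = [pps_probabilities(f, k) for f in objectives]
    keys = set().union(*(f.keys() for f in objectives))
    return {x: max(p.get(x, 0.0) for p in per_objective) for x in keys}


def draw_sample(probs, rng):
    """Keep each key x independently with probability probs[x].

    Returns the sampled keys together with their inclusion probabilities,
    which the estimator needs.
    """
    return {x: p for x, p in probs.items() if rng.random() < p}


def estimate_segment_sum(sample, f, segment):
    """Inverse-probability estimate of Sum(f; H) for the key subset `segment`."""
    return sum(f.get(x, 0.0) / p for x, p in sample.items() if x in segment)


if __name__ == "__main__":
    # Toy key-value data; the three objectives are sum, second moment, and capping.
    values = {"a": 1.0, "b": 2.0, "c": 10.0, "d": 0.5}
    objectives = [
        values,
        {x: v * v for x, v in values.items()},
        {x: min(v, 2.0) for x, v in values.items()},
    ]
    probs = multi_objective_probabilities(objectives, k=2)
    sample = draw_sample(probs, random.Random(0))
    for f in objectives:
        print(estimate_segment_sum(sample, f, segment={"a", "c"}))
```

Because each key's probability is at least its probability under every single objective, the estimate for any $f\in F$ is no noisier than it would be under a dedicated sample for that $f$, while the expected sample size is at most $k\,|F|$ and is smaller when the objectives favor the same keys.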

Citations (22)

Authors (1)
