Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
134 tokens/sec
GPT-4o
10 tokens/sec
Gemini 2.5 Pro Pro
47 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Sampling Without Compromising Accuracy in Adaptive Data Analysis (1709.09778v3)

Published 28 Sep 2017 in cs.LG and cs.DS

Abstract: In this work, we study how to use sampling to speed up mechanisms for answering adaptive queries into datasets without reducing the accuracy of those mechanisms. This is important to do when both the datasets and the number of queries asked are very large. In particular, we describe a mechanism that provides a polynomial speed-up per query over previous mechanisms, without needing to increase the total amount of data required to maintain the same generalization error as before. We prove that this speed-up holds for arbitrary statistical queries. We also provide an even faster method for achieving statistically-meaningful responses wherein the mechanism is only allowed to see a constant number of samples from the data per query. Finally, we show that our general results yield a simple, fast, and unified approach for adaptively optimizing convex and strongly convex functions over a dataset.

Citations (7)

Summary

We haven't generated a summary for this paper yet.