
Investigating Methods for Weighted Reservoir Sampling with Replacement (2403.20256v4)

Published 29 Mar 2024 in cs.DS

Abstract: Reservoir sampling techniques can be used to extract a sample from a population of unknown size, where units are observed sequentially. Most attention has been paid to sampling without replacement, with only a small number of studies focusing on sampling with replacement. In this paper, we clarify some statements appearing in the literature about the reduction of reservoir sampling with replacement to single reservoir sampling without replacement, exploring in detail how to deal with the weighted case. Then, we demonstrate that the results shown in Park et al. (2004) can be further generalized to develop a skip-based algorithm that is more efficient than previous methods and, additionally, we provide a single-pass merging strategy that can be executed on multiple streams in parallel. Finally, we establish that the skip-based algorithm is faster than standard methods when used to extract a single sample from the population in a non-streaming scenario, provided the sample ratio is below approximately 10% of the population.
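
To make the setting concrete, here is a minimal sketch of one common reading of the reduction the abstract mentions: a weighted sample of size k with replacement is maintained as k independent reservoirs of size one, each of which swaps in the incoming unit with probability equal to its weight divided by the total weight seen so far. This is the naive baseline, not the skip-based algorithm proposed in the paper, and the function name, signature, and handling of non-positive weights are assumptions made for illustration only.

```python
import random
from typing import Iterable, List, Optional, Tuple, TypeVar

T = TypeVar("T")


def weighted_reservoir_with_replacement(
    stream: Iterable[Tuple[T, float]],
    k: int,
    rng: Optional[random.Random] = None,
) -> List[Optional[T]]:
    """Naive baseline (hypothetical helper, not the paper's method).

    Keeps k independent single-unit reservoirs. When a unit with weight w
    arrives and the running total weight is W (including w), each slot is
    overwritten by that unit with probability w / W, independently of the
    other slots. After the pass, each slot holds unit i with probability
    w_i / sum(w), so the k slots form a weighted sample with replacement.
    """
    rng = rng or random.Random()
    reservoir: List[Optional[T]] = [None] * k
    total_weight = 0.0

    for item, weight in stream:
        if weight <= 0:
            # Assumption: units with non-positive weight are never selected.
            continue
        total_weight += weight
        accept_prob = weight / total_weight  # equals 1.0 for the first unit
        for slot in range(k):
            # random() lies in [0, 1), so the first unit fills every slot.
            if rng.random() < accept_prob:
                reservoir[slot] = item

    return reservoir


if __name__ == "__main__":
    units = [("a", 1.0), ("b", 3.0), ("c", 6.0)]
    print(weighted_reservoir_with_replacement(units, k=5, rng=random.Random(0)))
```

This baseline draws k random numbers for every incoming unit, so a single pass costs on the order of n times k random draws. That per-unit cost is the kind of overhead a skip-based approach aims to avoid by jumping directly to the next unit that changes the reservoir.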

