Papers
Topics
Authors
Recent
2000 character limit reached

Maintaining Discrete Probability Distributions in Practice

Published 11 Feb 2023 in cs.DS | (2302.05682v2)

Abstract: A classical problem in random number generation is the sampling of elements from a given discrete distribution. Formally, given a set of indices $S = {1, \dots, n}$ and sequence of weights $w_1, \dots, w_n \in \mathbb{R}+$, the task is to provide samples from $S$ with distribution $p(i) = w_i / W$ where $W = \sum_j w_j$. A commonly accepted solution is Walker's Alias Table, which allows for each sample to be drawn in constant time. However, some applications correspond to a dynamic setting, where elements are inserted or removed, or weights change over time. Here, the Alias Table is not efficient, as it needs to be re-built whenever the underlying distribution changes. In this paper, we engineer a simple data structure for maintaining discrete probability distributions in the dynamic setting. Construction of the data structure is possible in time $O(n)$, sampling is possible in expected time $O(1)$, and an update of size $\Delta$ can be processed in time $O(\Delta n / W)$. As a special case, we maintain an urn containing $W$ marbles of $n$ colors where with each update $O(W / n)$ marbles can be added or removed in $O(1)$ time per update. To evaluate the efficiency of the data structure in practice we conduct an empirical study. The results suggest that the dynamic sampling performance is competitive with the static Alias Table. Compared to existing more complex dynamic solutions we obtain a sampling speed-up of up to half an order of magnitude.

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Authors (1)

Collections

Sign up for free to add this paper to one or more collections.