Streaming L_p-Sampling Algorithms

Updated 16 January 2026

Streaming L_p-sampling algorithms are one-pass, space-efficient protocols that sample vector indices with probability proportional to |x_i|^p.
They employ randomized scaling, CountSketch recovery, and norm estimation to balance error rates and update times across various p regimes.
These methods enable practical applications like heavy hitters, duplicates detection, distributed monitoring, and regression coresets in data streams.

A streaming $L_p$ -sampling algorithm is a one-pass, small-space protocol that, given turnstile or insertion-only updates to a vector $x \in \mathbb{R}^n$ , returns index $i$ with probability exactly $|x_i|^p/\|x\|_p^p$ , or with specified relative error, while failing rarely. These primitives are central to randomized tracking of "frequency moments" in streaming models, supporting tasks such as heavy hitters, duplicates, distributed monitoring, and online learning. The field has developed tight space, error, and time bounds for all parameter regimes of $p$ , leveraging probabilistic scaling, sparse recovery, linear sketches, precise order-statistics, and sampling/rejection frameworks.

1. Formal Problem Statement and Distributional Guarantees

Given an underlying vector $x \in \mathbb{R}^n$ in a turnstile or insertion-only data stream, the $L_p$ -distribution on $[n]$ is

$\Pr[i] = \frac{|x_i|^p}{\|x\|_p^p}, \quad \|x\|_p = \left( \sum_{i=1}^n |x_i|^p \right)^{1/p}.$

An $L_p$ -sampling algorithm outputs $i$ so that, conditioned on success,

$\Pr[i] = (1\pm\varepsilon)\frac{|x_i|^p}{\|x\|_p^p} + O(n^{-c})$

for any $c>0$ , failure probability at most $\delta$ (Jowhari et al., 2010). For $p=0$ , the algorithm samples uniformly from $\{i : x_i \neq 0\}$ . Algorithms may be required to be "perfect" (exactly matching $L_p$ distribution up to $1/\mathrm{poly}(n)$ additive error), or "approximate" (allowing small multiplicative error).

2. Algorithmic Frameworks for $0 < p < 2$

Space-optimal $L_p$ -sampling for $0 < p < 2$ is achieved via randomized scaling and heavy-hitter detection, typically combining:

Exponentially scaled streams: Each $x_i$ is divided by $t_i^{1/p}$ , with $t_i \sim \mathrm{Exp}(1)$ (Jowhari et al., 2010, Jayaram et al., 2018). The argmax is returned; by stability of exponentials, the distribution is exactly $|x_i|^p/\|x\|_p^p$ .
CountSketch heavy-hitter recovery: Linear sketches (CountSketch) recover the heavy coordinate efficiently (Jowhari et al., 2010).
Norm estimation: Additional small-space sketches (e.g., AMS) yield rough $\ell_p$ -norm estimates (Lin et al., 9 Aug 2025).
Multi-level statistical tests: Multiple independent trials and gap tests amplify correctness (Swartworth et al., 29 Nov 2025, Lin et al., 9 Aug 2025).

A canonical method initializes random scaling parameters, runs CountSketches and norm estimators for recovery, and employs a statistical test to avoid ambiguous outputs. For $0 < p < 2$, optimal space is $O(\log^2 n \log(1/\delta))$ bits, perfect sampling is achievable in $\mathrm{polylog}(n)$ update time (Swartworth et al., 29 Nov 2025).

$p$ regime	Space (bits)	Update time	Reference
$p \in (1,2)$	$O(\varepsilon^{-p} \log^2 n \log(1/\delta))$	$\mathrm{polylog}(n)$	(Jowhari et al., 2010, Swartworth et al., 29 Nov 2025)
$p \in (0,1)$	$O(\varepsilon^{-1} \log^2 n \log(1/\delta))$	$\mathrm{polylog}(n)$	(Jowhari et al., 2010)
$p = 1$	$O(\varepsilon^{-1} \log(1/\varepsilon) \log^2 n \log(1/\delta))$	$\mathrm{polylog}(n)$	(Jowhari et al., 2010)
$p = 0$	$O(\log^2 n \log(1/\delta))$	$\mathrm{polylog}(n)$	(Jowhari et al., 2010)

For $p=2$ , a logarithmic factor in space is currently unavoidable under present techniques (Swartworth et al., 29 Nov 2025).

3. Extensions: $p > 2$ and General $G$ -Sampling

For $p > 2$ , perfect $L_p$ sampling requires fundamentally more space. Recent advances employ a sampling-and-rejection method:

Instantiate $N = \Theta(n^{1-2/p})$ independent perfect $L_2$ samplers.
For each $L_2$ sample, estimate $x_j^{p-2}$ via CountSketch and approximate rejection step.
Accept with probability proportional to $|x_j|^p/(N\|x\|_p^p)$ (Woodruff et al., 9 Apr 2025).

Generalizations to non–scale-invariant $G$ functions use similar sampling and rejection logic, extending to polynomials, logarithms, or capped powers, with perfect sampling in $\tilde O(n^{1-2/p})$ bits for polynomials and $O(\log^2 n)$ bits for $G(z) = \log(1 + |z|)$ and $G(z) = \min(T, |z|^p)$ (Woodruff et al., 9 Apr 2025).

4. Derandomization, Update Complexity, and Lower Bounds

Derandomization: PRGs such as Nisan’s and Gopalan–Kane–Meka enable deterministic sampling with negligible increase in space (Jayaram et al., 2018, Swartworth et al., 29 Nov 2025).
Update time: Recent algorithms attain $\mathrm{polylog}(n)$ per-update cost for $0 < p < 2$ by simulating exponentials and their rescalings efficiently, using Fourier inversion and rapid quadrature (Swartworth et al., 29 Nov 2025).
Lower bounds: One-pass streaming algorithms for $L_p$ sampling on $\{0,\pm1\}$ vectors require $\Omega(\log^2 n)$ bits, matching optimal algorithms. Heavy hitters and duplicates detection admit matching lower bounds $O(\phi^{-p} \log^2 n)$ bits (Jowhari et al., 2010).
Truly perfect sampling: In the general turnstile model, achieving truly perfect sampling ( $\epsilon = \gamma = 0$ ) requires $\Omega(n)$ space, but insertion-only or sliding-window models allow $O(n^{1-1/p} \mathrm{polylog} n)$ bits for $p \ge 1$ (Jayaram et al., 2021).

5. Practical Implications and Continuous Sampling

In practical data streaming, $L_p$ -sampling algorithms enable:

Duplicates detection: $L_1$ sampling gives $O(\log^2 n)$ bits for one-pass detection, improving previous bounds.
Heavy hitters: Matching $O(\phi^{-p} \log^2 n)$ upper and lower bounds in turnstile models.
Continuous sampling: Algorithms maintain a valid sample at each time, supporting applications in distributed monitoring and online statistics (Lin et al., 9 Aug 2025).
Regression and coresets: In turnstile matrix streams, $\ell_p$ leverage score and row sampling enables streaming coreset constructions for regression and neural learning tasks (Munteanu et al., 2024).
Distributed monitoring: Perfect $L_p$ sampling over $k$ servers is resolved for all $p \ge 1$ , with optimal $O(k^{p-1} \mathrm{polylog}(n))$ communication (Lin et al., 26 Oct 2025).

6. Open Questions and Future Directions

Major unresolved issues include:

Eliminating the remaining $\log n$ factor for $p=2$ .
Achieving truly perfect sampling (zero additive error) in general turnstile or adversarial streams without linear space (Jayaram et al., 2021).
Reducing the polylogarithmic dependence on $\varepsilon$ in approximate sampling.
Extensions to more general non-linear update patterns, adaptive data streams, and privacy settings (Lin et al., 9 Aug 2025, Jayaram et al., 2021).

7. References and Historical Context

Initial upper bounds were given by Monemizadeh and Woodruff (2010) for approximate $L_p$ sampling (Jowhari et al., 2010). Subsequent work by Andoni, Krauthgamer, and Onak improved the space-time complexity (Jowhari et al., 2010). The first perfect $L_p$ samplers for $p \in (0,2)$ were constructed by Jayaram and Woodruff, with later derandomization and efficient update-time results (Jayaram et al., 2018, Swartworth et al., 29 Nov 2025). For $p > 2$ , Woodruff, Xie, and Zhou established new tight bounds for perfect and polynomial samplers (Woodruff et al., 9 Apr 2025). Practical frameworks for regression were developed with $\ell_p$ leverage-score sampling in turnstile matrix models, yielding the first streaming coresets for logistic regression (Munteanu et al., 2024). Distributed monitoring with adversarial robustness is now solved up to log factors for all $p \ge 1$ (Lin et al., 26 Oct 2025).

Streaming $L_p$ -sampling has been central in closing the gap between theoretical lower bounds and real-time data analytics, delivering near-optimal algorithms for a spectrum of streaming and monitoring settings.