Accelerating Private Heavy Hitter Detection on Continual Observation Streams (2507.03361v1)
Abstract: Differentially private frequency estimation and heavy hitter detection are core problems in the private analysis of data streams. Two models are typically considered: the one-pass model, which outputs results only at the end of the stream, and the continual observation model, which requires releasing private summaries at every time step. While the one-pass model allows more efficient solutions, continual observation better reflects scenarios where timely and ongoing insights are critical. In the one-pass setting, sketches have proven to be an effective tool for differentially private frequency analysis, as they can be privatized by a single injection of calibrated noise. In contrast, existing methods in the continual observation model add fresh noise to the entire sketch at every step, incurring high computational costs. This challenge is particularly acute for heavy hitter detection, where current approaches often require querying every item in the universe at each step, resulting in untenable per-update costs for large domains. To overcome these limitations, we introduce a new differentially private sketching technique based on lazy updates, which perturbs and updates only a small, rotating part of the output sketch at each time step. This significantly reduces computational overhead while maintaining strong privacy and utility guarantees. In comparison to prior art, for frequency estimation, our method improves the update time by a factor of $O(w)$ for sketches of dimension $d \times w$; for heavy hitter detection, it reduces per-update complexity from $\Omega(|U|)$ to $O(d \log w)$, where $U$ is the input domain. Experiments show an increase in throughput by a factor of $250$, making differential privacy more practical for real-time, continual-observation applications.
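To make the lazy-update idea concrete, here is a toy sketch of a $d \times w$ Count-Min-style structure whose published copy is refreshed one column per time step, so each update perturbs $d$ cells rather than all $d \cdot w$. This is an illustration only, not the paper's algorithm: the abstract does not specify the mechanism, and the class name `LazyNoisySketch`, the column-wise rotation, the SHA-256 hashing, and the `noise_scale` parameter are all assumptions. In particular, `noise_scale` is a placeholder and is not calibrated to give any differential privacy guarantee.

```python
import hashlib
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


class LazyNoisySketch:
    """Toy d x w sketch; the published copy is refreshed one column per step."""

    def __init__(self, d, w, noise_scale, seed=0):
        self.d, self.w = d, w
        self.noise_scale = noise_scale  # assumption: placeholder, NOT calibrated
        self.rng = random.Random(seed)
        self.exact = [[0] * w for _ in range(d)]        # private counters
        self.published = [[0.0] * w for _ in range(d)]  # released noisy copy
        self.next_col = 0       # column to refresh at the next step
        self.cells_touched = 0  # number of noisy cells written so far

    def _col(self, row, item):
        # Hypothetical per-row hash: SHA-256 of "row:item", reduced mod w.
        digest = hashlib.sha256(f"{row}:{item}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.w

    def update(self, item):
        # Exact update: one counter per row.
        for r in range(self.d):
            self.exact[r][self._col(r, item)] += 1
        # Lazy release: re-noise only the current column (d cells).
        c = self.next_col
        for r in range(self.d):
            noise = laplace(self.noise_scale, self.rng)
            self.published[r][c] = self.exact[r][c] + noise
            self.cells_touched += 1
        self.next_col = (c + 1) % self.w

    def estimate(self, item):
        # Count-Min style estimate, read from the published copy only.
        return min(self.published[r][self._col(r, item)] for r in range(self.d))
```

An eager variant would rewrite all $d \cdot w$ published cells on every update. The rotation cuts that by a factor of $w$, in line with the $O(w)$ improvement stated in the abstract, at the cost of each published cell being up to $w$ steps stale.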