
Randomness Complexity in Differential Privacy

Updated 26 October 2025
  • Randomness complexity in differential privacy is defined as the minimum amount of internal randomness required to guarantee privacy and accuracy in counting query releases.
  • The topic explores techniques like randomized shifting that compress the randomness needed for multiple queries to O(log d), balancing privacy with computational efficiency.
  • It highlights practical implications for large-scale deployments, where sharing randomness across queries significantly reduces resource demands while maintaining strict DP guarantees.

Randomness complexity in differential privacy quantifies the minimum amount of internal randomness required by privacy-preserving mechanisms to achieve their rigorous stability guarantees. In the context of classical differential privacy (DP), randomness is integral for adding calibrated noise to dataset queries, masking the influence of single records and ensuring that an adversary cannot distinguish neighboring datasets. The study of randomness complexity investigates the least possible number of random bits necessary for accurate and private outputs, the trade-offs when answering many queries, and how the structure of the mechanism impacts the amount of noise (and therefore randomness) that must be generated. Recent work, particularly in the context of counting queries, shows that naively adding independent noise to each query is not randomness-optimal, and considerable savings are possible when randomness is carefully managed and shared across outputs.

1. Fundamental Concepts: Differential Privacy and Randomness

Differential privacy (DP) is satisfied by a randomized mechanism $\mathcal{M}$ if for all neighboring datasets $D, D'$ and all measurable output subsets $\mathcal{O}$,

$$P[\mathcal{M}(D) \in \mathcal{O}] \leq e^\varepsilon \cdot P[\mathcal{M}(D') \in \mathcal{O}] + \delta$$

where $\varepsilon$ is the privacy parameter and $\delta$ is a small failure probability. The randomness complexity of a DP mechanism refers to the minimum expected or worst-case number of random bits required to ensure these guarantees, subject to a specified accuracy bound for the released statistics. For one-way counting queries, this problem takes a concrete form: output the number of database entries satisfying each predicate in a list $\mathcal{P}_1, \ldots, \mathcal{P}_d$ under $(\varepsilon, \delta)$-DP, while minimizing random-bit usage for a given additive error.
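As a concrete baseline (a standard textbook mechanism, not the randomness-efficient one discussed later), a single counting query can be released with the Laplace mechanism; the function name below is illustrative:

```python
import math
import random

def laplace_counting_release(true_count, epsilon):
    """Release one counting query under epsilon-DP: a counting query has
    sensitivity 1, so Laplace noise with scale 1/epsilon suffices."""
    scale = 1.0 / epsilon
    # Inverse-CDF sampling of Laplace noise from one uniform draw.
    # (u = -0.5 exactly would need a guard in production code.)
    u = random.random() - 0.5          # u in [-0.5, 0.5)
    noise = -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise

random.seed(0)
answer = laplace_counting_release(true_count=128, epsilon=1.0)
```

Note that each released answer consumes a fresh real-valued uniform sample; it is exactly this per-query randomness cost that the derandomization results below aim to reduce.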

Historically, large-scale DP deployments, such as the 2020 U.S. Census under the Disclosure Avoidance System, used tens of terabytes of randomness due to the sheer volume of queries, motivating the formal study of derandomization and randomness-efficient DP mechanisms (Garfinkel et al., 2020).

2. Lower Bounds and the Need for Randomness in DP Query Release

For a single counting query, it is known that any $\varepsilon$-differentially private, reasonably accurate mechanism requires nearly one bit of true randomness per answer; that is, one cannot achieve privacy by deterministic means [CSV25]. This lower bound is essentially tight: to perturb the query sufficiently so that an adversary cannot pinpoint the participation of a single individual, the randomized output must have enough entropy, making the random bit requirement inescapable.

However, when many queries are answered, especially in batch, statistical correlations and the structure of noise injection enable stronger derandomization. In particular, [CSV25] shows that although $d$ independent queries appear to need $d$ random bits, the total randomness required can be compressed to $O(\log d)$ bits while still meeting the same accuracy and privacy targets, due to the ability to share randomness across coordinates without increasing the risk of privacy loss.

3. Classical Versus Efficient Derandomization Schemes

Earlier derandomization results for $d$ counting queries (e.g., [CSV25]) are based on rounding schemes, a class of combinatorial objects that partition the range of possible outputs in a noise-efficient manner. These mechanisms achieve near-optimal randomness complexity, with the expected number of bits

$$R_0(\mathcal{M}) \leq \left\lceil \frac{d}{\ell} \right\rceil \log_2(\ell + 1) + \log_2(1/\delta) + O(1)$$

for suitably chosen integer $\ell$: taking $\ell = d$ yields $O(\log d)$ random bits. However, such constructions are not known to be efficiently computable; the existence proofs for appropriate rounding schemes are nonconstructive, leaving them impractical to implement.

The new mechanism introduced in (Ghentiyala, 19 Oct 2025) addresses these limitations by providing a polynomial-time implementable derandomization for $d$ counting queries, eschewing rounding schemes in favor of an approach that is both more intuitive and computationally feasible.

4. Polynomial-Time Randomness-Efficient Mechanism: Randomized Shifting and Selective Noise

The key technical innovation is the use of a randomized shift, $\omega$, applied uniformly to all coordinates before adding noise and discretizing the result. Specifically, for input counts $v \in \mathbb{Z}^d$ and independently drawn noise $\eta_i$ (e.g., discrete Gaussian), the mechanism proceeds as follows:

  • A common random shift $\omega$ is generated using $O(\log s)$ bits, with $\omega$ uniformly chosen from $\{r, 2r, \ldots, sr\}$ for parameters $r, s$.
  • The output is computed as $y = \lfloor v + (\omega, \ldots, \omega) + \eta \rfloor_{rs}$, where $\lfloor \cdot \rfloor_{rs}$ rounds each coordinate to the nearest multiple of $rs$.

A central insight is that for most coordinates, the effect of $\eta_i$ is absorbed by the rounding, so that the outcome for those coordinates is insensitive to the precise noise realization; hence, fresh random bits for $\eta_i$ are required only in a small fraction of cases. For each coordinate, with probability at least $1 - 2/s$, the output does not depend on that coordinate's individual noise sample, so the expected number of coordinates needing fresh randomness is $2d/s$. Choosing $s$ large (e.g., $s \asymp d \cdot \mathrm{polylog}(d/\varepsilon)$) yields total expected randomness of $O(\log d)$ bits.
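The shift-and-selective-noise idea can be sketched as follows. This is an illustrative simplification under stated assumptions, not the paper's exact algorithm: `noise_bound` stands in for a high-probability bound on the noise magnitude, and all parameter choices and names are hypothetical.

```python
import random

def rounded_shift_release(v, r, s, noise_bound, noise_sampler):
    """Sketch of randomized shifting with selective noise.

    v            : list of integer counts
    r, s         : shift granularity and shift-range parameters
    noise_bound  : assumed high-probability bound on |noise|
    noise_sampler: draws one noise sample (discrete Gaussian in the paper)
    """
    rs = r * s
    # One shared shift for all d coordinates: only O(log s) random bits.
    omega = r * random.randint(1, s)
    outputs, fresh_draws = [], 0
    for vi in v:
        shifted = vi + omega
        nearest = round(shifted / rs) * rs
        # If the shifted count is far from a rounding boundary, any noise of
        # magnitude <= noise_bound is absorbed by the rounding, so this
        # coordinate needs no fresh randomness at all.
        if rs / 2 - abs(shifted - nearest) > noise_bound:
            outputs.append(nearest)
        else:
            fresh_draws += 1
            noisy = shifted + noise_sampler()
            outputs.append(round(noisy / rs) * rs)
    return outputs, fresh_draws

random.seed(1)
counts = [random.randint(0, 1000) for _ in range(1000)]
released, draws = rounded_shift_release(
    counts, r=4, s=50, noise_bound=4,
    noise_sampler=lambda: random.randint(-4, 4))
```

With these toy parameters a coordinate needs a fresh draw only when it lands in a window of width $2 \cdot \texttt{noise\_bound}$ out of each $rs$-length cell (about 4% of coordinates here), mirroring the $2/s$-type bound when the noise magnitude is on the order of $r$.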

Randomness-Accuracy Trade-off: As $s$ increases, randomness decreases, but accuracy degrades due to coarser rounding. The trade-off is explicit: $\alpha = O\left( \sqrt{d \ln d \ln(1/\delta)\, s}\,/\varepsilon \right)$ for approximate $(\varepsilon, \delta)$-DP, and

$$\alpha = O\left( \frac{d \log(d/\beta)}{\varepsilon} + \frac{s \ln s \cdot d \ln(d/\varepsilon)}{\varepsilon} \right)$$

for pure DP.
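The opposing directions of the trade-off can be made concrete with a small numeric sketch. The two functions below are heuristic proxies with all constants dropped; they are assumptions of this illustration, not formulas from the source:

```python
import math

def expected_random_bits(d, s):
    """Proxy for expected randomness: ~2d/s coordinates needing fresh
    noise, plus log2(s) bits for the shared shift."""
    return 2 * d / s + math.log2(s)

def error_scale(d, s, eps=1.0, delta=1e-6):
    """Proxy for the approximate-DP error:
    alpha ~ sqrt(d * ln d * ln(1/delta) * s) / eps."""
    return math.sqrt(d * math.log(d) * math.log(1 / delta) * s) / eps

d = 10_000
rows = [(s, expected_random_bits(d, s), error_scale(d, s))
        for s in (10, 100, 1000, 10_000)]
```

As $s$ grows toward $d$, the randomness proxy falls from thousands of bits to $O(\log d)$, while the error proxy grows only polynomially in $\sqrt{s}$.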

5. Quantitative Bounds and Mathematical Formulations

With precise parameter setting, the mechanism guarantees $(\varepsilon, \delta)$-DP and per-query accuracy,

$$R(\mathcal{M}) \leq O\left( \frac{d \cdot \mathrm{polylog}(d/\varepsilon) + \mathrm{polylog}\log(1/\delta)}{s} + \log s \right),$$

while the error $\alpha$ scales as above. In the limit $s = d \cdot \mathrm{polylog}(d/\varepsilon)$, $R(\mathcal{M}) = O(\log d)$ random bits suffice for all $d$ queries with only a modest loss in accuracy.

6. Implications, Practicality, and Future Directions

This result has immediate ramifications for large-scale, real-world deployments where cryptographically secure randomness is resource-constrained. It demonstrates that batching queries, together with common random shifts and selective noise, can exponentially compress the randomness complexity necessary for private, accurate release of statistics. The analysis also clarifies why randomness savings arise: when many queries are aggregated, much of the per-query randomness is redundant, and can be shared or omitted without weakening privacy, as long as the mechanism's output discretization absorbs ambiguities caused by noise.

Compared to previous, combinatorial approaches, this mechanism is conceptually and computationally simpler—making it attractive for further extensions to more complex analyses.

Research directions include generalizing these batch-derandomization methods to broader query families, establishing hardness results for randomness-optimal mechanisms outside counting queries, and exploring the interplay between randomness complexity, computational efficiency, and privacy for interactive or adaptive query answering.

7. Summary Table: Randomness Complexity for Counting Queries

Query Setting | Random Bits Required | Mechanism
1 counting query | $\approx 1$ | Standard DP mechanisms (Laplace/discrete Gaussian)
$d$ counting queries | $O(\log d)$ | [CSV25] rounding schemes; efficient mechanism of (Ghentiyala, 19 Oct 2025)

The polynomial-time mechanism of (Ghentiyala, 19 Oct 2025) achieves near-optimal randomness savings for private, accurate release of $d$ counting queries, providing a practical solution to the scalability bottlenecks encountered in high-dimensional DP deployments.
