
Privacy Amplification by Subsampling

Updated 4 October 2025
  • Privacy amplification by subsampling is a technique where using a random subset of data with a DP mechanism reduces overall privacy loss.
  • The method leverages mathematical formulations, such as log-form expressions and Rényi divergence, to provide tighter privacy guarantees.
  • It is applied in scalable learning frameworks like DP-SGD and federated learning to achieve high utility while maintaining strong privacy protections.

The privacy-amplification-by-subsampling technique is a foundational principle in differential privacy, articulating that running a differentially private mechanism on a random subsample of a population provides a significantly reduced privacy loss compared to running the mechanism on the entire dataset. This phenomenon has enabled the design of high-utility differentially private algorithms, particularly in iterative and large-scale data analysis settings, by leveraging randomness not only to mask individual data points but also to facilitate efficient composition and enhanced group privacy.

1. Formal Principle and Unified Divergence Framework

At the core, privacy amplification by subsampling capitalizes on the uncertainty in whether any individual's data instance is included in a randomly chosen subsample. Let $M$ denote a mechanism satisfying $(\varepsilon, \delta)$-DP, and let $S$ denote a subsampling operator that selects a random subset (e.g., $m$ out of $n$ elements). If $M^S$ denotes the composed mechanism $M \circ S$, then $M^S$ satisfies an improved DP guarantee:

$$\varepsilon' = \log(1 + \eta (e^\varepsilon - 1)), \qquad \delta' = \eta \delta,$$

where $\eta$ is the inclusion probability of a given data point (e.g., $m/n$ for uniform sampling without replacement). This log-form expression is the mathematical underpinning for the effective privacy improvement, and for small $\varepsilon$ it simplifies to approximately $\varepsilon' \approx \eta \varepsilon$.
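
As a quick numerical illustration, the amplified parameters follow directly from this formula. Below is a minimal Python sketch (the function name and interface are illustrative, not from any particular DP library):

```python
import math

def amplify_by_subsampling(eps: float, delta: float, eta: float):
    """Amplified (eps', delta') for a base (eps, delta)-DP mechanism
    applied to a subsample with inclusion probability eta."""
    eps_prime = math.log(1 + eta * (math.exp(eps) - 1))
    return eps_prime, eta * delta

# A (1.0, 1e-5)-DP mechanism run on a 1% subsample:
print(amplify_by_subsampling(1.0, 1e-5, 0.01))
# -> (~0.017, 1e-7); note eps' ~= eta * eps for small eps
```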

Recent work generalizes these guarantees through the lens of Rényi differential privacy (RDP) and divergence-based analysis, notably via couplings and optimal transport (Balle et al., 2018). The advanced joint convexity property of $\alpha$-divergences enables tight, mechanism-specific analysis across subsampling schemes (without replacement, with replacement, Poisson), under various neighboring relations (substitution, add/remove-one):

$$e^{\varepsilon'} = 1 + \eta(e^{\varepsilon} - 1)$$

where $\eta$ is the total variation distance between the subsampling distributions induced by neighboring datasets. For sampling with replacement, group privacy profiles are invoked, and for general mechanisms, the approach extends to mixtures of probability measures and their coupled divergences.

2. Mechanism-Agnostic and Mechanism-Specific Amplification Bounds

Traditional DP literature developed mechanism-agnostic subsampling bounds (e.g., log-sublinear in the sampling rate for $(\varepsilon, \delta)$-DP), but recent research emphasizes mechanism-specific amplification, especially via RDP:

  • For a base mechanism $M$ satisfying $(\alpha, \varepsilon(\alpha))$-RDP, subsampling with ratio $\gamma = m/n$ gives a compounded RDP parameter:

$$\varepsilon'(\alpha) \leq \frac{1}{\alpha-1} \log \left[ 1 + \gamma^2 C(\alpha,2) \min\{4(e^{\varepsilon(2)}-1), \dots \} + \sum_{j=3}^{\alpha} \gamma^j C(\alpha, j) e^{(j-1)\varepsilon(j)} \dots \right]$$

(Wang et al., 2018); see the numerical sketch below.

  • The framework has been extended to general mixtures and compositions: given a subsampling mechanism $S$ and base mechanism $B$, the divergence of the composed mixture can be tightly upper-bounded using optimal couplings and the joint convexity of the divergence (Schuchardt et al., 7 Mar 2024):

$$\Psi_\alpha(m_x \,\|\, m_{x'}) \leq \int_{Y \times Y} \Psi_\alpha(b_{y^{(1)}} \,\|\, b_{y^{(2)}}) \, d\Gamma\big((y^{(1)}, y^{(2)})\big)$$

where $\Gamma$ is a coupling of the sampling distributions for $x$ and $x'$, and $b_{y^{(i)}}$ denotes the conditional densities of $B$.

This leads to improvements over previous bounds, especially for group privacy, by constructing distance-compatible couplings and conditioning on events such as “how many sensitive elements are sampled,” which allows the amplification bound to reflect both the base mechanism's behavior and the precise sampling procedure.
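
As a concrete illustration, the Wang et al. bound above can be evaluated numerically. The following is a hedged sketch, with two caveats: the elided min-terms are dropped (which only loosens, and hence preserves, the upper bound), and $C(\alpha, j)$ is taken to be the binomial coefficient, as in the original analysis:

```python
from math import comb, exp, log

def subsampled_rdp_bound(alpha: int, gamma: float, eps) -> float:
    """Looser variant of the bound above: elided min-terms are dropped,
    which can only enlarge (and hence still upper-bound) the bracketed
    expression. C(alpha, j) is taken to be the binomial coefficient;
    eps(j) returns the base mechanism's RDP parameter at integer order j."""
    bracket = 1.0 + gamma**2 * comb(alpha, 2) * 4 * (exp(eps(2)) - 1)
    for j in range(3, alpha + 1):
        bracket += gamma**j * comb(alpha, j) * exp((j - 1) * eps(j))
    return log(bracket) / (alpha - 1)

# Gaussian mechanism with noise multiplier sigma: eps(j) = j / (2 sigma^2).
sigma, gamma, alpha = 4.0, 0.01, 8
amplified = subsampled_rdp_bound(alpha, gamma, lambda j: j / (2 * sigma**2))
print(amplified)  # ~1e-4, versus 0.25 at order 8 without subsampling
```

The output illustrates the quadratic dependence on $\gamma$ that drives amplification at small sampling rates.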

3. Extensions: Group Privacy, Compositionality, and Structured Subsampling

The amplification effect extends naturally to group privacy: modifying $k$ elements jointly typically incurs a privacy loss that scales with $k$ (linearly, in worst-case group privacy). However, subsampling reduces the chance that all $k$ elements are included, and sophisticated coupling arguments strengthen the bound (Schuchardt et al., 7 Mar 2024):

  • Group privacy amplification for a group of size $k$ reflects the exponentially vanishing probability that multiple modified elements are included together under the subsampling process, and the framework provides RDP guarantees that are tight, often substantially outperforming naive group-multiplied bounds; the sketch below makes these inclusion probabilities concrete.
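
The combinatorial driver of this effect is easy to see for uniform sampling without replacement. A minimal sketch of the inclusion probabilities only (not the full coupled-divergence bound):

```python
from math import comb

def prob_group_fully_included(n: int, m: int, k: int) -> float:
    """Probability that all k modified records land in a uniform
    without-replacement sample of size m drawn from n records."""
    return comb(n - k, m - k) / comb(n, m)

# With n = 10_000 and m = 100 (a 1% sample), the chance that a whole
# group is sampled together decays roughly like (m/n)**k:
for k in range(1, 5):
    print(k, prob_group_fully_included(10_000, 100, k))
```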

Privacy amplification via subsampling is highly compatible with the composition property of RDP—mechanism-specific bounds for single steps compose additively in the divergence domain, allowing tight privacy accounting across multiple iterative steps. This is especially exploited in the moments accountant techniques for differentially private stochastic optimization (Wang et al., 2018, Steinke, 2022), where the per-iteration privacy loss is “amplified” and the total composed loss is minimized.
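
To make the accounting concrete, here is a minimal self-contained sketch of additive RDP composition followed by the standard RDP-to-$(\varepsilon, \delta)$ conversion; the per-step value is illustrative (e.g., the output of the previous sketch):

```python
import math

def compose_and_convert(per_step_rdp: float, steps: int,
                        alpha: int, delta: float) -> float:
    """RDP composes additively across iterations; the standard
    conversion eps = total_rdp + log(1/delta)/(alpha - 1) then
    yields an (eps, delta)-DP guarantee."""
    total_rdp = steps * per_step_rdp
    return total_rdp + math.log(1 / delta) / (alpha - 1)

# 10,000 iterations at an amplified per-step RDP of ~1.1e-4 (order 8):
print(compose_and_convert(1.1e-4, steps=10_000, alpha=8, delta=1e-5))
# -> eps ~= 2.7
```

In practice, an accountant evaluates this over a grid of Rényi orders and reports the minimum.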

With structured or conditional subsampling, amplification effects can be further tuned, as when restricting records to participate only in a fixed subset of epochs, or via model partitioning or partitioned parameter updates in federated learning (Dong et al., 4 Mar 2025). In time series forecasting, "structured subsampling" is implemented by first sampling time series, then contiguous windows, then partitioning into context/forecast, with each step adding to the privacy amplification effect (Schuchardt et al., 4 Feb 2025). For random allocation, where a record is included in exactly $k$ of $t$ rounds, the effective privacy loss closely approximates that under Poisson subsampling with rate $k/t$ (Feldman et al., 12 Feb 2025).
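
The participation statistics behind the random-allocation result can be checked empirically; the following Monte Carlo sketch compares only the marginal inclusion counts (it does not, of course, verify the divergence-level claim):

```python
import random

def allocation_rounds(k: int, t: int) -> set[int]:
    """Random allocation: the record participates in exactly k of t rounds."""
    return set(random.sample(range(t), k))

def poisson_rounds(rate: float, t: int) -> set[int]:
    """Poisson subsampling: each round includes the record independently."""
    return {r for r in range(t) if random.random() < rate}

k, t, trials = 3, 100, 50_000
avg_alloc = sum(len(allocation_rounds(k, t)) for _ in range(trials)) / trials
avg_poisson = sum(len(poisson_rounds(k / t, t)) for _ in range(trials)) / trials
print(avg_alloc, avg_poisson)  # both average ~k participations per record
```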

4. Algorithmic and Statistical Implications

The amplification-by-subsampling principle is now a central design tool in scalable differentially private learning:

| Subsampling Modality | Amplified Privacy Parameter $\varepsilon'$ | Notes on Application Domain |
| --- | --- | --- |
| Uniform without replacement | $\log(1 + (m/n)(e^\varepsilon - 1))$ | Canonical for DP-SGD minibatching |
| Poisson | Similar log-form in $\gamma$ (the sampling rate) | Used in moments accountant, streaming |
| Group privacy (size $k$) | Reduced exponentially in $k$ under optimal coupling | E.g., analysis for shuffling/clustering |
| Personalized/importance | $\log(1 + q(x)(e^{\varepsilon(w,x)} - 1))$, $w = 1/q(x)$ | Enables individualized privacy-utility trade-offs |

In statistical privacy ("average-case" or entropy-based privacy), subsampling enhances privacy in proportion to how strongly the inclusion of the sensitive record is diluted, and the amplification curve as a function of the privacy parameter $\varepsilon$ is governed by the same logarithmic relationship (Breutigam et al., 15 Apr 2025):

$$\varepsilon' = \log(1 + (m/n)(e^{\varepsilon} - 1))$$

This framework allows explicit guidance on when subsampling yields net privacy gains beyond what DP would formally provide in worst-case adversary assumptions.

5. Practical Design Guidance and Empirical Observations

A direct outcome of the amplification analysis is that stochastic optimization routines (e.g., DP-SGD) can use much smaller noise scales per iteration by exploiting the much smaller effective privacy cost per minibatch, enabling high-utility learning under realistic privacy budgets (Wang et al., 2018, Steinke, 2022).
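
For orientation, here is a minimal sketch of such a step, assuming Poisson minibatch sampling, per-example clipping, and Gaussian noise; all names and constants are illustrative rather than taken from any specific DP-SGD implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_sgd_step(params, per_example_grads, clip_norm, sigma, rate, lr=0.1):
    """One DP-SGD step: Poisson-subsample the examples, clip each
    per-example gradient to clip_norm, sum, add Gaussian noise with
    standard deviation sigma * clip_norm, and normalize by the
    expected lot size. Amplification by subsampling is what lets
    sigma stay modest per iteration."""
    n = per_example_grads.shape[0]
    mask = rng.random(n) < rate                       # Poisson subsampling
    selected = per_example_grads[mask]
    if selected.shape[0] > 0:
        norms = np.maximum(
            np.linalg.norm(selected, axis=1, keepdims=True), 1e-12)
        grad_sum = (selected * np.minimum(1.0, clip_norm / norms)).sum(axis=0)
    else:
        grad_sum = np.zeros_like(params)
    noisy = grad_sum + rng.normal(0.0, sigma * clip_norm, size=params.shape)
    return params - lr * noisy / (rate * n)           # expected lot size

# Toy usage: 1,000 examples, 5-parameter model, 1% sampling rate.
params = np.zeros(5)
grads = rng.normal(size=(1000, 5))
params = dp_sgd_step(params, grads, clip_norm=1.0, sigma=1.0, rate=0.01)
```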

Moreover, when using importance or stratified sampling (including coreset construction), the subsampling distribution can be tuned for optimal trade-off between privacy, computational efficiency, and utility (Fay et al., 2023). In federated and distributed settings, random participation, model partitioning, or time uncertainty (as in random check-ins) can be viewed as specialized instances of structured subsampling yielding similar, and sometimes superior, amplification compared to standard shuffling or Poisson subsampling (Hasircioglu et al., 2022, Balle et al., 2020, Girgis et al., 2021).

Key observations from empirical studies include:

  • Amplification is strongest for small sampling rates (minibatch sizes much smaller than dataset size).
  • Structured or multistage subsampling (e.g., MUST) can outperform classical one-stage sampling, especially in computational efficiency and when only a fixed number of unique samples need to be evaluated (Zhao et al., 2023).
  • Applications to time series or sequential data, and extensions to quantum algorithms, exploit the same principle with appropriate analytical substitutions—typically by calculating the effective inclusion rate or average squared sensitivity for records/events (Schuchardt et al., 4 Feb 2025, Angrisani et al., 2022).

6. Limitations and Future Directions

While privacy amplification by subsampling is broadly effective, certain challenges remain:

  • It provides no significant benefit for records that are always present in the sample (e.g., "must-include" datasets or deterministic batching).
  • Multistage subsampling amplifies $\varepsilon$ strongly but can have varied (and sometimes worse) effects on groupwise $\delta$ parameters, especially when sampling with replacement increases the multiplicity of records in the output.
  • Realizing the full benefit requires careful accounting of neighboring relations, personalized inclusion rates, and group event logic, as captured in the recent optimal-transport-based analyses (Schuchardt et al., 7 Mar 2024).

Ongoing directions include adaptive partition strategies, further tightening the optimal bounds for non-i.i.d. or structured randomness in the sampling process, and extending these insights to quantum and average-case privacy frameworks.


In summary, privacy amplification by subsampling provides a mathematically rigorous and practically indispensable mechanism for improving the privacy-utility trade-off in differentially private algorithms. Its logic is now formalized not only for classical data and mechanisms but also for structured, hybrid, personalized, and even quantum settings, shaping the next generation of privacy-preserving learning systems.
