Moments Accountant in Differential Privacy
- Moments accountant is a framework that rigorously tracks cumulative privacy loss in adaptive differential privacy mechanisms using log-moment generating functions.
- It underpins DP-SGD by enabling tighter privacy accounting than classical composition, leading to improved privacy-utility tradeoffs in machine learning.
- It integrates analytical and numerical optimizations, with refinements via Rényi differential privacy and f-divergences for practical, scalable implementations.
A moments accountant is a framework for tightly tracking and composing the cumulative privacy loss in mechanisms satisfying differential privacy (DP), specifically designed for the analysis of adaptive compositions in differentially private machine learning. It allows precise quantification of $(\varepsilon, \delta)$-DP guarantees by maintaining the log-moment generating function (MGF) of the privacy-loss random variable, rather than relying on classical, looser composition theorems. Moments accountant techniques are now foundational in the practice of differentially private stochastic gradient descent (DP-SGD) and are continually refined through advances in Rényi differential privacy (RDP) and information-theoretic optimization over $f$-divergences.
1. Formal Definition and Theoretical Foundations
Let $\mathcal{M}$ be a randomized mechanism with input dataset $d$ and output in some range $\mathcal{R}$. For adjacent datasets $d, d'$ and any auxiliary input $\mathsf{aux}$, define the privacy-loss random variable for outcome $o \in \mathcal{R}$:

$$c(o; \mathcal{M}, \mathsf{aux}, d, d') = \log \frac{\Pr[\mathcal{M}(\mathsf{aux}, d) = o]}{\Pr[\mathcal{M}(\mathsf{aux}, d') = o]}.$$

The $\lambda$-th log-moment of the privacy-loss random variable is:

$$\alpha_{\mathcal{M}}(\lambda; \mathsf{aux}, d, d') = \log \mathbb{E}_{o \sim \mathcal{M}(\mathsf{aux}, d)}\big[\exp\big(\lambda\, c(o; \mathcal{M}, \mathsf{aux}, d, d')\big)\big].$$

The moments accountant records the worst-case log-moment across all auxiliary inputs and neighboring datasets:

$$\alpha_{\mathcal{M}}(\lambda) = \max_{\mathsf{aux},\, d,\, d'} \alpha_{\mathcal{M}}(\lambda; \mathsf{aux}, d, d').$$

For adaptive compositions $\mathcal{M} = (\mathcal{M}_1, \dots, \mathcal{M}_k)$, the log-moments are additive:

$$\alpha_{\mathcal{M}}(\lambda) \le \sum_{i=1}^{k} \alpha_{\mathcal{M}_i}(\lambda).$$

The key mechanism for extracting $(\varepsilon, \delta)$-DP guarantees is the tail bound:

$$\delta = \min_{\lambda > 0} \exp\big(\alpha_{\mathcal{M}}(\lambda) - \lambda \varepsilon\big).$$
This approach subsumes and improves on the basic and advanced composition theorems, yielding substantially tighter privacy accounting for a given sequence of private operations (Abadi et al., 2016).
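Extracting a guarantee from tracked log-moments requires only the additivity and tail-bound steps above. The following is a minimal numpy sketch of that loop; the per-step log-moments used here are an illustrative placeholder, not the moments of any particular mechanism:

```python
import numpy as np

def compose_log_moments(per_step_log_moments, num_steps):
    """Adaptive composition: log-moments add, so alpha_total(lam) = T * alpha_step(lam)."""
    return num_steps * np.asarray(per_step_log_moments, dtype=float)

def epsilon_from_log_moments(log_moments, lambdas, delta):
    """Tail bound solved for eps: eps = min_lambda (alpha(lambda) + log(1/delta)) / lambda."""
    lambdas = np.asarray(lambdas, dtype=float)
    return float(np.min((log_moments + np.log(1.0 / delta)) / lambdas))

# Illustrative usage on an integer grid of moment orders.
lambdas = np.arange(1, 33)
per_step = 1e-4 * lambdas * (lambdas + 1)   # placeholder shape, not a real mechanism
total = compose_log_moments(per_step, num_steps=10_000)
print(epsilon_from_log_moments(total, lambdas, delta=1e-5))
```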
2. Relation to Rényi Differential Privacy and $f$-divergences
The moments accountant framework generalizes naturally to Rényi Differential Privacy (RDP), defined as follows: a mechanism $\mathcal{M}$ satisfies $(\alpha, \varepsilon)$-RDP for $\alpha > 1$ if for all adjacent $d, d'$:

$$D_\alpha\big(\mathcal{M}(d) \,\|\, \mathcal{M}(d')\big) \le \varepsilon,$$

where $D_\alpha$ denotes the Rényi divergence of order $\alpha$:

$$D_\alpha(P \,\|\, Q) = \frac{1}{\alpha - 1} \log \mathbb{E}_{x \sim Q}\left[\left(\frac{P(x)}{Q(x)}\right)^{\alpha}\right].$$

Let $K_{\mathcal{M}}(\lambda)$ denote the cumulant generating function (CGF) of the privacy-loss random variable:

$$K_{\mathcal{M}}(\lambda) = \log \mathbb{E}\big[\exp\big(\lambda\, c(o; \mathcal{M}, d, d')\big)\big] = \alpha_{\mathcal{M}}(\lambda).$$

There is a direct equivalence:

$$K_{\mathcal{M}}(\lambda) = \lambda \cdot \varepsilon_{\mathcal{M}}(\lambda + 1),$$

where $\varepsilon_{\mathcal{M}}(\alpha)$ is the RDP guarantee of $\mathcal{M}$ at order $\alpha$; that is, the log-moment at $\lambda$ equals $\lambda$ times the Rényi divergence of order $\lambda + 1$.
RDP facilitates straightforward composition (CGFs add), and the translation back to $(\varepsilon, \delta)$-DP is performed via bisection/minimization over $\lambda$ (equivalently, over the order $\alpha$) as above (Wang et al., 2018; Abadi et al., 2016).
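As a concrete instance, the Gaussian mechanism with $L_2$ sensitivity 1 and noise scale $\sigma$ satisfies $(\alpha, \alpha/(2\sigma^2))$-RDP for all $\alpha > 1$, which makes the whole composition-and-conversion loop a few lines of numpy (a minimal sketch; parameter values are illustrative):

```python
import numpy as np

def gaussian_rdp(alpha, sigma):
    """RDP of the Gaussian mechanism with L2 sensitivity 1: eps(alpha) = alpha / (2 sigma^2)."""
    return alpha / (2.0 * sigma ** 2)

def rdp_to_dp(rdp_eps, alpha, delta):
    """Classical (Markov-based) conversion: eps = eps_R + log(1/delta) / (alpha - 1)."""
    return rdp_eps + np.log(1.0 / delta) / (alpha - 1.0)

sigma, T, delta = 50.0, 100, 1e-5
alphas = np.arange(2.0, 129.0)
total_rdp = T * gaussian_rdp(alphas, sigma)        # RDP (and the CGF) adds over T steps
print(np.min(rdp_to_dp(total_rdp, alphas, delta))) # best (eps, delta)-DP over orders (~0.98)
```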
Recent advances relate these accounting schemes to information-theoretic quantities, in particular Csiszár's $f$-divergences. For example, the hockey-stick (or $E_\gamma$) divergence and the Hellinger divergence of order $\alpha$ play key roles:
- $(\varepsilon, \delta)$-DP is characterized by the bound $E_{e^\varepsilon}\big(\mathcal{M}(d) \,\|\, \mathcal{M}(d')\big) \le \delta$ for all adjacent $d, d'$.
- $(\alpha, \varepsilon)$-RDP is characterized by bounds on the Rényi divergence $D_\alpha$, equivalently on the Hellinger divergence of order $\alpha$, of which $D_\alpha$ is a monotone function (Asoodeh et al., 2020); a numerical check of the first characterization follows.
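The hockey-stick characterization can be evaluated numerically for a single Gaussian query. The sketch below integrates $E_\gamma(P \,\|\, Q) = \int \max(p - \gamma q, 0)\,dx$ on a grid; `hockey_stick_gaussian` is an illustrative helper, not a library function:

```python
import numpy as np
from scipy.stats import norm

def hockey_stick_gaussian(eps, sigma, lo=-30.0, hi=30.0, n=600_001):
    """Numerically evaluate E_gamma(P || Q) with gamma = exp(eps), for
    P = N(1, sigma^2), Q = N(0, sigma^2): one Gaussian query of sensitivity 1."""
    x, dx = np.linspace(lo, hi, n, retstep=True)
    p = norm.pdf(x, loc=1.0, scale=sigma)
    q = norm.pdf(x, loc=0.0, scale=sigma)
    return float(np.maximum(p - np.exp(eps) * q, 0.0).sum() * dx)

# delta(eps) for a single Gaussian query: ~0.127 at eps = 1, sigma = 1.
print(hockey_stick_gaussian(eps=1.0, sigma=1.0))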
3. Algorithmic Implementation and Analytical Moments Accountant (AMA)
The Analytical Moments Accountant (AMA) maintains and updates the CGFs for all contributing mechanisms, supporting arbitrary adaptive and subsampled compositions. The essential steps are:
- Track, for each mechanism $\mathcal{M}_i$, a symbolic or oracle CGF $K_{\mathcal{M}_i}(\lambda)$.
- For data subsampling (e.g., SGD with minibatching at rate $\gamma$), upper-bound the RDP parameter of the subsampled mechanism using the tight amplification theorem, which for integer orders $\alpha$ takes the form
$$\varepsilon_{\mathcal{M} \circ \mathsf{subsample}}(\alpha) \le \frac{1}{\alpha - 1} \log\Big(1 + \sum_{j=2}^{\alpha} \binom{\alpha}{j} \gamma^j B_j\Big),$$
where each coefficient $B_j$ depends on the RDP constants $\{\varepsilon_{\mathcal{M}}(j')\}_{j' \le j}$ of the base mechanism.
- In AMA, composition is performed by adding CGFs: $K_{\mathcal{M}}(\lambda) = \sum_i K_{\mathcal{M}_i}(\lambda)$.
- $(\varepsilon, \delta)$-DP guarantees are recovered by minimizing over $\lambda$ as described above (a minimal class-based sketch follows this list).
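A minimal shape for such an accountant is sketched below; the class and method names are hypothetical, and the CGF oracle shown is the Gaussian mechanism's closed form rather than a subsampled one:

```python
import numpy as np

class AnalyticalMomentsAccountant:
    """Minimal AMA sketch: tracks CGF oracles K_i(lambda) and composes by summation."""

    def __init__(self):
        self._mechanisms = []  # (cgf_callable, count) pairs

    def compose(self, cgf, count=1):
        """Register `count` adaptive applications of a mechanism with CGF `cgf`."""
        self._mechanisms.append((cgf, count))

    def total_cgf(self, lam):
        return sum(count * cgf(lam) for cgf, count in self._mechanisms)

    def get_epsilon(self, delta, lambdas=range(1, 256)):
        """Tail bound: eps = min_lambda (K(lambda) + log(1/delta)) / lambda."""
        return min((self.total_cgf(l) + np.log(1.0 / delta)) / l for l in lambdas)

# Gaussian mechanism: K(lambda) = lambda * eps_RDP(lambda + 1) = lambda (lambda + 1) / (2 sigma^2).
sigma = 50.0
acc = AnalyticalMomentsAccountant()
acc.compose(lambda lam: lam * (lam + 1) / (2.0 * sigma ** 2), count=100)
print(acc.get_epsilon(delta=1e-5))  # matches the RDP computation above (~0.98)
```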
Optimized implementations exploit log-domain operations (log-sum-exp, geometric tail truncation), preserve numerical convexity, and support $O(1)$-time tracking per composed mechanism (Wang et al., 2018).
4. Practical Applications in Differentially Private Learning
Moments accountant techniques, and their generalizations (including AMA), are central to the privacy analysis of private stochastic gradient descent (DP-SGD):
- At each iteration, the per-step privacy cost is quantified as a log-moment $\alpha(\lambda)$ (in practice, computed on a precomputed grid of $\lambda$ values, e.g., integer $\lambda \le 32$ in Abadi et al.).
- After $T$ iterations, the total log-moment is $T \cdot \alpha(\lambda)$, with privacy guarantees extracted by minimizing $\exp\big(T \alpha(\lambda) - \lambda \varepsilon\big)$ over $\lambda$.
- Empirically, moments accountant bounds permit an order-of-magnitude more training steps or sharper privacy–utility tradeoffs compared to classical composition. For example, with sampling rate $q = 0.01$, noise multiplier $\sigma = 4$, $\delta = 10^{-5}$, and $T = 10{,}000$ steps, the moments accountant yields $\varepsilon \approx 1.26$, compared to $\varepsilon \approx 9.34$ for strong composition (Abadi et al., 2016); the sketch after this list reproduces this figure.
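The figure above can be checked numerically. The sketch below uses the closed-form integer-order RDP bound for the Poisson-subsampled Gaussian mechanism (the formula of Mironov et al., 2019, is assumed here; Abadi et al. originally evaluated the log-moments by numerical integration):

```python
import numpy as np
from scipy.special import comb, logsumexp

def subsampled_gaussian_rdp(alpha, q, sigma):
    """Integer-order RDP of the Poisson-subsampled Gaussian mechanism (assumed closed form):
    eps(alpha) = log( sum_k C(alpha,k) (1-q)^(alpha-k) q^k exp((k^2-k)/(2 sigma^2)) ) / (alpha-1).
    """
    k = np.arange(alpha + 1)
    log_terms = (np.log(comb(alpha, k)) + (alpha - k) * np.log1p(-q)
                 + k * np.log(q) + (k * k - k) / (2.0 * sigma ** 2))
    return logsumexp(log_terms) / (alpha - 1)

q, sigma, T, delta = 0.01, 4.0, 10_000, 1e-5
eps = min(T * subsampled_gaussian_rdp(a, q, sigma) + np.log(1.0 / delta) / (a - 1)
          for a in range(2, 65))
print(f"eps ~= {eps:.2f}")  # ~1.26, in line with Abadi et al. (2016)
```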
Comprehensive empirical studies on MNIST and CIFAR-10 validate these improvements for deep learning with non-convex objectives, showing competitive test accuracy under modest privacy budgets.
5. Recent Refinements: Optimal $f$-divergence Conversion and Tighter Bounds
Asoodeh et al. (2020) present an information-theoretically optimal extension of the moments accountant, leveraging the joint range of $f$-divergences for a tighter conversion from RDP to $(\varepsilon, \delta)$-DP:
- The minimal $\delta$ such that an $(\alpha, \varepsilon_R)$-RDP mechanism is $(\varepsilon, \delta)$-DP is
$$\delta = \frac{\exp\big((\alpha - 1)(\varepsilon_R - \varepsilon)\big)}{\alpha - 1} \left(1 - \frac{1}{\alpha}\right)^{\alpha}.$$
- The refined moments accountant bound replaces the classical loose Markov-based conversion with
$$\delta = \min_{\alpha > 1} \frac{\exp\big((\alpha - 1)(T\, \varepsilon_R(\alpha) - \varepsilon)\big)}{\alpha - 1} \left(1 - \frac{1}{\alpha}\right)^{\alpha},$$
where $T$ is the number of SGD steps and $\varepsilon_R(\alpha)$ encodes the per-step RDP guarantee.
- For a fixed privacy budget, the refined bounds allow roughly 100 extra iterations of DP-SGD under standard settings, representing a substantial utility gain.
The classical conversion is a one-sided Chernoff/Markov bound, whereas the refined analysis is information-theoretically tight, optimizing over the entire joint range of the relevant $f$-divergences.
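The gap between the two conversions is easy to see numerically. In the sketch below, the refined formula is taken as stated above, and the per-step RDP value and order are illustrative placeholders:

```python
import numpy as np

def delta_classical(alpha, rdp_eps_total, eps):
    """One-sided Markov/Chernoff conversion: delta = exp((alpha-1)(eps_R - eps))."""
    return np.exp((alpha - 1.0) * (rdp_eps_total - eps))

def delta_refined(alpha, rdp_eps_total, eps):
    """Refined conversion (Asoodeh et al., 2020), as stated above."""
    return (np.exp((alpha - 1.0) * (rdp_eps_total - eps))
            / (alpha - 1.0) * (1.0 - 1.0 / alpha) ** alpha)

# Illustrative per-step RDP of a subsampled Gaussian at order alpha = 18.
alpha, eps, per_step = 18.0, 1.26, 5.8e-5
for T in (4200, 4305):
    print(T, delta_classical(alpha, T * per_step, eps),
          delta_refined(alpha, T * per_step, eps))
```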
6. Comparative Empirical Performance and Practical Impact
Empirical evaluation under fixed privacy budgets demonstrates that the moments accountant enables more accurate, longer training regimes:
| Method | Maximum DP-SGD steps under a fixed $(\varepsilon, \delta)$ budget |
|---|---|
| Classical accountant | 4200 |
| Refined bound (Asoodeh et al., 2020) | 4305 |
For MNIST (60k examples) and CIFAR-10 (50k examples), experiments confirm that the moments accountant allows for high test accuracy even at small $\varepsilon$, with hyperparameters such as batch size and noise level influencing the privacy–utility tradeoff (Abadi et al., 2016).
7. Algorithmic and Numerical Considerations
Efficient implementation of the moments accountant, especially in the context of diverse or subsampled mechanisms, involves:
- Maintaining CGFs in closed or symbolic form, or as an oracle interface.
- Performing optimizations to extract privacy guarantees via (quasi-)convex minimization over $\lambda$ (or $\alpha$) using bisection, exploiting the convexity and monotonicity properties of the CGFs.
- Implementing log-domain arithmetic to prevent overflow or loss of precision.
- Where possible, leveraging geometric tail truncation and log-sum-exp approximations for scalable evaluation with large $\lambda$ grids or step counts (a minimal sketch follows this list).
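For instance, series like the subsampling amplification sum are best accumulated in the log domain with early truncation. A minimal sketch, using the $e^{\gamma}$ power series as a stand-in for a real amplification series:

```python
import numpy as np
from math import lgamma
from scipy.special import logsumexp

def log_series_sum(log_term, max_terms=10_000, tol=1e-12):
    """Accumulate log(sum_j exp(log_term(j))) in the log domain, truncating the
    series once a new term is negligible relative to the running sum."""
    acc = -np.inf
    for j in range(max_terms):
        t = log_term(j)
        acc = logsumexp([acc, t])
        if t < acc + np.log(tol):  # geometric tail truncation
            break
    return acc

# Sanity check: sum_j gamma^j / j! = e^gamma, so the log-sum should equal gamma.
gamma = 0.01
print(log_series_sum(lambda j: j * np.log(gamma) - lgamma(j + 1)))  # ~0.01
```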
Pseudocode frameworks for both vanilla and analytical moments accountant workflows are detailed in (Abadi et al., 2016, Wang et al., 2018), supporting practical integration into private machine learning workflows.
References:
- "Deep Learning with Differential Privacy" (Abadi et al., 2016)
- "Subsampled Rényi Differential Privacy and Analytical Moments Accountant" (Wang et al., 2018)
- "A Better Bound Gives a Hundred Rounds: Enhanced Privacy Guarantees via -Divergences" (Asoodeh et al., 2020)