Analytical Moments Accountant (AMA)
- AMA is a quantitative framework for precise privacy accounting in differentially private machine learning, leveraging moment generating functions and Rényi Differential Privacy.
- It employs moment-based composition and subsampling amplification to compute tight, data-independent upper bounds on cumulative privacy loss.
- Its efficient algorithmic implementation facilitates fast numerical evaluations and reduced privacy loss in practical DP-SGD systems.
The Analytical Moments Accountant (AMA) is a quantitative framework for precisely tracking the privacy loss in differentially private machine learning algorithms, particularly those based on stochastic gradient descent (SGD) with randomized mechanisms such as the Gaussian mechanism and incorporating dataset subsampling. AMA generalizes and extends the original Moments Accountant approach of Abadi et al. (2016) for the Gaussian mechanism to a broad class of mechanisms admitting Rényi Differential Privacy (RDP) guarantees, explicitly accounting for subsampling amplification effects. It enables practitioners to compute tight, data-independent upper bounds on cumulative privacy loss, translating these bounds into precise (ε, δ)-differential privacy guarantees after arbitrary composition (Abadi et al., 2016, Wang et al., 2018).
1. Theoretical Foundations
AMA begins by formalizing the privacy-loss random variable for a randomized mechanism M acting on two neighboring datasets d, d′: for an output o,

c(o; M, d, d′) = log( Pr[M(d) = o] / Pr[M(d′) = o] ).

The key quantitative tool is the log moment generating function (LMGF) or cumulant generating function (CGF) of the privacy-loss random variable,

K_M(λ) = sup_{aux, d, d′} log E_{o ∼ M(aux, d)} [ exp( λ · c(o; M, aux, d, d′) ) ],

where the supremum is taken over all auxiliary randomness and all adjacent dataset pairs. Within the RDP framework, the order-α Rényi divergence between output distributions is tightly linked to the CGF:

ε_M(α) = K_M(α − 1) / (α − 1),  or equivalently  K_M(λ) = λ · ε_M(λ + 1).

These moment (or cumulant) representations allow precise analysis under composition and subsampling, producing sharp upper bounds on privacy loss.
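The CGF–RDP identity K(λ) = λ·ε(λ+1) makes the CGF of any RDP mechanism trivial to evaluate. As a minimal sketch, for the sensitivity-1 Gaussian mechanism (whose RDP curve ε(α) = α/(2σ²) is standard), the function names here are illustrative:

```python
# Sketch: CGF of the sensitivity-1 Gaussian mechanism via its RDP curve.
# The identity K(λ) = λ·ε(λ+1) links the CGF to the Rényi divergence of order α = λ+1.

def gaussian_rdp(alpha: float, sigma: float) -> float:
    """RDP of the Gaussian mechanism: ε(α) = α / (2σ²) for a sensitivity-1 query."""
    return alpha / (2.0 * sigma ** 2)

def gaussian_cgf(lam: float, sigma: float) -> float:
    """CGF of the privacy-loss variable: K(λ) = λ·ε(λ+1) = λ(λ+1)/(2σ²)."""
    return lam * gaussian_rdp(lam + 1.0, sigma)

print(gaussian_cgf(4.0, 2.0))  # 4·5 / (2·4) = 2.5
```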
2. Moment-Based Composition and Subsampling
The composability of CGFs enables AMA to handle the adaptive composition of mechanisms: for a sequence M₁, …, M_k,

K_{M₁ ∘ ⋯ ∘ M_k}(λ) ≤ Σ_{i=1}^{k} K_{M_i}(λ).

This moment-based analysis is valid even for adaptive composition (where each step may depend on previous outputs). Subsampling—a probabilistic selection of a data subset before mechanism application—provides privacy amplification. For a mechanism M with RDP guarantee (α, ε(α)), when subsampling a fraction γ without replacement, the RDP of the composed mechanism M ∘ subsample satisfies, for integer α ≥ 2,

ε′(α) ≤ (1/(α−1)) · log( 1 + γ² (α choose 2) min{ 4(e^{ε(2)} − 1), e^{ε(2)} min{2, (e^{ε(∞)} − 1)²} } + Σ_{j=3}^{α} γ^j (α choose j) e^{(j−1)ε(j)} min{2, (e^{ε(∞)} − 1)^j} ),

with analogous bounds for Poisson subsampling and further generalizations for arbitrary mechanisms (Wang et al., 2018). This result provides the main amplification theorem, translating per-step RDP guarantees to overall amplified privacy protection under repeated mechanisms with subsampling.
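Composition and amplification can be sketched in a few lines. The snippet below sums CGFs across 100 rounds of a Gaussian mechanism and evaluates only the leading γ² term of the Wang et al. (2018) amplification bound; dropping the higher-order terms makes the amplification value an illustration, not a certified upper bound. All names and parameter values here are illustrative:

```python
import math

def compose_cgfs(cgfs, lam):
    """Adaptive composition adds CGFs: K_total(λ) = Σ_i K_i(λ)."""
    return sum(K(lam) for K in cgfs)

def gaussian_cgf(lam, sigma):
    """CGF of the sensitivity-1 Gaussian mechanism: λ(λ+1)/(2σ²)."""
    return lam * (lam + 1.0) / (2.0 * sigma ** 2)

def amplified_rdp_leading(alpha, gamma, eps2):
    """Leading γ² term of the subsampled-RDP bound:
    ε'(α) ≈ (1/(α−1))·log(1 + γ²·C(α,2)·4(e^{ε(2)} − 1));
    the full bound adds γ^j terms for j ≥ 3 (Wang et al., 2018)."""
    return math.log1p(gamma**2 * math.comb(alpha, 2)
                      * 4.0 * (math.exp(eps2) - 1.0)) / (alpha - 1)

sigma = 2.0
# 100 adaptive rounds of the same Gaussian mechanism:
total = compose_cgfs([lambda l: gaussian_cgf(l, sigma)] * 100, 4.0)
# Amplification at order α = 8, sampling fraction γ = 1%; ε(2) = K(1) here.
amp = amplified_rdp_leading(8, 0.01, gaussian_cgf(1.0, sigma))
```

With γ = 0.01, the amplified order-8 RDP is orders of magnitude below the unamplified value ε(8) = 8/(2σ²) = 1.0, illustrating why subsampling is so valuable in DP-SGD.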
3. Conversion to (ε, δ)-Differential Privacy
The translation from moment/CGF-based guarantees to (ε, δ)-differential privacy is achieved through Markov's inequality and standard dual forms:
- Given target δ, the smallest achievable ε satisfies ε(δ) = min_{λ > 0} (K(λ) + log(1/δ)) / λ.
- Given target ε, the minimal achievable δ is δ(ε) = min_{λ > 0} exp(K(λ) − λ·ε).
The single-variable convex optimization over λ (or equivalently over the RDP order α = λ + 1) can be efficiently solved using bisection; convexity and monotonicity properties guarantee correctness and fast convergence. In practical settings, λ is discretized on a computational grid with optional piecewise-linear interpolation to cover fractional orders (Abadi et al., 2016, Wang et al., 2018).
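As a minimal sketch of this conversion, the snippet below minimizes the convex objective f(λ) = (K(λ) + log(1/δ))/λ by ternary search, which is valid because of the convexity noted above; the search bounds, iteration count, and the composed-Gaussian example are illustrative choices, not prescribed by the sources:

```python
import math

def epsilon_from_cgf(K, delta, lo=1e-3, hi=1e4, iters=100):
    """Smallest ε for target δ: minimize f(λ) = (K(λ) + log(1/δ))/λ over λ > 0.
    Ternary search converges because f is convex in λ."""
    f = lambda lam: (K(lam) + math.log(1.0 / delta)) / lam
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return f((lo + hi) / 2.0)

# Illustrative: 100 compositions of a sensitivity-1 Gaussian mechanism, σ = 6.
K = lambda lam: 100 * lam * (lam + 1.0) / (2.0 * 6.0**2)
eps = epsilon_from_cgf(K, delta=1e-5)
```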
4. Algorithmic Implementation and Data Structures
AMA is implemented as a data structure maintaining a dictionary of mechanisms, each paired with its count of applications and CGF. Each time a mechanism (with or without subsampling) is applied, the CGF is computed (or cached) and incremented. The pseudocode is:
DataStructure AMA:
    initialize empty dictionary D

    function add_mechanism(M, count=1):
        if D contains M:
            D[M].count += count
        else:
            D[M] := (count, K_M)

    function add_subsampled_mechanism(M, gamma, count=1):
        compute ε_sub(α) for M∘subsample via amplification theorem
        define K_sub(λ) := λ·ε_sub(λ+1)
        add_mechanism(M_sub, count)   # anonymous mechanism with CGF K_sub

    function total_CGF(λ):
        return sum(count · K(λ) for (count, K) in D.values())

    function get_epsilon(δ):
        return bisection_solve(λ ↦ (total_CGF(λ) + log(1/δ)) / λ)

    function get_delta(ε):
        return bisection_solve(λ ↦ exp(total_CGF(λ) − λ·ε))
Each mechanism increment is constant time; each CGF evaluation is linear in the number of distinct mechanisms.
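The pseudocode above can be fleshed out into a small working Python class. This is a minimal sketch, not a production accountant: it optimizes λ over an integer grid instead of running bisection, evaluates δ in log space to avoid overflow, and the mechanism name and Gaussian CGF in the demo are illustrative assumptions:

```python
import math

class AnalyticalMomentsAccountant:
    """Minimal AMA sketch: a dictionary of (count, CGF) pairs per mechanism."""

    def __init__(self, lambda_grid=range(1, 65)):
        self.mechanisms = {}               # name -> [count, cgf]
        self.lambda_grid = list(lambda_grid)

    def add_mechanism(self, name, cgf, count=1):
        """Register `count` applications of a mechanism with CGF K(λ)."""
        if name in self.mechanisms:
            self.mechanisms[name][0] += count
        else:
            self.mechanisms[name] = [count, cgf]

    def total_cgf(self, lam):
        """K_total(λ) = Σ count_i · K_i(λ) over all tracked mechanisms."""
        return sum(c * K(lam) for c, K in self.mechanisms.values())

    def get_epsilon(self, delta):
        """ε(δ) = min_λ (K_total(λ) + log(1/δ)) / λ over the λ grid."""
        return min((self.total_cgf(l) + math.log(1.0 / delta)) / l
                   for l in self.lambda_grid)

    def get_delta(self, epsilon):
        """δ(ε) = min_λ exp(K_total(λ) − λ·ε), minimized in log space
        so large CGF values cannot overflow exp()."""
        log_delta = min(self.total_cgf(l) - l * epsilon
                        for l in self.lambda_grid)
        return math.exp(min(log_delta, 0.0))

# Demo: 100 rounds of a sensitivity-1 Gaussian mechanism with σ = 6.
ama = AnalyticalMomentsAccountant()
ama.add_mechanism("gauss", lambda l: l * (l + 1.0) / (2.0 * 6.0**2), count=100)
eps = ama.get_epsilon(1e-5)
```

Each `add_mechanism` call is O(1), and each `total_cgf` evaluation is linear in the number of distinct mechanisms, matching the complexity stated above.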
5. Closed-Form and Numerical Evaluation
For certain mechanisms, such as the Poisson-subsampled Gaussian with noise multiplier σ and sampling rate q, AMA enables closed-form or numerically tractable computation of log-moments:

α(λ) = log max(E₁, E₂),  E₁ = E_{z∼μ₀}[(μ₀(z)/μ(z))^λ],  E₂ = E_{z∼μ}[(μ(z)/μ₀(z))^λ],

with μ₀ = N(0, σ²), μ₁ = N(1, σ²), and the mixture μ = (1 − q)μ₀ + qμ₁ encoding the inclusion probability. For small sampling rates and large noise, an asymptotic bound applies:

α(λ) ≤ q²λ(λ + 1) / ((1 − q)σ²) + O(q³λ³/σ³).

Numerical integration may be required for intermediate parameter regimes. In practical systems, precomputed tables and efficient log-sum-exp evaluations are used to ensure stability and tractability (Abadi et al., 2016).
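The log-moment of the subsampled Gaussian is straightforward to evaluate numerically. The sketch below approximates E₁ and E₂ by trapezoidal integration over the Gaussian densities and checks the result against the asymptotic bound q²λ(λ+1)/((1−q)σ²); the integration range, grid size, and parameter values are illustrative choices:

```python
import math

def log_moment_subsampled_gaussian(lam, q, sigma, z_max=40.0, n=40001):
    """α(λ) = log max(E1, E2) with μ0 = N(0, σ²), μ1 = N(1, σ²),
    μ = (1−q)·μ0 + q·μ1, estimated by trapezoidal integration on [−z_max, z_max]."""
    h = 2.0 * z_max / (n - 1)
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    e1 = e2 = 0.0
    for i in range(n):
        z = -z_max + i * h
        mu0 = norm * math.exp(-z * z / (2.0 * sigma**2))
        mu1 = norm * math.exp(-(z - 1.0)**2 / (2.0 * sigma**2))
        mu = (1.0 - q) * mu0 + q * mu1
        w = 0.5 if i in (0, n - 1) else 1.0       # trapezoid endpoint weights
        e1 += w * h * mu0 * (mu0 / mu)**lam       # E_{z~μ0}[(μ0/μ)^λ]
        e2 += w * h * mu * (mu / mu0)**lam        # E_{z~μ}[(μ/μ0)^λ]
    return math.log(max(e1, e2))

alpha = log_moment_subsampled_gaussian(lam=8, q=0.01, sigma=4.0)
bound = 0.01**2 * 8 * 9 / ((1.0 - 0.01) * 4.0**2)  # asymptotic leading term
```

In this regime (q = 0.01, σ = 4) the numerically computed α(λ) lands comfortably below the asymptotic bound, consistent with the bound being loose by roughly a factor of two at small q.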
6. Practical Performance, Tradeoffs, and Recommendations
Computational cost for maintaining the AMA is modest: per-step overhead in standard DP-SGD implementations (e.g., TensorFlow) is typically 10–30% compared to standard SGD. Best practice calls for:
- Discretizing λ on a grid (e.g., orders 1 to 32 in Abadi et al. (2016), up to several hundred or thousand in Wang et al. (2018)), with interpolation as required.
- Numerical stability via log-sum-exp and careful floating-point operations, especially for large CGFs.
- Storing a dictionary of distinct mechanisms and their counts for efficient moment accounting.
- Fixing a bisection tolerance for invertible privacy cost queries.
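The log-sum-exp stabilization recommended above can be sketched in a few lines: factoring out the maximum term lets log Σᵢ exp(xᵢ) be evaluated even when the individual exponents would overflow a float (the sample values are illustrative):

```python
import math

def logsumexp(xs):
    """Stable log Σ_i exp(x_i): shift by the maximum so every exp() argument
    is ≤ 0, then add the shift back after the log."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

# Naive evaluation of exp(1000.0) overflows a double; the stable form does not.
vals = [1000.0, 1000.5, 999.0]
print(logsumexp(vals))
```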
Empirical studies illustrate that AMA yields strictly tighter privacy accounting relative to classical composition (naive addition or strong composition (Kairouz–Oh–Viswanath)), especially at high composition counts and in the high-privacy regime (small ε). This results in order-of-magnitude reductions in overall privacy loss for commonly used mechanisms, as demonstrated for subsampled Gaussian, Laplace, and randomized response mechanisms (Wang et al., 2018).
7. Significance and Impact
AMA provides a rigorous, general, and data-independent tool for privacy accounting in large-scale private machine learning. It is compatible with arbitrary RDP mechanisms and subsampling schemes, enabling precise per-instance privacy cost conversion and optimal parameter selection. AMA represents a major advance over advanced composition approaches, achieving tighter privacy loss and directly supporting the design and auditing of deep learning systems for privacy preservation (Abadi et al., 2016, Wang et al., 2018).