
Moments Accountant in Differential Privacy

Updated 24 December 2025
  • Moments accountant is a framework that rigorously tracks cumulative privacy loss in adaptive differential privacy mechanisms using log-moment generating functions.
  • It underpins DP-SGD by enabling tighter privacy accounting than classical composition, leading to improved privacy-utility tradeoffs in machine learning.
  • It integrates analytical and numerical optimizations, with refinements via Rényi differential privacy and f-divergences for practical, scalable implementations.

A moments accountant is a framework for tightly tracking and composing the cumulative privacy loss in mechanisms satisfying differential privacy (DP), specifically designed for the analysis of adaptive compositions in differentially private machine learning. It allows for precise quantification of (ε, δ)-DP guarantees by maintaining the log-moment generating function (MGF) of the privacy-loss random variable, rather than relying on classical, looser composition theorems. Moments accountant techniques are now foundational in the practice of differentially private stochastic gradient descent (DP-SGD) and are continually refined through advances in Rényi differential privacy (RDP) and information-theoretic optimization over f-divergences.

1. Formal Definition and Theoretical Foundations

Let M be a randomized mechanism with input d ∈ D^n and output in some range R. For adjacent datasets d, d', and any auxiliary input, define the privacy-loss random variable at outcome o ∈ R:

c(o; M, d, d') = \ln \left( \frac{\Pr[M(d) = o]}{\Pr[M(d') = o]} \right)

The λ-th log-moment of the privacy-loss random variable is:

\alpha_M(\lambda; d, d') = \ln \mathbb{E}_{o \sim M(d)} \left[ \exp \left( \lambda \, c(o; M, d, d') \right) \right]

The moments accountant records the worst-case log-moment across all auxiliary inputs and neighboring datasets:

\alpha_M(\lambda) = \max_{d \sim d'} \; \alpha_M(\lambda; d, d')

For adaptive compositions M = M_1 ∘ M_2 ∘ ⋯ ∘ M_k, the log-moments compose additively:

\alpha_M(\lambda) \le \sum_{i=1}^{k} \alpha_{M_i}(\lambda)

The key mechanism for extracting (ε, δ)-DP guarantees is the tail bound:

\delta = \inf_{\lambda > 0} \exp\left[ \alpha_M(\lambda) - \lambda \varepsilon \right]

This approach subsumes and improves on the basic and advanced composition theorems, yielding substantially tighter privacy accounting for a given sequence of private operations (Abadi et al., 2016).
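As a concrete illustration of the compose-then-convert recipe above, the Gaussian mechanism with ℓ2-sensitivity 1 and noise scale σ has the closed-form log-moment α(λ) = λ(λ+1)/(2σ²). A minimal sketch (parameter values are chosen purely for illustration):

```python
import math

def gaussian_log_moment(lmbda, sigma):
    # Closed-form λ-th log-moment of the privacy-loss random variable
    # for the Gaussian mechanism with L2-sensitivity 1 and noise scale σ:
    #   α(λ) = λ(λ+1) / (2σ²)
    return lmbda * (lmbda + 1) / (2 * sigma ** 2)

def delta_from_moments(sigma, k, eps, max_lmbda=64):
    # Compose k adaptive releases (log-moments add), then apply the tail
    # bound δ = min_λ exp(α_M(λ) - λ·ε); minimize the exponent first and
    # exponentiate once so large moments cannot overflow.
    best = min(
        k * gaussian_log_moment(l, sigma) - l * eps
        for l in range(1, max_lmbda + 1)
    )
    return min(1.0, math.exp(best))

# δ for 10 adaptive Gaussian releases at σ = 4 and target ε = 4
delta = delta_from_moments(sigma=4.0, k=10, eps=4.0)
```

Restricting λ to a small integer grid is the standard practical compromise; the resulting δ is still a valid (slightly conservative) upper bound.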

2. Relation to Rényi Differential Privacy and f-divergences

The moments accountant framework generalizes naturally to Rényi Differential Privacy (RDP), defined as follows: a mechanism M satisfies (α, ε)-RDP for α > 1 if for adjacent datasets S, S':

D_{\alpha}(M(S) \parallel M(S')) \leq \epsilon

where

D_{\alpha}(P \parallel Q) = \frac{1}{\alpha-1} \log \mathbb{E}_{x \sim Q} \left[ \left( \frac{P(x)}{Q(x)} \right)^{\alpha} \right]

Let K(λ) denote the cumulant generating function (CGF) of the privacy-loss random variable L:

K(\lambda) = \log \mathbb{E}\left[ e^{\lambda L} \right]

There is a direct equivalence:

M \text{ is } (\alpha, \epsilon)\text{-RDP} \iff K(\alpha-1) \leq (\alpha-1)\,\epsilon

RDP facilitates straightforward composition (CGFs add), and the translation back to (ε, δ)-DP is performed via bisection/minimization over λ as above (Wang et al., 2018, Abadi et al., 2016).
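The compose-then-convert pipeline can be sketched in a few lines; the Gaussian per-step curve ε_RDP(α) = α/(2σ²) (the standard RDP bound for the Gaussian mechanism) and all parameter values are assumptions for illustration:

```python
import math

def rdp_to_dp(rdp_eps, eps, alphas):
    # Classical conversion of a composed RDP curve to (ε, δ)-DP:
    #   δ = min_α exp((α - 1)·(ε_RDP(α) - ε)),
    # i.e. the tail bound δ = inf_λ exp(K(λ) - λ·ε) with λ = α - 1.
    # Minimize the exponent first, exponentiate once, to avoid overflow.
    best = min((a - 1) * (rdp_eps(a) - eps) for a in alphas)
    return min(1.0, math.exp(best))

# T adaptive Gaussian releases; RDP composes additively, so the
# composed curve is T·α/(2σ²).
T, sigma = 100, 10.0
alphas = [1 + 0.1 * i for i in range(1, 1000)]  # grid over α ∈ (1, ~101)
delta = rdp_to_dp(lambda a: T * a / (2 * sigma ** 2), eps=4.0, alphas=alphas)
```

In practice the minimization over α is done by bisection on the (quasi-convex) exponent rather than a brute-force grid; the grid keeps the sketch short.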

Recent advances relate these accounting schemes to information-theoretic quantities, in particular Csiszár's f-divergences. For example, the hockey-stick (or E_λ) divergence and the χ^α-divergence play key roles:

  • (ε, δ)-DP is characterized by bounds on the hockey-stick divergence E_{e^ε}.
  • (α, γ)-RDP is characterized by bounds on D_α, or equivalently χ^α (Asoodeh et al., 2020).

3. Algorithmic Implementation and Analytical Moments Accountant (AMA)

The Analytical Moments Accountant (AMA) maintains and updates the CGFs for all contributing mechanisms, supporting arbitrary adaptive and subsampled compositions. The essential steps are:

  1. Track for each mechanism M_i a symbolic or oracle CGF K_i(λ).
  2. For data subsampling (e.g., SGD with minibatching), upper-bound the RDP parameter of the subsampled mechanism using the tight amplification theorem:

\epsilon_{M \circ \mathrm{sub}}(\alpha) \leq \frac{1}{\alpha - 1} \log\left( 1 + \gamma^2 \binom{\alpha}{2} m_2 + \sum_{j=3}^{\alpha} \gamma^j \binom{\alpha}{j} m_j \right)

where γ is the subsampling rate and the moments m_j depend on the per-mechanism RDP constants e^{ε(j)}.

  3. In AMA, composition is performed by adding CGFs:

K_{\text{total}}(\lambda) \leftarrow K_{\text{total}}(\lambda) + K_i(\lambda)

  4. (ε, δ)-DP guarantees are recovered by minimizing over λ as described above.

Optimized implementations exploit log-domain operations (log-sum-exp, geometric tail truncation), ensure numerical convexity, and support O(log α) time per query (Wang et al., 2018).
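A minimal sketch of the AMA bookkeeping, with CGF oracles stored as callables and composed by addition; the Gaussian CGF stands in for a concrete mechanism, and all class/method names here are illustrative rather than any reference implementation:

```python
import math

class AnalyticalMomentsAccountant:
    """Toy analytical moments accountant: CGF oracles compose by addition."""

    def __init__(self):
        self.cgfs = []  # callables λ -> K_i(λ), one per composed mechanism

    def compose(self, cgf):
        # K_total(λ) <- K_total(λ) + K_i(λ), maintained lazily as a list
        self.cgfs.append(cgf)

    def total_cgf(self, lmbda):
        return sum(K(lmbda) for K in self.cgfs)

    def get_delta(self, eps, lambdas=range(1, 33)):
        # Tail bound δ = min_λ exp(K_total(λ) - λ·ε); keep the computation
        # in the exponent (log domain) and exponentiate once at the end.
        best = min(self.total_cgf(l) - l * eps for l in lambdas)
        return min(1.0, math.exp(best))

# 50 adaptive Gaussian releases with sensitivity 1 and σ = 8 (illustrative)
acct = AnalyticalMomentsAccountant()
for _ in range(50):
    acct.compose(lambda l: l * (l + 1) / (2 * 8.0 ** 2))

delta = acct.get_delta(eps=4.0)
```

Production implementations replace the grid in `get_delta` with bisection on the convex exponent and memoize CGF evaluations; the structure is otherwise the same.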

4. Practical Applications in Differentially Private Learning

Moments accountant techniques, and their generalizations (including AMA), are central to the privacy analysis of private stochastic gradient descent (DP-SGD):

  • At each iteration, the per-step privacy cost is quantified as a log-moment (in practice, evaluated on a precomputed grid of λ values).
  • After T iterations, the total log-moment is α_T(λ) = Σ_{t=1}^{T} α_step(λ), with privacy guarantees extracted by minimizing exp(α_T(λ) − λε) over λ > 0.
  • Empirically, moments accountant bounds permit an order-of-magnitude more training steps or sharper privacy–utility tradeoffs than classical composition. For example, with sampling ratio q = 0.01, noise scale σ = 4, and k = 10,000 steps, the moments accountant yields ε ≈ 1.3, compared to ε ≈ 9.3 under strong composition (Abadi et al., 2016).
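The flavor of this computation can be reproduced with just the dominant term of Abadi et al.'s per-step moment bound for the subsampled Gaussian, α_step(λ) ≤ q²λ(λ+1)/((1−q)σ²). This is a deliberately crude sketch: dropping the higher-order terms and the exact numerical moment computation makes the resulting ε ≈ 1.8 looser than the ε ≈ 1.3 the full accountant reports for the same setting:

```python
import math

def dpsgd_eps(q, sigma, steps, delta, max_lmbda=32):
    # Dominant-term bound on the per-step log-moment of the subsampled
    # Gaussian (Abadi et al., 2016): α_step(λ) <= q²·λ(λ+1)/((1-q)·σ²),
    # valid for small q and λ <= σ²·ln(1/(qσ)); higher-order terms dropped.
    best = float("inf")
    for lmbda in range(1, max_lmbda + 1):
        alpha_total = steps * q**2 * lmbda * (lmbda + 1) / ((1 - q) * sigma**2)
        # smallest ε for which exp(α_T(λ) - λ·ε) <= δ at this λ
        best = min(best, (alpha_total + math.log(1 / delta)) / lmbda)
    return best

eps = dpsgd_eps(q=0.01, sigma=4.0, steps=10_000, delta=1e-5)
```

Even this simplified bound lands far below the ε ≈ 9.3 of strong composition, which is the essential practical point.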

Comprehensive empirical studies on MNIST and CIFAR-10 validate these improvements for deep learning with non-convex objectives, showing competitive test accuracy under modest privacy budgets.

5. Recent Refinements: Optimal f-divergence Conversion and Tighter Bounds

Asoodeh et al. (Asoodeh et al., 2020) present an information-theoretically optimal extension of the moments accountant, leveraging the joint range of f-divergences for a tighter conversion from RDP to (ε, δ)-DP:

  • The minimal δ such that an (α, γ)-RDP mechanism is (ε, δ)-DP is

\delta^{*}_{\alpha,\varepsilon}(\gamma) = \sup\left\{ E_\lambda(P \| Q) \;:\; \chi^\alpha(P \| Q) \leq \chi(\gamma),\ \lambda = e^{\varepsilon} \right\}

  • The refined moments accountant bound replaces the classical loose Markov-based conversion with

\delta(\varepsilon) = \inf_{\alpha > 1} \delta^{*}_{\alpha, \varepsilon}\big(T\,\gamma(\alpha)\big)

where T is the number of SGD steps and γ(α) encodes the per-step RDP guarantee.

  • For a fixed privacy budget, the refined bounds allow up to 100 extra iterations of DP-SGD under standard settings, a utility gain obtained at no additional privacy cost.

The classical conversion is a one-sided Chernoff/Markov bound, whereas the refined analysis is information-theoretically tight, optimizing over the entire joint range of the relevant f-divergences.

6. Comparative Empirical Performance and Practical Impact

Empirical evaluation under fixed privacy budgets demonstrates that the moments accountant enables more accurate, longer training regimes:

| Method | Maximum SGD steps at ε = 1 |
|---|---|
| Classical accountant | 4200 |
| Refined bound (Asoodeh et al., 2020) | 4305 |

For MNIST (60k examples) and CIFAR-10 (50k examples), experiments confirm that the moments accountant allows high test accuracy even at small ε, with hyperparameters such as batch size and noise level σ influencing the privacy–utility tradeoff (Abadi et al., 2016).

7. Algorithmic and Numerical Considerations

Efficient implementation of the moments accountant, especially in the context of diverse or subsampled mechanisms, involves:

  • Maintaining CGFs in closed or symbolic form, or as an oracle interface.
  • Performing optimizations to extract privacy guarantees via (quasi-)convex minimization over λ/α using bisection, exploiting the convexity and monotonicity properties of the CGFs.
  • Implementing log-domain arithmetic to prevent overflow or loss of precision.
  • Where possible, leveraging geometric truncation and log-sum-exp approximations for scalable evaluation with large α or step counts.
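As one example, the log-sum-exp trick keeps moment evaluation finite even when individual exponents would overflow a double (a generic numerical sketch, not tied to any particular accountant library):

```python
import math

def log_sum_exp(xs):
    # Numerically stable log(Σ_i exp(x_i)): shift by the maximum so the
    # largest exponent is exactly 0, preventing overflow.
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

vals = [1000.0, 1001.0, 1002.0]
lse = log_sum_exp(vals)        # finite, ~1002.41

try:
    naive = math.log(sum(math.exp(x) for x in vals))
except OverflowError:
    naive = None               # math.exp(1000.0) overflows a double
```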

Pseudocode frameworks for both vanilla and analytical moments accountant workflows are detailed in (Abadi et al., 2016, Wang et al., 2018), supporting practical integration into private machine learning workflows.


References:

  • Abadi, M., Chu, A., Goodfellow, I., McMahan, H. B., Mironov, I., Talwar, K., and Zhang, L. (2016). Deep Learning with Differential Privacy. ACM CCS 2016.
  • Wang, Y.-X., Balle, B., and Kasiviswanathan, S. (2018). Subsampled Rényi Differential Privacy and Analytical Moments Accountant. AISTATS 2019.
  • Asoodeh, S., Liao, J., Calmon, F. P., Kosut, O., and Sankar, L. (2020). A Better Bound Gives a Hundred Rounds: Enhanced Privacy Guarantees via f-Divergences. IEEE ISIT 2020.
