AdaSSP: Adaptive Sufficient-Statistics Perturbation
- AdaSSP is a differentially private linear regression technique that adaptively calibrates noise based on the input covariance matrix's curvature.
- It employs instance-specific ridge regularization and data-aware noise allocation to stabilize estimates in ill-conditioned or high-dimensional settings.
- Empirical evaluations demonstrate near-optimal performance and improved efficiency compared to traditional sufficient-statistics perturbation methods.
Adaptive Sufficient-Statistics Perturbation (AdaSSP) is an advanced method for differentially private (DP) linear regression that achieves near-optimal performance by calibrating privacy noise to data-dependent characteristics of the input, particularly the curvature of the sample covariance matrix. AdaSSP is characterized by its use of adaptive, instance-specific ridge regularization and data-aware noise allocation, which address the instability and suboptimality of prior sufficient-statistics perturbation (SSP) methods, especially in ill-conditioned or high-dimensional settings (Lev et al., 12 Jan 2026, Wang, 2018, Ferrando et al., 2024).
1. Problem Formulation and Context
The core problem addressed by AdaSSP is differentially private ordinary least squares (DP-OLS) regression. Given a design matrix $X \in \mathbb{R}^{n \times d}$ (rows $x_i \in \mathbb{R}^d$) and a response vector $y \in \mathbb{R}^n$ (entries $y_i$), with each $\|x_i\|_2 \le 1$ and $|y_i| \le 1$, and assuming $\mathrm{rank}(X) = d$ (full or near full rank), the task is to estimate

$$\theta^* = \arg\min_{\theta \in \mathbb{R}^d} \|X\theta - y\|_2^2$$

subject to $(\varepsilon, \delta)$-differential privacy. The privacy notion is defined under "zero-out" neighbors: datasets differing in a single record $(x_i, y_i)$ replaced by the zero vector (Lev et al., 12 Jan 2026).
Classical non-adaptive SSP releases $X^\top X$ and $X^\top y$ with data-independent Gaussian noise calibrated to their global sensitivities. However, if $X^\top X$ is nearly singular, the estimator's variance can be arbitrarily inflated. AdaSSP addresses these failures by privately estimating spectral properties of $X^\top X$ and tuning the regularization parameter $\lambda$ to the current data (Wang, 2018).
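The failure mode of non-adaptive SSP can be seen in a few lines of NumPy. The following is an illustrative sketch (toy data and an uncalibrated noise scale, both our own choices, not from the cited papers): with a nearly collinear design, the Gram matrix is ill-conditioned, and solving against its noisy release produces an unstable estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy design: the second feature is almost collinear with the
# first, so X^T X is nearly singular (names and scales are illustrative).
n, d = 1000, 2
x1 = rng.normal(size=n)
X = np.column_stack([x1, x1 + 1e-3 * rng.normal(size=n)])
X /= np.max(np.linalg.norm(X, axis=1))      # enforce ||x_i||_2 <= 1
y = np.clip(X @ np.array([1.0, -1.0]), -1.0, 1.0)

# Non-adaptive SSP: perturb X^T X and X^T y with a fixed noise scale.
sigma = 0.5                                  # illustrative, not calibrated
E = rng.normal(size=(d, d))
A = X.T @ X + sigma * (np.triu(E) + np.triu(E, 1).T)   # symmetric noise
b = X.T @ y + sigma * rng.normal(size=d)

# Solving with the noisy, near-singular Gram matrix yields an estimate
# whose error can be arbitrarily large relative to the true [1, -1].
theta_ssp = np.linalg.solve(A, b)
cond = np.linalg.cond(X.T @ X)
```

Here `cond` is several orders of magnitude above 1, which is exactly the regime where AdaSSP's adaptive ridge term stabilizes the inversion.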
2. AdaSSP Algorithmic Structure
AdaSSP instantiates the following procedure (Lev et al., 12 Jan 2026, Wang, 2018, Ferrando et al., 2024):
- Privacy budget splitting: The total $\varepsilon$ is divided equally ($\varepsilon/3$ per release, and likewise $\delta/3$ for $\delta$) among:
- A private estimate of the minimum eigenvalue $\lambda_{\min}(X^\top X)$.
- Noisy release of the empirical covariance $X^\top X$.
- Noisy release of the cross-moment $X^\top y$.
- Private $\lambda_{\min}$ estimation: Adds calibrated Gaussian noise to $\lambda_{\min}(X^\top X)$ (scale $\sqrt{\ln(6/\delta)}/(\varepsilon/3)$), truncates to nonnegativity with a debiasing shift of $\ln(6/\delta)/(\varepsilon/3)$, yielding $\tilde{\lambda}_{\min}$.
- Adaptive ridge selection: Sets the regularization parameter to
$$\lambda = \max\left\{0,\ \frac{\sqrt{d\,\ln(6/\delta)\,\ln(2d^2/\rho)}}{\varepsilon/3} - \tilde{\lambda}_{\min}\right\},$$
with a confidence parameter $\rho$ (e.g., $0.05$), balancing stabilization and minimax optimality.
- Noisy sufficient statistics release: Adds symmetric Gaussian noise to $X^\top X$ (scale $\sqrt{\ln(6/\delta)}/(\varepsilon/3)$) and Gaussian noise to $X^\top y$ (same scale).
- Private estimator computation: Outputs
$$\hat{\theta} = \left(\widehat{X^\top X} + \lambda I\right)^{-1} \widehat{X^\top y}.$$
All downstream computations are pure post-processing, incurring no additional privacy cost (Lev et al., 12 Jan 2026, Ferrando et al., 2024).
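The steps above can be sketched in a few lines of NumPy. This is a minimal illustration following the noise scales in Wang (2018), assuming $\|x_i\|_2 \le 1$ and $|y_i| \le 1$; the function and argument names are our own, and a production implementation would need more careful numerics and privacy auditing.

```python
import numpy as np

def adassp(X, y, eps, delta, rho=0.05, rng=None):
    """Sketch of AdaSSP following Wang (2018); assumes ||x_i||_2 <= 1 and
    |y_i| <= 1, so all three sensitivities are at most 1."""
    rng = np.random.default_rng() if rng is None else rng
    d = X.shape[1]
    eps3 = eps / 3.0                        # equal budget split
    logd = np.log(6.0 / delta)
    scale = np.sqrt(logd) / eps3            # per-release Gaussian scale

    # 1) Private, debiased, truncated estimate of lambda_min(X^T X).
    lam_min = np.linalg.eigvalsh(X.T @ X)[0]
    lam_tilde = max(lam_min + scale * rng.normal() - logd / eps3, 0.0)

    # 2) Adaptive ridge parameter (rho is the confidence level).
    lam = max(0.0,
              np.sqrt(d * logd * np.log(2 * d**2 / rho)) / eps3 - lam_tilde)

    # 3) Noisy sufficient statistics (symmetric noise for X^T X).
    E = rng.normal(size=(d, d))
    XtX = X.T @ X + scale * (np.triu(E) + np.triu(E, 1).T)
    Xty = X.T @ y + scale * rng.normal(size=d)

    # 4) Post-processing: ridge solve with the private statistics.
    return np.linalg.solve(XtX + lam * np.eye(d), Xty)
```

Because step 4 touches only already-privatized quantities, any further use of the returned coefficients (prediction, model selection) is free of additional privacy cost.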
3. Sensitivity Analysis and Noise Calibration
AdaSSP's privacy mechanisms use the Gaussian mechanism, requiring precise sensitivity estimates:
- Sensitivity of $X^\top X$: $\Delta_{X^\top X} = \sup_{x} \|x x^\top\|_F = \|x\|_2^2 \le 1$ (Frobenius norm).
- Sensitivity of $X^\top y$: $\Delta_{X^\top y} = \sup_{x, y} \|x\,y\|_2 \le 1$.
- Sensitivity for the smallest eigenvalue: $\Delta_{\lambda_{\min}} \le \|x x^\top\|_2 \le 1$ (from Weyl's inequality).

Noise scales are then derived as $\sigma = \Delta \cdot \sqrt{\ln(6/\delta)}/(\varepsilon/3)$, applied separately for each release (Ferrando et al., 2024, Wang, 2018).
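As a worked example of the calibration above, the helper below computes the per-release noise standard deviation under an equal three-way budget split; the function name and `splits` parameter are our own, but the formula matches the $\sqrt{\ln(6/\delta)}/(\varepsilon/3)$ scale used in Wang (2018).

```python
import numpy as np

def gaussian_scale(sensitivity, eps, delta, splits=3):
    """Noise std for one of `splits` equal-budget Gaussian releases,
    using sigma = sensitivity * sqrt(ln(6/delta)) / (eps/splits)."""
    return sensitivity * np.sqrt(np.log(6.0 / delta)) / (eps / splits)

# With ||x_i||_2 <= 1 and |y_i| <= 1, all three sensitivities equal 1,
# so every release uses the same scale.
sigma = gaussian_scale(1.0, eps=1.0, delta=1e-6)
```

At $\varepsilon = 1$, $\delta = 10^{-6}$ this gives a scale of roughly $12$ per release, which is why a data-aware ridge term is needed to keep the inversion stable at small budgets.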
4. Differential Privacy and Composition
AdaSSP's privacy guarantee follows from composition of Gaussian mechanisms:
- Each statistic release is $(\varepsilon/3, \delta/3)$-DP.
- By simple (or advanced) composition, the aggregate procedure is $(\varepsilon, \delta)$-DP with $\varepsilon = \sum_i \varepsilon_i$, $\delta = \sum_i \delta_i$ (Lev et al., 12 Jan 2026).
- All subsequent steps (ridge selection, inversion) are post-processing.
Empirical best practices apply analytic calibration to minimize noise for each mechanism (Lev et al., 12 Jan 2026).
5. Utility Analysis and Optimality
AdaSSP achieves rates that match data-dependent minimax lower bounds for excess risk:
- For empirical risk $F(\theta) = \|X\theta - y\|_2^2$, with probability at least $1-\rho$,
$$F(\hat{\theta}) - F(\theta^*) = O\!\left(\frac{\sqrt{d\,\ln(6/\delta)\,\ln(2d^2/\rho)}\;\bigl(1 + \|\theta^*\|_2^2\bigr)}{\varepsilon}\right)$$
for poorly-conditioned design, or
$$F(\hat{\theta}) - F(\theta^*) = O\!\left(\frac{d\,\ln(6/\delta)\,\ln(2d^2/\rho)\;\bigl(1 + \|\theta^*\|_2^2\bigr)}{\lambda_{\min}(X^\top X)\,\varepsilon^2}\right)$$
otherwise, with $\lambda_{\min}(X^\top X)$ the smallest eigenvalue of the Gram matrix.
Under standard Gaussian linear models, estimation error achieves asymptotic efficiency (matching the Cramér–Rao lower bound up to a $1 + o(1)$ factor) (Wang, 2018). AdaSSP adapts to the regime (Lipschitz or strongly convex) and recovers optimal rates in both.
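An informal way to see where the adaptive threshold comes from is the usual bias-variance balance (a sketch under the assumptions of this section, with constants suppressed): the ridge term contributes a bias of roughly $\lambda \|\theta^*\|_2^2$ to the excess risk, while the Gaussian noise in the released statistics contributes roughly $d\,\ln(6/\delta) / \bigl((\varepsilon/3)^2 (\lambda + \lambda_{\min})\bigr)$. Equating the two when $\lambda_{\min} \approx 0$ gives

```latex
\lambda \,\|\theta^*\|_2^2
\;\approx\;
\frac{d\,\ln(6/\delta)}{(\varepsilon/3)^2\,\bigl(\lambda + \lambda_{\min}\bigr)}
\quad\Longrightarrow\quad
\lambda \;\asymp\; \frac{\sqrt{d\,\ln(6/\delta)}}{\varepsilon}
\qquad (\lambda_{\min} \approx 0),
```

which matches the threshold in AdaSSP's ridge rule up to the $\ln(2d^2/\rho)$ confidence factor; when $\tilde{\lambda}_{\min}$ already exceeds this threshold, no regularization is added and $\lambda = 0$.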
6. Practical Considerations and Extensions
- The only tuning parameter is the ridge weight $\lambda$, which may be set via internal criteria to ensure invertibility or by cross-validation on the privatized statistics.
- The internal confidence level $\rho$ governs a graceful tradeoff between utility and the risk of under-regularization.
- Standardization of features (so that $\|x_i\|_2 \le 1$) is commonly performed; empirical studies fix $\delta$ and $\rho$ to small constants (Wang, 2018).
- AdaSSP requires only the norm bounds ($\|x_i\|_2 \le 1$, $|y_i| \le 1$) and does not need user-side hyperparameter tuning.
- Approximate sufficient-statistics perturbation extends to logistic regression via Chebyshev approximation of the log-likelihood, where noisy $X^\top X$ and $X^\top y$ suffice for quadratic approximations (Ferrando et al., 2024).
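The logistic extension can be illustrated with a quadratic surrogate for the logistic loss. The sketch below uses a least-squares polynomial fit in place of a true Chebyshev expansion (for a degree-2 fit on a fixed interval the two yield similar surrogates); the clipping range $R = 2$ and all variable names are our own illustrative choices, not values from Ferrando et al. (2024).

```python
import numpy as np

# Fit an illustrative degree-2 polynomial to the logistic loss
# log(1 + exp(-t)) on [-R, R]. R = 2 is an assumed margin range.
R = 2.0
ts = np.linspace(-R, R, 201)
c2, c1, c0 = np.polyfit(ts, np.log1p(np.exp(-ts)), 2)

# With labels y_i in {-1, +1} (so y_i^2 = 1), the total surrogate loss is
#   sum_i c2 (x_i^T w)^2 + c1 y_i (x_i^T w) + c0
# = c2 * w^T (X^T X) w + c1 * w^T (X^T y) + c0 * n,
# so the noisy statistics X^T X and X^T y released by SSP suffice.
rng = np.random.default_rng(0)
n, d = 200, 3
X = rng.normal(size=(n, d)) / np.sqrt(d)
y = rng.choice([-1.0, 1.0], size=n)

w = rng.normal(size=d)
exact = np.log1p(np.exp(-(y * (X @ w)))).sum()
surrogate = c2 * w @ (X.T @ X) @ w + c1 * w @ (X.T @ y) + c0 * n
```

The key point is that `surrogate` depends on the data only through $X^\top X$ and $X^\top y$, so the same noisy-release machinery as in linear regression applies unchanged.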
7. Comparative Evaluation and Empirical Performance
Extensive experiments on 36 benchmark UCI data sets demonstrate that AdaSSP uniformly improves over non-adaptive SSP and objective perturbation (ObjPert) at small and moderate privacy budgets, and matches or exceeds output perturbation sampling (OPS) and noisy SGD in nearly all cases. In Gaussian-regime simulations, AdaSSP achieves near-unit relative efficiency at moderate $\varepsilon$ ("privacy for free") (Wang, 2018).
AdaSSP's performance degrades gracefully with increasing dimension $d$, decreasing privacy budget $\varepsilon$, or ill-conditioning. It remains the leading baseline among sufficient-statistics perturbation approaches unless the design is extremely well-conditioned or the residual error is negligible (Lev et al., 12 Jan 2026).
References:
- "Near-Optimal Private Linear Regression via Iterative Hessian Mixing" (Lev et al., 12 Jan 2026)
- "Revisiting differentially private linear regression: optimal and adaptive prediction & estimation in unbounded domain" (Wang, 2018)
- "Private Regression via Data-Dependent Sufficient Statistic Perturbation" (Ferrando et al., 2024)