AdaSSP: Adaptive Sufficient-Statistics Perturbation
- AdaSSP is a differentially private linear regression technique that adaptively calibrates noise based on the input covariance matrix's curvature.
- It employs instance-specific ridge regularization and data-aware noise allocation to stabilize estimates in ill-conditioned or high-dimensional settings.
- Empirical evaluations demonstrate near-optimal performance and improved efficiency compared to traditional sufficient-statistics perturbation methods.
Adaptive Sufficient-Statistics Perturbation (AdaSSP) is an advanced method for differentially private (DP) linear regression that achieves near-optimal performance by calibrating privacy noise to data-dependent characteristics of the input, particularly the curvature of the sample covariance matrix. AdaSSP is characterized by its use of adaptive, instance-specific ridge regularization and data-aware noise allocation, which address the instability and suboptimality of prior sufficient-statistics perturbation (SSP) methods, especially in ill-conditioned or high-dimensional settings (Lev et al., 12 Jan 2026, Wang, 2018, Ferrando et al., 2024).
1. Problem Formulation and Context
The core problem addressed by AdaSSP is differentially private ordinary least squares (DP-OLS) regression. Given a design matrix $X \in \mathbb{R}^{n \times d}$ (rows $x_i \in \mathbb{R}^d$) and a response vector $y \in \mathbb{R}^n$ (entries $y_i$), with each $\|x_i\|_2 \le 1$ and $|y_i| \le 1$, and assuming $\mathrm{rank}(X) = d$ (full or near full rank), the task is to estimate

$$\theta^* = \arg\min_{\theta \in \mathbb{R}^d} \|X\theta - y\|_2^2$$

subject to $(\varepsilon, \delta)$-differential privacy. The privacy notion is defined under "zero-out" neighbors: datasets differing in a single record $(x_i, y_i)$ replaced by the zero vector (Lev et al., 12 Jan 2026).
Classical non-adaptive SSP releases $X^\top X$ and $X^\top y$ with data-independent Gaussian noise calibrated to their global sensitivities. However, if $X^\top X$ is nearly singular, the estimator's variance can be arbitrarily inflated. AdaSSP addresses these failures by privately estimating spectral properties of $X^\top X$ and tuning the regularization parameter $\lambda$ to the current data (Wang, 2018).
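The failure mode of non-adaptive SSP can be seen in a few lines of NumPy. The following is an illustrative sketch (toy data and an uncalibrated noise scale, both our own choices, not from the cited papers): with a nearly collinear design, the Gram matrix is ill-conditioned, and solving against its noisy release produces an unstable estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy design: the second feature is almost collinear with the
# first, so X^T X is nearly singular (names and scales are illustrative).
n, d = 1000, 2
x1 = rng.normal(size=n)
X = np.column_stack([x1, x1 + 1e-3 * rng.normal(size=n)])
X /= np.max(np.linalg.norm(X, axis=1))      # enforce ||x_i||_2 <= 1
y = np.clip(X @ np.array([1.0, -1.0]), -1.0, 1.0)

# Non-adaptive SSP: perturb X^T X and X^T y with a fixed noise scale.
sigma = 0.5                                  # illustrative, not calibrated
E = rng.normal(size=(d, d))
A = X.T @ X + sigma * (np.triu(E) + np.triu(E, 1).T)   # symmetric noise
b = X.T @ y + sigma * rng.normal(size=d)

# Solving with the noisy, near-singular Gram matrix yields an estimate
# whose error can be arbitrarily large relative to the true [1, -1].
theta_ssp = np.linalg.solve(A, b)
cond = np.linalg.cond(X.T @ X)
```

Here `cond` is several orders of magnitude above 1, which is exactly the regime where AdaSSP's adaptive ridge term stabilizes the inversion.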
2. AdaSSP Algorithmic Structure
AdaSSP instantiates the following procedure (Lev et al., 12 Jan 2026, Wang, 2018, Ferrando et al., 2024):
- Privacy budget splitting: The total $\varepsilon$ is divided equally ($\varepsilon/3$ per release, and likewise $\delta/3$ for $\delta$) among:
- A private estimate of the minimum eigenvalue $\lambda_{\min}(X^\top X)$.
- Noisy release of the empirical covariance $X^\top X$.
- Noisy release of the cross-moment $X^\top y$.
- Private $\lambda_{\min}$ estimation: Adds calibrated Gaussian noise to $\lambda_{\min}(X^\top X)$ (scale $\sqrt{\ln(6/\delta)}/(\varepsilon/3)$), truncates to nonnegativity with a debiasing shift of $\ln(6/\delta)/(\varepsilon/3)$, yielding $\tilde{\lambda}_{\min}$.
- Adaptive ridge selection: Sets the regularization parameter to
$$\lambda = \max\left\{0,\ \frac{\sqrt{d\,\ln(6/\delta)\,\ln(2d^2/\rho)}}{\varepsilon/3} - \tilde{\lambda}_{\min}\right\},$$
with a confidence parameter $\rho$ (e.g., $0.05$), balancing stabilization and minimax optimality.
- Noisy sufficient statistics release: Adds symmetric Gaussian noise to $X^\top X$ (scale $\sqrt{\ln(6/\delta)}/(\varepsilon/3)$) and Gaussian noise to $X^\top y$ (same scale).
- Private estimator computation: Outputs
$$\hat{\theta} = \left(\widehat{X^\top X} + \lambda I\right)^{-1} \widehat{X^\top y}.$$
All downstream computations are pure post-processing, incurring no additional privacy cost (Lev et al., 12 Jan 2026, Ferrando et al., 2024).
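The steps above can be sketched in a few lines of NumPy. This is a minimal illustration following the noise scales in Wang (2018), assuming $\|x_i\|_2 \le 1$ and $|y_i| \le 1$; the function and argument names are our own, and a production implementation would need more careful numerics and privacy auditing.

```python
import numpy as np

def adassp(X, y, eps, delta, rho=0.05, rng=None):
    """Sketch of AdaSSP following Wang (2018); assumes ||x_i||_2 <= 1 and
    |y_i| <= 1, so all three sensitivities are at most 1."""
    rng = np.random.default_rng() if rng is None else rng
    d = X.shape[1]
    eps3 = eps / 3.0                        # equal budget split
    logd = np.log(6.0 / delta)
    scale = np.sqrt(logd) / eps3            # per-release Gaussian scale

    # 1) Private, debiased, truncated estimate of lambda_min(X^T X).
    lam_min = np.linalg.eigvalsh(X.T @ X)[0]
    lam_tilde = max(lam_min + scale * rng.normal() - logd / eps3, 0.0)

    # 2) Adaptive ridge parameter (rho is the confidence level).
    lam = max(0.0,
              np.sqrt(d * logd * np.log(2 * d**2 / rho)) / eps3 - lam_tilde)

    # 3) Noisy sufficient statistics (symmetric noise for X^T X).
    E = rng.normal(size=(d, d))
    XtX = X.T @ X + scale * (np.triu(E) + np.triu(E, 1).T)
    Xty = X.T @ y + scale * rng.normal(size=d)

    # 4) Post-processing: ridge solve with the private statistics.
    return np.linalg.solve(XtX + lam * np.eye(d), Xty)
```

Because step 4 touches only already-privatized quantities, any further use of the returned coefficients (prediction, model selection) is free of additional privacy cost.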
3. Sensitivity Analysis and Noise Calibration
AdaSSP's privacy mechanisms use the Gaussian mechanism, requiring precise sensitivity estimates:
- Sensitivity of $X^\top X$: $\Delta_{X^\top X} = \sup_{x} \|x x^\top\|_F = \|x\|_2^2 \le 1$ (Frobenius norm).
- Sensitivity of $X^\top y$: $\Delta_{X^\top y} = \sup_{x, y} \|x\,y\|_2 \le 1$.
- Sensitivity for the smallest eigenvalue: $\Delta_{\lambda_{\min}} \le \|x x^\top\|_2 \le 1$ (from Weyl's inequality).

Noise scales are then derived as $\sigma = \Delta \cdot \sqrt{\ln(6/\delta)}/(\varepsilon/3)$, applied separately for each release (Ferrando et al., 2024, Wang, 2018).
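As a worked example of the calibration above, the helper below computes the per-release noise standard deviation under an equal three-way budget split; the function name and `splits` parameter are our own, but the formula matches the $\sqrt{\ln(6/\delta)}/(\varepsilon/3)$ scale used in Wang (2018).

```python
import numpy as np

def gaussian_scale(sensitivity, eps, delta, splits=3):
    """Noise std for one of `splits` equal-budget Gaussian releases,
    using sigma = sensitivity * sqrt(ln(6/delta)) / (eps/splits)."""
    return sensitivity * np.sqrt(np.log(6.0 / delta)) / (eps / splits)

# With ||x_i||_2 <= 1 and |y_i| <= 1, all three sensitivities equal 1,
# so every release uses the same scale.
sigma = gaussian_scale(1.0, eps=1.0, delta=1e-6)
```

At $\varepsilon = 1$, $\delta = 10^{-6}$ this gives a scale of roughly $12$ per release, which is why a data-aware ridge term is needed to keep the inversion stable at small budgets.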
4. Differential Privacy and Composition
AdaSSP's privacy guarantee follows from composition of Gaussian mechanisms:
- Each statistic release is $(\varepsilon/3, \delta/3)$-DP.
- By simple (or advanced) composition, the aggregate procedure is $(\varepsilon, \delta)$-DP with $\varepsilon = \sum_i \varepsilon_i$, $\delta = \sum_i \delta_i$ (Lev et al., 12 Jan 2026).
- All subsequent steps (ridge selection, inversion) are post-processing.
Empirical best practices apply analytic calibration to minimize noise for each mechanism (Lev et al., 12 Jan 2026).
5. Utility Analysis and Optimality
AdaSSP achieves rates that match data-dependent minimax lower bounds for excess risk:
- For empirical risk $F(\theta) = \|X\theta - y\|_2^2$, with probability at least $1-\rho$,
$$F(\hat{\theta}) - F(\theta^*) = O\!\left(\frac{\sqrt{d\,\ln(6/\delta)\,\ln(2d^2/\rho)}\;\bigl(1 + \|\theta^*\|_2^2\bigr)}{\varepsilon}\right)$$
for poorly-conditioned design, or
$$F(\hat{\theta}) - F(\theta^*) = O\!\left(\frac{d\,\ln(6/\delta)\,\ln(2d^2/\rho)\;\bigl(1 + \|\theta^*\|_2^2\bigr)}{\lambda_{\min}(X^\top X)\,\varepsilon^2}\right)$$
otherwise, with $\lambda_{\min}(X^\top X)$ the smallest eigenvalue of the Gram matrix.
Under standard Gaussian linear models, estimation error achieves asymptotic efficiency (matching the Cramér–Rao lower bound up to a $1 + o(1)$ factor) (Wang, 2018). AdaSSP adapts to the regime (Lipschitz or strongly convex) and recovers optimal rates in both.
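An informal way to see where the adaptive threshold comes from is the usual bias-variance balance (a sketch under the assumptions of this section, with constants suppressed): the ridge term contributes a bias of roughly $\lambda \|\theta^*\|_2^2$ to the excess risk, while the Gaussian noise in the released statistics contributes roughly $d\,\ln(6/\delta) / \bigl((\varepsilon/3)^2 (\lambda + \lambda_{\min})\bigr)$. Equating the two when $\lambda_{\min} \approx 0$ gives

```latex
\lambda \,\|\theta^*\|_2^2
\;\approx\;
\frac{d\,\ln(6/\delta)}{(\varepsilon/3)^2\,\bigl(\lambda + \lambda_{\min}\bigr)}
\quad\Longrightarrow\quad
\lambda \;\asymp\; \frac{\sqrt{d\,\ln(6/\delta)}}{\varepsilon}
\qquad (\lambda_{\min} \approx 0),
```

which matches the threshold in AdaSSP's ridge rule up to the $\ln(2d^2/\rho)$ confidence factor; when $\tilde{\lambda}_{\min}$ already exceeds this threshold, no regularization is added and $\lambda = 0$.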
6. Practical Considerations and Extensions
- The only tuning parameter is the ridge weight $\lambda$, which may be set via internal criteria to ensure invertibility or by cross-validation on the privatized statistics.
- The internal confidence level $\rho$ governs a graceful tradeoff between utility and the risk of under-regularization.
- Standardization of features (so that $\|x_i\|_2 \le 1$) is commonly performed; empirical studies fix $\delta$ and $\rho$ to small constants (Wang, 2018).
- AdaSSP requires only the norm bounds ($\|x_i\|_2 \le 1$, $|y_i| \le 1$) and does not need user-side hyperparameter tuning.
- Approximate sufficient-statistics perturbation extends to logistic regression via Chebyshev approximation of the log-likelihood, where noisy $X^\top X$ and $X^\top y$ suffice for quadratic approximations (Ferrando et al., 2024).
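The logistic extension can be illustrated with a quadratic surrogate for the logistic loss. The sketch below uses a least-squares polynomial fit in place of a true Chebyshev expansion (for a degree-2 fit on a fixed interval the two yield similar surrogates); the clipping range $R = 2$ and all variable names are our own illustrative choices, not values from Ferrando et al. (2024).

```python
import numpy as np

# Fit an illustrative degree-2 polynomial to the logistic loss
# log(1 + exp(-t)) on [-R, R]. R = 2 is an assumed margin range.
R = 2.0
ts = np.linspace(-R, R, 201)
c2, c1, c0 = np.polyfit(ts, np.log1p(np.exp(-ts)), 2)

# With labels y_i in {-1, +1} (so y_i^2 = 1), the total surrogate loss is
#   sum_i c2 (x_i^T w)^2 + c1 y_i (x_i^T w) + c0
# = c2 * w^T (X^T X) w + c1 * w^T (X^T y) + c0 * n,
# so the noisy statistics X^T X and X^T y released by SSP suffice.
rng = np.random.default_rng(0)
n, d = 200, 3
X = rng.normal(size=(n, d)) / np.sqrt(d)
y = rng.choice([-1.0, 1.0], size=n)

w = rng.normal(size=d)
exact = np.log1p(np.exp(-(y * (X @ w)))).sum()
surrogate = c2 * w @ (X.T @ X) @ w + c1 * w @ (X.T @ y) + c0 * n
```

The key point is that `surrogate` depends on the data only through $X^\top X$ and $X^\top y$, so the same noisy-release machinery as in linear regression applies unchanged.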
7. Comparative Evaluation and Empirical Performance
Extensive experiments on 36 benchmark UCI data sets demonstrate that AdaSSP uniformly improves over non-adaptive SSP and objective perturbation (ObjPert) at small and moderate privacy budgets, and matches or exceeds output perturbation sampling (OPS) and noisy SGD in nearly all cases. In Gaussian-regime simulations, AdaSSP achieves near-unit relative efficiency at moderate $\varepsilon$ ("privacy for free") (Wang, 2018).
AdaSSP's performance degrades gracefully with increasing dimension $d$, decreasing privacy budget $\varepsilon$, or ill-conditioning. It remains the leading baseline among sufficient-statistics perturbation approaches unless the design is extremely well-conditioned or the residual error is negligible (Lev et al., 12 Jan 2026).
References:
- "Near-Optimal Private Linear Regression via Iterative Hessian Mixing" (Lev et al., 12 Jan 2026)
- "Revisiting differentially private linear regression: optimal and adaptive prediction & estimation in unbounded domain" (Wang, 2018)
- "Private Regression via Data-Dependent Sufficient Statistic Perturbation" (Ferrando et al., 2024)