Practical Differentially Private Hyperparameter Tuning with Subsampling (2301.11989v3)

Published 27 Jan 2023 in cs.LG and cs.CR

Abstract: Tuning the hyperparameters of differentially private (DP) ML algorithms often requires use of sensitive data and this may leak private information via hyperparameter values. Recently, Papernot and Steinke (2022) proposed a certain class of DP hyperparameter tuning algorithms, where the number of random search samples is randomized itself. Commonly, these algorithms still considerably increase the DP privacy parameter ε over non-tuned DP ML model training and can be computationally heavy as evaluating each hyperparameter candidate requires a new training run. We focus on lowering both the DP bounds and the computational cost of these methods by using only a random subset of the sensitive data for the hyperparameter tuning and by extrapolating the optimal values to a larger dataset. We provide a Rényi differential privacy analysis for the proposed method and experimentally show that it consistently leads to better privacy-utility trade-off than the baseline method by Papernot and Steinke.

Citations (7)

Summary

  • The paper introduces a subsampling strategy for DP hyperparameter tuning that lowers privacy leakage and reduces computational cost.
  • It leverages RDP analysis to adjust the tuning search space and maintain consistent privacy guarantees.
  • The method achieves up to 6-8 times fewer gradient evaluations, offering significant practical computational savings.

Introduction

Hyperparameter tuning is a critical stage in the development of ML models, but for differentially private (DP) ML models the process is particularly challenging. The paper presents methods that reduce both the computational and the privacy cost of hyperparameter tuning in DP ML models. The work is rooted in the premise that privacy preservation must cover not just the model parameters but also the hyperparameters, and it provides an analysis within the Rényi Differential Privacy (RDP) framework.

Hyperparameter Tuning in DP ML Models

Hyperparameter tuning for DP ML models, particularly for Differentially Private Stochastic Gradient Descent (DP-SGD), typically comes with two main costs: a computational cost, due to the many evaluations of hyperparameter candidates, and a privacy cost, due to the resulting increase in the privacy budget ε. The paper tackles this by tuning on only a random subset of the dataset and extrapolating the optimal hyperparameter values to the full dataset. The approach builds on existing RDP analyses of black-box tuning algorithms and tailors the privacy loss calculations to the subsampling strategy used.
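
As a rough sketch of this workflow (not the authors' implementation), the Python snippet below draws a random data subset, runs a randomized number of random-search trials on it in the spirit of Papernot and Steinke, and extrapolates the selected hyperparameters to the full dataset. The training and evaluation callbacks, the Poisson trial count, and the linear learning-rate scaling rule are all illustrative assumptions.

```python
import numpy as np

def subsampled_dp_tuning(dataset, sample_candidate, train_fn, eval_fn,
                         mu=10, subset_frac=0.1, seed=0):
    """Sketch of DP hyperparameter tuning on a random data subset.

    sample_candidate() draws one hyperparameter setting, train_fn(data, **cand)
    runs a DP-SGD training job and returns a model, and eval_fn(model, data)
    returns a validation score. All three are caller-supplied and are
    assumptions of this sketch, not part of the paper's code.
    """
    rng = np.random.default_rng(seed)
    n = len(dataset)
    m = max(1, int(subset_frac * n))
    # Random subset used for all tuning runs.
    subset = [dataset[i] for i in rng.choice(n, size=m, replace=False)]

    # As in Papernot & Steinke-style tuning, the number of random-search trials
    # is itself randomized (here Poisson); for simplicity we force at least one
    # trial, which slightly deviates from the exact mechanism.
    k = max(1, rng.poisson(mu))
    best_score, best_cand = float("-inf"), None
    for _ in range(k):
        cand = sample_candidate()
        score = eval_fn(train_fn(subset, **cand), subset)
        if score > best_score:
            best_score, best_cand = score, cand

    # Extrapolate the subset-optimal values to the full dataset. As an
    # illustrative assumption we scale the learning rate with the size ratio;
    # the paper's actual extrapolation heuristic may differ.
    final = dict(best_cand)
    if "lr" in final:
        final["lr"] *= n / m
    return final
```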

Numerical Results and Insights

The authors validate the proposed method experimentally on standard datasets, demonstrating improved privacy-utility trade-offs compared to the baseline methods. The tailored RDP analysis confirms that tuning on a smaller subset incurs lower DP privacy leakage, which in turn yields more favorable utility without exhaustive computational expense. Importantly, the paper quantifies these improvements: the subsampling approach requires up to 6-8 times fewer gradient evaluations than the baselines, translating into significant computational savings.
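
To see where savings of this order can come from, here is a back-of-envelope calculation with assumed settings (not the paper's experimental configuration): tuning trials run on a 5% subset while the final model is still trained on the full data.

```python
# Illustrative numbers only; not the paper's experimental configuration.
n = 60_000            # full training-set size (assumed)
m = 3_000             # tuning subset size (assumed 5% of the data)
epochs = 20           # DP-SGD epochs per training run (assumed)
expected_trials = 10  # expected number of randomized tuning trials (assumed)

# Gradient evaluations ~ epochs * examples processed per epoch, per training run.
baseline   = expected_trials * epochs * n + epochs * n  # tune on full data, then final run
subsampled = expected_trials * epochs * m + epochs * n  # tune on subset, then final full run

print(f"baseline:   {baseline:,} gradient evaluations")
print(f"subsampled: {subsampled:,} gradient evaluations")
print(f"savings:    {baseline / subsampled:.1f}x")  # ~7x here, the ballpark of the reported 6-8x
```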

Hyperparameters Impacting DP Guarantees

In DP-SGD, hyperparameters such as the noise level and the subsampling ratio directly determine the DP guarantees themselves. To handle this, the paper describes a framework that adjusts these hyperparameters so that every candidate meets a pre-defined DP target. Using grid or random sampling over the search space, the authors show how to bound the search space and establish uniform RDP bounds across the candidate models, supporting a consistent yet efficient tuning process.
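
A minimal sketch of this idea, assuming Opacus's `get_noise_multiplier` utility as the privacy accountant (the paper does not prescribe this tooling): for each candidate pair of sampling rate and epoch count, the noise multiplier is calibrated so that every candidate meets the same pre-defined (ε, δ) target, and the search then compares models under a uniform privacy guarantee.

```python
from opacus.accountants.utils import get_noise_multiplier

# Pre-defined privacy target shared by every candidate in the search.
TARGET_EPSILON, TARGET_DELTA = 2.0, 1e-5

# Candidate grid over hyperparameters that themselves affect the DP guarantee
# (illustrative values, not the paper's grid).
candidates = [
    {"sample_rate": 0.01, "epochs": 20},
    {"sample_rate": 0.02, "epochs": 10},
    {"sample_rate": 0.05, "epochs": 5},
]

for cand in candidates:
    # Calibrate the Gaussian noise so that this (sample_rate, epochs) pair meets
    # the same (epsilon, delta) budget; the search then compares models with
    # uniform privacy guarantees and differing utility only.
    cand["noise_multiplier"] = get_noise_multiplier(
        target_epsilon=TARGET_EPSILON,
        target_delta=TARGET_DELTA,
        sample_rate=cand["sample_rate"],
        epochs=cand["epochs"],
        accountant="rdp",
    )
    print(cand)
```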

Conclusion and Future Work

The paper concludes by highlighting its contributions to reducing the computational and privacy costs of DP hyperparameter tuning. The algorithmic innovations are complemented by compelling experimental results, presenting a robust argument for the viability of the proposed methods. The overarching theme is a methodical synthesis of classical tuning with privacy-specific considerations. The authors encourage future work to extend these principles to other optimizers, to sharpen the rigorous privacy analyses, and to develop more general hyperparameter extrapolation techniques.