When Does Differentially Private Learning Not Suffer in High Dimensions? (2207.00160v4)

Published 1 Jul 2022 in cs.LG, cs.CR, and stat.ML

Abstract: Large pretrained models can be privately fine-tuned to achieve performance approaching that of non-private models. A common theme in these results is the surprising observation that high-dimensional models can achieve favorable privacy-utility trade-offs. This seemingly contradicts known results on the model-size dependence of differentially private convex learning and raises the following research question: When does the performance of differentially private learning not degrade with increasing model size? We identify that the magnitudes of gradients projected onto subspaces are a key factor that determines performance. To precisely characterize this for private convex learning, we introduce a condition on the objective that we term *restricted Lipschitz continuity* and derive improved bounds for the excess empirical and population risks that are dimension-independent under additional conditions. We empirically show that in private fine-tuning of LLMs, gradients obtained during fine-tuning are mostly controlled by a few principal components. This behavior is similar to conditions under which we obtain dimension-independent bounds in convex settings. Our theoretical and empirical results together provide a possible explanation for recent successes in large-scale private fine-tuning. Code to reproduce our results can be found at https://github.com/lxuechen/private-transformers/tree/main/examples/classification/spectral_analysis.
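
The abstract's empirical claim is that fine-tuning gradients are largely captured by a few principal components. The sketch below, which is not the paper's official pipeline (see the linked repository for that), illustrates the general kind of spectral analysis involved: stack flattened gradients from several mini-batches, compute their top principal directions via SVD, and measure how much of a held-out gradient's squared norm falls inside that low-dimensional subspace. The toy model, synthetic data, and hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy model and synthetic data standing in for a fine-tuning task.
model = nn.Linear(64, 2)
X, y = torch.randn(256, 64), torch.randint(0, 2, (256,))
loss_fn = nn.CrossEntropyLoss()

def flat_grad(xb, yb):
    """Return the flattened gradient of the loss on one mini-batch."""
    model.zero_grad()
    loss_fn(model(xb), yb).backward()
    return torch.cat([p.grad.flatten() for p in model.parameters()])

# Collect gradients over several mini-batches (rows = gradient samples).
grads = torch.stack([flat_grad(X[i:i + 32], y[i:i + 32])
                     for i in range(0, 224, 32)])

# Principal directions of the collected gradients via SVD.
_, S, Vh = torch.linalg.svd(grads - grads.mean(0), full_matrices=False)
k = 3
top_subspace = Vh[:k]  # top-k (orthonormal) principal directions

# Fraction of a held-out gradient's squared norm captured by that subspace.
g_new = flat_grad(X[224:], y[224:])
captured = (top_subspace @ g_new).norm() ** 2 / g_new.norm() ** 2
print(f"Fraction of gradient energy in top-{k} subspace: {captured:.3f}")
```

If gradients during fine-tuning concentrate in a low-dimensional subspace, the captured fraction stays close to 1 even for small k, which mirrors the condition under which the paper's dimension-independent bounds apply in the convex setting.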

Authors (7)
  1. Xuechen Li (35 papers)
  2. Daogao Liu (34 papers)
  3. Tatsunori Hashimoto (80 papers)
  4. Huseyin A. Inan (23 papers)
  5. Janardhan Kulkarni (52 papers)
  6. Yin Tat Lee (102 papers)
  7. Abhradeep Guha Thakurta (7 papers)
Citations (54)