DP Stochastic Convex Optimization
- DP-SCO is the study of algorithms that minimize expected convex loss under differential privacy constraints, ensuring each data point has a negligible impact on the outcome.
- The methodology includes techniques such as calibrated noisy SGD (DP-SGD), output perturbation, and exponential mechanisms, which achieve near-optimal minimax excess risk rates for Lipschitz convex losses over bounded domains.
- Implications of DP-SCO include a clearly defined privacy-utility tradeoff that depends on sample size, data dimension, and privacy parameters, guiding privacy-aware algorithm design.
Differentially Private Stochastic Convex Optimization (DP-SCO) is the study of algorithms that minimize the expected value of a convex loss function, given a dataset of i.i.d. samples, subject to differential privacy (DP) constraints. In DP-SCO, the learning algorithm receives a sample from an unknown distribution and outputs a model (hypothesis) with small population risk, while ensuring that any single data point (or, in the user-level setting, any block of data corresponding to one user) contributes negligibly to the output, in accordance with differential privacy.
1. Definition and Foundational Problem Setup
A DP-SCO instance consists of a loss function $f : \mathcal{W} \times \mathcal{Z} \to \mathbb{R}$ (with $f(\cdot, z)$ convex for every $z \in \mathcal{Z}$), a convex constraint set $\mathcal{W} \subseteq \mathbb{R}^d$, and $n$ i.i.d. samples $S = (z_1, \dots, z_n) \sim \mathcal{D}^n$. The statistical goal is to minimize the population risk
$$F(w) = \mathbb{E}_{z \sim \mathcal{D}}[f(w, z)]$$
by computing an output $\hat{w} \in \mathcal{W}$ with small excess risk $F(\hat{w}) - \min_{w \in \mathcal{W}} F(w)$. A randomized algorithm $\mathcal{A}$ is $(\varepsilon, \delta)$-DP if, for any two datasets $S$ and $S'$ differing in one data point, the distributions of $\mathcal{A}(S)$ and $\mathcal{A}(S')$ are close in the sense of differential privacy. Analogously, user-level DP requires the same guarantee when $S$ and $S'$ differ in one entire user's data block.
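Concretely, this closeness is the standard $(\varepsilon, \delta)$-DP requirement: for all such neighboring datasets $S, S'$ and every measurable event $E$,
$$\Pr[\mathcal{A}(S) \in E] \le e^{\varepsilon} \, \Pr[\mathcal{A}(S') \in E] + \delta.$$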
2. Minimax Rates and Core Utility–Privacy Tradeoffs
For Lipschitz convex losses over bounded domains ($G$-Lipschitz in $w$, constraint set of diameter $D$), the minimax excess risk under $(\varepsilon, \delta)$-DP is, up to logarithmic factors,
$$\Theta\!\left( G D \left( \frac{1}{\sqrt{n}} + \frac{\sqrt{d}}{\varepsilon n} \right) \right),$$
where $n$ is the sample size and $d$ is the dimension (Bassily et al., 2019). This rate is attained by several algorithmic strategies, notably calibrated noisy stochastic gradient descent (DP-SGD), output perturbation with sensitivity analysis, and variants of the exponential mechanism or Gibbs sampling in general norms (Gopi et al., 2022).
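To make the first of these strategies concrete, below is a minimal single-pass noisy-SGD sketch in Python. It is an illustration under stated assumptions, not the exact algorithm of the cited works: gradient clipping to the Lipschitz bound $G$ stands in for a formal sensitivity analysis, the noise scale uses the textbook Gaussian mechanism, and the step size and iterate averaging follow the standard non-private SCO recipe.

```python
import numpy as np

def dp_sgd(grad_fn, data, dim, G, D, eps, delta, seed=0):
    """Minimal single-pass DP-SGD sketch for DP-SCO (illustrative, not optimal).

    Assumptions beyond the text: one gradient step per sample, gradients
    clipped to norm G, Gaussian noise calibrated to the clipped gradient's
    L2 sensitivity 2G, and projection onto a centered ball of diameter D.
    Each record influences exactly one noisy step, so one application of
    the Gaussian mechanism per step suffices here (a parallel-composition-
    style argument; a careful privacy accountant would be used in practice).
    """
    rng = np.random.default_rng(seed)
    n = len(data)
    sigma = 2 * G * np.sqrt(2 * np.log(1.25 / delta)) / eps  # Gaussian mechanism
    lr = D / (G * np.sqrt(n))  # standard SGD step size for Lipschitz losses
    w = np.zeros(dim)
    iterates = []
    for z in data:
        g = grad_fn(w, z)
        g *= min(1.0, G / (np.linalg.norm(g) + 1e-12))       # clip: bound sensitivity
        w = w - lr * (g + rng.normal(0.0, sigma, size=dim))  # privatized step
        r = np.linalg.norm(w)
        if r > D / 2:                                        # project onto the domain
            w *= (D / 2) / r
        iterates.append(w)
    return np.mean(iterates, axis=0)  # averaged iterate, as in standard SCO analyses

# Toy usage: private mean estimation with f(w, z) = 0.5 * ||w - z||^2, grad = w - z.
data = np.random.default_rng(1).normal(0.3, 0.5, size=(5000, 5))
w_hat = dp_sgd(lambda w, z: w - z, data, dim=5, G=2.0, D=2.0, eps=1.0, delta=1e-5)
print(w_hat)
```

Clipping enforces the bounded-sensitivity premise that the privacy analysis needs, and the averaged iterate is the quantity that the standard excess-risk bounds for SGD control.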
For heavy-tailed losses (finite $k$-th moments of gradient norms), rates interpolate:
$$O\!\left( G_2 \frac{1}{\sqrt{n}} + G_k \left( \frac{\sqrt{d}}{\varepsilon n} \right)^{\frac{k-1}{k}} \right),$$
where $G_k$ bounds the $k$-th moment of the gradient norm; as $k \to \infty$, this recovers the Lipschitz rate above.
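As a rough numeric illustration of this interpolation (all parameter values below are hypothetical, and the exponent $(k-1)/k$ is the one in the display above): as $k$ grows, the privacy term approaches the Lipschitz-case term $\sqrt{d}/(\varepsilon n)$.

```python
import numpy as np

# Hypothetical parameters, for illustration only.
n, d, eps = 10_000, 100, 1.0
G2 = Gk = 1.0  # moment bounds on the gradient norm

for k in (2, 4, 10):
    priv = Gk * (np.sqrt(d) / (eps * n)) ** ((k - 1) / k)  # privacy term
    rate = G2 / np.sqrt(n) + priv                           # total excess risk rate
    print(f"k={k:>2}: privacy term ~ {priv:.4f}, total rate ~ {rate:.4f}")
```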