
FETA-Pro: DP Image Synthesis Framework

Updated 17 January 2026
  • FETA-Pro is a curriculum-based framework for differentially private image synthesis that introduces frequency features as an intermediate step between spatial and full-image stages.
  • It employs a multi-stage pipeline combining diffusion models and a GAN auxiliary generator to enhance image fidelity and utility under strict privacy constraints.
  • Empirical evaluations show FETA-Pro achieves lower FID scores and higher classification accuracy than previous DP methods across diverse datasets.

FETA-Pro (“From Easy to Hard++”) is a curriculum-based framework for differentially private (DP) image synthesis that introduces frequency features as an intermediate training stage between spatial features (central images) and full images. FETA-Pro employs a multi-stage generative pipeline utilizing diffusion models and a GAN auxiliary generator, achieving markedly higher fidelity and utility than previous public-free DP image synthesis methods, especially under stringent privacy constraints ($\varepsilon = 1$), across heterogeneous image domains (Gong et al., 10 Jan 2026).

1. Differentially Private Image Synthesis: Challenges and Predecessors

Differentially private image synthesis aims to generate synthetic images $\widehat D$ from a sensitive dataset $D_s$ such that the synthesis model satisfies $(\varepsilon, \delta)$-DP. The dominant optimization routine is DP-SGD, which enforces privacy via per-example gradient clipping (bound $C$) and additive Gaussian noise $\mathcal N(0, \sigma_d^2 C^2 I)$, with composition tracked by Rényi DP (RDP) or the moments accountant. Despite its theoretical guarantees, DP-SGD suffers from poor convergence and low fidelity on complex, heterogeneous datasets because the injected noise scales with the sensitivity bound and the privacy constraint (as evidenced by FID $\gg 100$ at $\varepsilon = 1$ on datasets such as CIFAR-10 and CelebA).
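The per-example clipping and noising described above can be sketched as follows. This is a minimal illustrative implementation, not the paper's code; the function and parameter names (`dp_sgd_step`, `lot_size`) are assumptions.

```python
import numpy as np

# One DP-SGD step (illustrative sketch): clip each per-example gradient
# to L2 norm C, sum the clipped gradients, add Gaussian noise with
# standard deviation sigma * C, and average over the (expected) lot size.
def dp_sgd_step(theta, per_example_grads, C, sigma, lot_size, lr, rng):
    clipped = [g * min(1.0, C / max(np.linalg.norm(g), 1e-12))
               for g in per_example_grads]
    noisy_sum = np.sum(clipped, axis=0) + rng.normal(0.0, sigma * C,
                                                     size=theta.shape)
    return theta - lr * noisy_sum / lot_size

rng = np.random.default_rng(0)
theta = np.zeros(4)
grads = [10.0 * rng.normal(size=4) for _ in range(8)]  # large norms; will be clipped
theta_new = dp_sgd_step(theta, grads, C=1.0, sigma=1.0,
                        lot_size=8, lr=0.1, rng=rng)
```

Because every clipped gradient has norm at most $C$, replacing one example changes the sum by at most $C$ in L2, which is what calibrates the Gaussian noise.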

DP-FETA (Li et al., 2025) proposed a two-stage curriculum: initiating with “central images” (average-pooled images released via a DP query) for coarse structure pretraining, then fine-tuning on real data via DP-SGD. This method improved outcomes on homogeneous datasets (e.g., MNIST), but its coarse spatial features (central images) contribute little structure for diverse datasets with high intra-class variation, revealing the need for a finer-grained curriculum.
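A central-image query of the kind DP-FETA relies on can be sketched as a clipped mean with Gaussian noise. This is a hypothetical minimal version assuming L2 clipping of flattened images and add/remove adjacency; the names are illustrative.

```python
import numpy as np

# DP "central image" sketch: clip each flattened image in L2, average,
# and add Gaussian noise calibrated to sensitivity clip_norm / N.
def dp_central_image(images, clip_norm, sigma, rng):
    flats = [x.ravel().astype(float) for x in images]
    clipped = [f * min(1.0, clip_norm / max(np.linalg.norm(f), 1e-12))
               for f in flats]
    mean = np.mean(clipped, axis=0)
    noise = rng.normal(0.0, sigma * clip_norm / len(images), size=mean.shape)
    return (mean + noise).reshape(images[0].shape)

rng = np.random.default_rng(1)
imgs = [rng.uniform(size=(8, 8)) for _ in range(100)]
central = dp_central_image(imgs, clip_norm=1.0, sigma=1.0, rng=rng)
```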

2. Frequency Features as a Curriculum Intermediate

Curriculum learning principles motivate progressively increasing data complexity for stable training. In this context, traversing directly from spatial features (central images) to raw images provides too coarse a learning trajectory. FETA-Pro introduces frequency features as a medium-complexity curriculum step. Given a flattened image $h \in \mathbb{R}^d$, random Fourier features yield a DP-feasible embedding:

$$\phi(h) = \left[\sqrt{\tfrac{2}{K}}\cos(\omega_1^\top h), \dots, \sqrt{\tfrac{2}{K}}\sin(\omega_{K/2}^\top h)\right]^\top, \qquad \omega_j \sim \mathcal{N}(0, I_d).$$

The DP mean frequency vector is released via the Gaussian mechanism:

$$\mu = \frac{1}{N}\sum_{i=1}^{N}\phi(h_i), \qquad \tilde{\mu} = \mu + \mathcal{N}(0,\, \sigma_f^2 \Delta_f^2 I_K), \qquad \Delta_f = 1/N.$$

Empirically, entropy and texture metrics position frequency features strictly between central images and raw images. This motivates the curriculum:

  1. Spatial features (central images)
  2. Frequency features (random Fourier features)
  3. Full images
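The embedding and DP release above can be sketched directly. Note that $\|\phi(h)\|_2 = 1$ by construction (since $\cos^2 + \sin^2 = 1$ in each coordinate pair), which is what gives the mean its $\Delta_f = 1/N$ sensitivity. Function names are illustrative.

```python
import numpy as np

# Random Fourier feature embedding phi(h) with ||phi(h)||_2 = 1.
def rff_embed(h, omegas):
    K = 2 * omegas.shape[0]
    proj = omegas @ h
    return np.sqrt(2.0 / K) * np.concatenate([np.cos(proj), np.sin(proj)])

# DP mean frequency vector via the Gaussian mechanism with Delta_f = 1/N.
def dp_mean_frequency(flat_images, omegas, sigma_f, rng):
    feats = np.stack([rff_embed(h, omegas) for h in flat_images])
    mu = feats.mean(axis=0)
    delta_f = 1.0 / len(flat_images)
    return mu + rng.normal(0.0, sigma_f * delta_f, size=mu.shape)

rng = np.random.default_rng(2)
d, K = 16, 64
omegas = rng.normal(size=(K // 2, d))          # omega_j ~ N(0, I_d)
images = [rng.normal(size=d) for _ in range(50)]
mu_tilde = dp_mean_frequency(images, omegas, sigma_f=1.0, rng=rng)
```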

3. FETA-Pro Multi-Stage Pipeline

FETA-Pro decomposes training into three sequential stages, each aligning with a feature complexity scale:

a) Spatial Warm-up

Central (DP-queried) images $\{\tilde h^{\mathrm{spat}}_j\}$ are computed via clipped-mean perturbation. A diffusion model $\epsilon_\theta$ is trained with the (non-DP) diffusion loss:

$$\mathcal L_{\rm DM} = \mathbb E_{h_0, t, \epsilon_t}\big\|\epsilon_t - \epsilon_\theta(h_t, t)\big\|^2.$$
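A toy version of this objective, assuming a standard DDPM-style forward process, can be written as follows; `eps_model` is a stand-in for the actual denoising network, and all names are illustrative.

```python
import numpy as np

# Simplified DDPM objective: sample noise eps, form the noised sample
# h_t from the cumulative alpha schedule, and score the model's noise
# prediction with a squared error.
def diffusion_loss(eps_model, h0, t, alphas_bar, rng):
    eps = rng.normal(size=h0.shape)                    # target noise
    a = alphas_bar[t]
    h_t = np.sqrt(a) * h0 + np.sqrt(1.0 - a) * eps     # forward process
    return float(np.mean((eps - eps_model(h_t, t)) ** 2))

rng = np.random.default_rng(3)
alphas_bar = np.cumprod(1.0 - np.linspace(1e-4, 0.02, 100))
h0 = rng.normal(size=(16,))
zero_model = lambda h_t, t: np.zeros_like(h_t)  # trivial stand-in network
loss = diffusion_loss(zero_model, h0, t=50, alphas_bar=alphas_bar, rng=rng)
```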

b) Frequency Warm-up with Auxiliary GAN

The DP mean frequency vector $\tilde\mu$ guides an auxiliary GAN generator $G_f(z; \theta')$, trained via an MMD-like loss that directly matches mean features:

$$\mathcal L(\theta') = \left\|\tilde\mu - \frac{1}{B_f}\sum_{i=1}^{B_f}\phi\big(G_f(z_i; \theta')\big)\right\|_2.$$

$G_f$ produces images $\{h_i^f\}$, which further warm up $\epsilon_\theta$ non-privately.
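The objective itself reduces to an L2 distance between the released mean and the batch mean embedding of generated images. A minimal sketch (names illustrative, generated images stubbed with random vectors):

```python
import numpy as np

# Mean-feature-matching loss: distance between the DP mean frequency
# vector and the batch-mean RFF embedding of a batch of generated images.
def mean_matching_loss(mu_tilde, gen_batch, omegas):
    K = 2 * omegas.shape[0]
    feats = []
    for h in gen_batch:
        proj = omegas @ h.ravel()
        feats.append(np.sqrt(2.0 / K) *
                     np.concatenate([np.cos(proj), np.sin(proj)]))
    return float(np.linalg.norm(mu_tilde - np.mean(feats, axis=0)))

rng = np.random.default_rng(4)
d, K = 16, 64
omegas = rng.normal(size=(K // 2, d))
gen_batch = [rng.normal(size=d) for _ in range(32)]   # stand-in for G_f(z_i)
mu_tilde = 0.1 * rng.normal(size=K)                   # stand-in DP target
loss = mean_matching_loss(mu_tilde, gen_batch, omegas)
```

The loss is zero exactly when the batch mean embedding equals the target, which is what gradient descent on $\theta'$ drives toward.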

c) DP-SGD Fine-Tuning on Full Images

DP-SGD is employed on $D_s$ for private fine-tuning:

$$\theta \leftarrow \theta - \eta\left(\frac{1}{B^*}\sum_{i=1}^{B} \mathrm{Clip}\big(\nabla_\theta \mathcal L_{\rm DM}(h_i), C_d\big) + \frac{C_d}{B^*}\mathcal N(0, \sigma_d^2 I)\right), \qquad B^* = q_d|D_s|,$$

where $B$ is the size of the sampled batch and $B^*$ its expectation under sampling rate $q_d$. Privacy across the three stages composes via RDP as $(\alpha, \gamma_t + \gamma_f + \gamma_d)$; standard conversion then yields an overall $(\varepsilon, \delta)$-DP guarantee.
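The composition-and-conversion step can be sketched with the standard RDP-to-DP bound $\varepsilon = \gamma + \log(1/\delta)/(\alpha - 1)$ at a fixed order $\alpha$. The $\gamma$ values below are illustrative placeholders, not the paper's actual budget split.

```python
import math

# Standard RDP -> (eps, delta)-DP conversion at a fixed order alpha.
def rdp_to_dp(alpha, gamma_total, delta):
    return gamma_total + math.log(1.0 / delta) / (alpha - 1.0)

# Compose the three stages' RDP costs additively, then convert.
gamma_spatial, gamma_freq, gamma_dpsgd = 0.05, 0.15, 0.6   # hypothetical
eps = rdp_to_dp(alpha=8.0,
                gamma_total=gamma_spatial + gamma_freq + gamma_dpsgd,
                delta=1e-5)
```

In practice the conversion is optimized over many orders $\alpha$ and the tightest resulting $\varepsilon$ is reported.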

Pipeline generation property: Each model specializes in a distinct feature domain (diffusion for spatial; GAN for frequency); outputs from one become inputs for the next, addressing domain shift and architectural mismatch.

4. Curriculum Scheduling and Pipeline Dynamics

The spatial–frequency curriculum is fixed:

  1. Train on central images;
  2. Train on GAN-generated images matching DP frequency statistics;
  3. Fine-tune on raw data with DP-SGD.

No per-batch mixing or interpolation is required; each phase sequentially “warms up” the model for the next. Warm-up durations are preset (e.g., 1,000 epochs on spatial features, 10 on the frequency GAN). This staged schedule empirically accelerates convergence: FID decreases more rapidly during DP-SGD fine-tuning than for baselines.
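The fixed schedule amounts to a simple sequential driver: each stage trains for its preset duration and hands the warmed-up model to the next. A minimal sketch (stage names and epoch counts are illustrative):

```python
# Run warm-up stages in a fixed order; each stage trains on its own
# data source and passes the warmed model to the next stage.
def run_curriculum(model, stages):
    for name, train_epoch, epochs in stages:
        for _ in range(epochs):
            train_epoch(model)
    return model

log = []
stages = [
    ("spatial",   lambda m: log.append("spatial"),   3),
    ("frequency", lambda m: log.append("frequency"), 2),
    ("dp_sgd",    lambda m: log.append("dp_sgd"),    1),
]
run_curriculum({}, stages)
# log now records the fixed spatial -> frequency -> dp_sgd order
```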

5. Empirical Evaluations and Quantitative Performance

FETA-Pro was benchmarked against seven public-free baselines—including DP-FETA—on five datasets: MNIST, Fashion-MNIST, CIFAR-10, CelebA, and Camelyon histopathology. Quality is assessed by Fréchet Inception Distance (FID, lower is better) and downstream classification accuracy (Acc) of models trained on synthetic data, selected via “Report Noisy Max.”

Under $\varepsilon = 1$, FETA-Pro achieves:

  • 25.7% lower FID (higher fidelity)
  • 4.1% higher Acc (greater utility)

Selected results (Table 6, left half):

Dataset     FID (FETA-Pro vs DP-FETA)    Acc (FETA-Pro vs DP-FETA)
CIFAR-10    120.4 vs 139.8 (−13.9%)      37.9% vs 35.1% (+2.8 pp)
CelebA       48.0 vs  60.2 (−20.2%)      90.0% vs 82.3% (+7.7 pp)
Camelyon     31.0 vs  52.8 (−41.2%)      84.0% vs 77.3% (+6.7 pp)

Convergence rates also improve; FETA-Pro attains minimum FID in fewer epochs during DP-SGD fine-tuning.

6. Ablations, Hyperparameter Sweeps, and Practical Considerations

Ablation studies show that frequency-only warm-up outperforms spatial-only warm-up, but the sequential spatial→frequency curriculum in FETA-Pro yields the best results. A GAN auxiliary for frequency-feature inversion outperforms a diffusion auxiliary or no auxiliary model. Optimal privacy budgeting assigns a larger RDP cost ($\gamma$) to frequency features than to spatial features, but less than to DP-SGD:

$$\gamma_{\mathrm{spatial}} < \gamma_{\mathrm{frequency}} < \gamma_{\mathrm{DP\text{-}SGD}}.$$

On CIFAR-10, FETA-Pro incurs only 0.06 hours (≈0.3%) of additional runtime and no extra peak GPU memory relative to DP-FETA.

Principal limitations include sensitivity to privacy budget allocation and the exclusion of DP cost for hyperparameter tuning (as per standard DP benchmarks). Future work may investigate alternative training shortcuts beyond spatial and frequency features.

7. Significance and Scope within Differential Privacy

FETA-Pro advances curriculum-based DP image synthesis by (1) introducing frequency features as an intermediate step, (2) employing a GAN auxiliary for frequency inversion, (3) leveraging a pipeline of specialized generative models, and (4) supporting fine privacy accounting through RDP composition. These innovations yield measurable gains in fidelity, utility, and training efficiency, expanding the applicability of DP image synthesis to heterogeneous domains while respecting strict privacy constraints (Gong et al., 10 Jan 2026).
