FETA-Pro: DP Image Synthesis Framework
- FETA-Pro is a curriculum-based framework for differentially private image synthesis that introduces frequency features as an intermediate step between spatial and full-image stages.
- It employs a multi-stage pipeline combining diffusion models and a GAN auxiliary generator to enhance image fidelity and utility under strict privacy constraints.
- Empirical evaluations show FETA-Pro achieves lower FID scores and higher classification accuracy than previous DP methods across diverse datasets.
FETA-Pro (“From Easy to Hard++”) is a curriculum-based framework for differentially private (DP) image synthesis that introduces frequency features as an intermediate training stage between spatial features (central images) and full images. FETA-Pro employs a multi-stage generative pipeline utilizing diffusion models and a GAN auxiliary generator, achieving markedly higher fidelity and utility than previous public-free DP image synthesis methods, especially under stringent privacy constraints (small $\epsilon$), across heterogeneous image domains (Gong et al., 10 Jan 2026).
1. Differentially Private Image Synthesis: Challenges and Predecessors
Differentially private image synthesis aims to generate synthetic images from a sensitive dataset such that the synthesis model satisfies $(\epsilon, \delta)$-DP. The dominant optimization routine is DP-SGD, which enforces privacy via per-example gradient clipping (bound $C$) and additive Gaussian noise (standard deviation $\sigma C$):

$$\tilde{g} = \frac{1}{B}\left( \sum_{i=1}^{B} \mathrm{clip}(g_i, C) + \mathcal{N}(0, \sigma^2 C^2 I) \right),$$

with composition tracked by Rényi DP (RDP) or the moments accountant. Despite its theoretical guarantees, DP-SGD suffers from poor convergence and low fidelity on complex, heterogeneous datasets, because the injected noise scales with the sensitivity bound and grows as the privacy budget tightens (as evidenced by high FID at small $\epsilon$ on datasets such as CIFAR-10 and CelebA).
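The per-example clipping and noising step above can be sketched as follows; this is a minimal illustration of the DP-SGD gradient privatization, not the paper's implementation (function and variable names are ours):

```python
import numpy as np

def privatize_gradients(per_example_grads, C, sigma, rng):
    """One DP-SGD step: clip each per-example gradient to L2 norm C,
    sum, add Gaussian noise with std sigma * C, and average."""
    B = per_example_grads.shape[0]
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, C / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale          # each row now has norm <= C
    noise = rng.normal(0.0, sigma * C, size=per_example_grads.shape[1])
    return (clipped.sum(axis=0) + noise) / B

rng = np.random.default_rng(0)
g = rng.normal(size=(8, 4))                      # 8 per-example gradients, dim 4
g_priv = privatize_gradients(g, C=1.0, sigma=0.5, rng=rng)
```

Because each clipped gradient has norm at most $C$, the sensitivity of the sum to any single example is bounded by $C$, which is what calibrates the Gaussian noise.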
DP-FETA (Li et al., 2025) proposed a two-stage curriculum: initiating with “central images” (average-pooled images released via a DP query) for coarse structure pretraining, then fine-tuning on real data via DP-SGD. This method improved outcomes on homogeneous datasets (e.g., MNIST), but its coarse spatial features (central images) provide little structural signal on diverse datasets with high intra-class variation, revealing the need for a finer-grained curriculum.
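A central-image release of the kind DP-FETA uses can be sketched with the Gaussian mechanism; the sensitivity bound below assumes pixels clipped to $[0,1]$ and is illustrative rather than the paper's exact query:

```python
import numpy as np

def dp_central_image(images, sigma, rng):
    """Release a DP 'central image': the per-pixel mean of a class's
    (flattened) images plus Gaussian noise. Pixels are clipped to
    [0, 1], which bounds the L2 sensitivity of the mean by roughly
    sqrt(d) / n under add/remove of one image."""
    n, d = images.shape
    clipped = np.clip(images, 0.0, 1.0)
    sensitivity = np.sqrt(d) / n
    return clipped.mean(axis=0) + rng.normal(0.0, sigma * sensitivity, size=d)

rng = np.random.default_rng(0)
imgs = rng.uniform(size=(100, 64))          # 100 flattened 8x8 images
central = dp_central_image(imgs, sigma=1.0, rng=rng)
```

Because the noise scale shrinks with $n$, central images are cheap in privacy budget but capture only the coarse average structure of a class.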
2. Frequency Features as a Curriculum Intermediate
Curriculum learning principles motivate progressively increasing data complexity for stable model training. In this context, jumping directly from spatial features (central images) to raw images is too coarse a learning trajectory. FETA-Pro introduces frequency features as a medium-complexity curriculum step. Given a flattened image $x \in \mathbb{R}^d$, random Fourier features yield a DP-feasible embedding:

$$\phi(x) = \sqrt{2/m}\, \cos(Wx + b), \qquad W \sim \mathcal{N}(0, \sigma_f^2 I), \quad b \sim \mathrm{Unif}[0, 2\pi]^m,$$

whose norm is bounded ($\|\phi(x)\|_2 \le \sqrt{2}$), so the mean embedding has low sensitivity. The DP mean frequency vector is released via the Gaussian mechanism:

$$\tilde{\mu} = \frac{1}{n} \sum_{i=1}^{n} \phi(x_i) + \mathcal{N}(0, \sigma^2 \Delta^2 I), \qquad \Delta = \sqrt{2}/n.$$

Empirically, entropy and texture metrics position frequency features strictly between central images and raw images. This motivates a curriculum:
- Spatial features (central images)
- Frequency features (random Fourier features)
- Full images
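The frequency-feature release above can be sketched directly; dimensions and the frequency scale are illustrative choices, not the paper's settings:

```python
import numpy as np

d, m = 16, 32                                   # image dim, number of features
rng = np.random.default_rng(0)
W = rng.normal(0.0, 1.0, size=(m, d))           # random frequencies
b = rng.uniform(0.0, 2.0 * np.pi, size=m)       # random phases

def rff(x):
    """phi(x) = sqrt(2/m) * cos(Wx + b); each entry lies in
    [-sqrt(2/m), sqrt(2/m)], so ||phi(x)||_2 <= sqrt(2)."""
    return np.sqrt(2.0 / m) * np.cos(W @ x + b)

def dp_mean_frequency(X, sigma):
    """Gaussian-mechanism release of the mean RFF embedding; the
    bounded feature norm caps the mean's L2 sensitivity at sqrt(2)/n."""
    n = X.shape[0]
    mean = np.mean([rff(x) for x in X], axis=0)
    return mean + rng.normal(0.0, sigma * np.sqrt(2.0) / n, size=m)

mu_dp = dp_mean_frequency(rng.uniform(size=(50, d)), sigma=1.0)
```

The key design point is that the feature map's bounded norm makes the sensitivity data-independent, so no clipping of the raw images is needed at this stage.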
3. FETA-Pro Multi-Stage Pipeline
FETA-Pro decomposes training into three sequential stages, each aligning with a feature complexity scale:
a) Spatial Warm-up
Central (DP-queried) images are computed via clipped-mean perturbation. A diffusion model $\epsilon_\theta$ is then trained on them with the standard (non-DP) diffusion loss:

$$\mathcal{L}_{\mathrm{spatial}} = \mathbb{E}_{x_0, t, \epsilon \sim \mathcal{N}(0, I)} \left[ \left\| \epsilon - \epsilon_\theta(x_t, t) \right\|_2^2 \right],$$

where $x_t$ is the noised version of a central image $x_0$ at timestep $t$.
b) Frequency Warm-up with Auxiliary GAN
The DP mean frequency vector $\tilde{\mu}$ guides an auxiliary GAN generator $G$, trained via an MMD-like loss based on direct mean feature matching:

$$\mathcal{L}_G = \left\| \frac{1}{B} \sum_{j=1}^{B} \phi(G(z_j)) - \tilde{\mu} \right\|_2^2, \qquad z_j \sim \mathcal{N}(0, I).$$

$G$ produces images matching the released frequency statistics, which further warm up the diffusion model non-privately.
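The generator loss above reduces to a simple batch computation; a minimal sketch (with a random batch standing in for $G(z)$, and names of our choosing):

```python
import numpy as np

def feature_matching_loss(gen_images, mu_dp, W, b):
    """MMD-like generator loss: squared L2 distance between the
    batch-mean RFF embedding of generated images and the DP-released
    mean frequency vector mu_dp."""
    m = W.shape[0]
    feats = np.sqrt(2.0 / m) * np.cos(gen_images @ W.T + b)   # (B, m)
    return float(np.sum((feats.mean(axis=0) - mu_dp) ** 2))

rng = np.random.default_rng(0)
W = rng.normal(size=(32, 16))
b = rng.uniform(0.0, 2.0 * np.pi, size=32)
fake = rng.uniform(size=(8, 16))               # stand-in for a G(z) batch
loss = feature_matching_loss(fake, np.zeros(32), W, b)
```

Note that only the already-released $\tilde{\mu}$ enters the loss, so GAN training here consumes no additional privacy budget.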
c) DP-SGD Fine-Tuning on Full Images
DP-SGD is employed on the real images for private fine-tuning of the diffusion model, with the same clipped-and-noised gradient update as above (bound $C$, noise multiplier $\sigma$). Privacy across stages composes additively via RDP:

$$\epsilon_{\mathrm{RDP}}(\alpha) = \epsilon_{\mathrm{spatial}}(\alpha) + \epsilon_{\mathrm{freq}}(\alpha) + \epsilon_{\mathrm{DP\text{-}SGD}}(\alpha).$$

Conversion via the standard RDP-to-DP bound yields an overall $(\epsilon, \delta)$-DP guarantee.
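The stage-wise composition and conversion can be sketched as below; the noise multipliers are illustrative, and the DP-SGD curve is simplified (a full account would include subsampling amplification and step counts):

```python
import numpy as np

def gaussian_rdp(sigma):
    """RDP curve of the Gaussian mechanism with unit L2 sensitivity:
    eps(alpha) = alpha / (2 * sigma^2)."""
    return lambda alpha: alpha / (2.0 * sigma ** 2)

def rdp_to_dp(rdp_eps, delta, alphas):
    """Standard RDP -> (eps, delta)-DP conversion, minimized over orders:
    eps = eps_RDP(alpha) + log(1/delta) / (alpha - 1)."""
    return min(rdp_eps(a) + np.log(1.0 / delta) / (a - 1.0) for a in alphas)

# Stage-wise RDP costs compose by simple addition at each order alpha.
spatial, freq, sgd = gaussian_rdp(20.0), gaussian_rdp(8.0), gaussian_rdp(1.2)
total = lambda a: spatial(a) + freq(a) + sgd(a)
eps = rdp_to_dp(total, delta=1e-5, alphas=range(2, 256))
```

Additivity at each fixed order $\alpha$ is what makes RDP convenient here: each stage's cost is computed independently, summed, and converted once at the end.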
Pipeline generation property: Each model specializes in a distinct feature domain (diffusion for spatial; GAN for frequency); outputs from one become inputs for the next, addressing domain shift and architectural mismatch.
4. Curriculum Scheduling and Pipeline Dynamics
The spatial–frequency curriculum is fixed:
- Train on central images;
- Train on GAN-generated images matching DP frequency statistics;
- Fine-tune on raw data with DP-SGD.
No per-batch mixing or interpolation is required; each phase sequentially “warms up” the model for the next. Warm-up durations are preset (e.g., 1,000 epochs on spatial features, 10 on the frequency GAN). This staged schedule empirically accelerates convergence: FID decreases more rapidly during DP-SGD fine-tuning than for baselines.
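The fixed schedule can be encoded as a simple stage table; the names, data sources, and final-stage sentinel below are our own illustration, not the paper's configuration interface:

```python
# Hypothetical encoding of the fixed three-stage curriculum.
SCHEDULE = [
    ("spatial",   "central_images", 1000),  # non-private diffusion warm-up
    ("frequency", "gan_images",       10),  # warm-up on GAN-matched images
    ("full",      "real_images",     None), # DP-SGD until the budget is spent
]

def run_curriculum(train_fn, schedule=SCHEDULE):
    """Run each stage to completion before starting the next;
    no per-batch mixing between stages."""
    completed = []
    for name, source, epochs in schedule:
        train_fn(name, source, epochs)
        completed.append(name)
    return completed
```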
5. Empirical Evaluations and Quantitative Performance
FETA-Pro was benchmarked against seven public-free baselines—including DP-FETA—on five datasets: MNIST, Fashion-MNIST, CIFAR-10, CelebA, and Camelyon histopathology. Quality is assessed by Fréchet Inception Distance (FID, lower is better) and downstream classification accuracy (Acc) of models trained on synthetic data, selected via “Report Noisy Max.”
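The “Report Noisy Max” selection used for picking the downstream classifier can be sketched as follows; the scale $2\Delta/\epsilon$ covers the general-sensitivity case, and the candidate scores are hypothetical:

```python
import numpy as np

def report_noisy_max(scores, epsilon, sensitivity, rng):
    """Report Noisy Max: add Laplace(2 * sensitivity / epsilon) noise to
    each candidate's utility score and release only the argmax index,
    an epsilon-DP selection mechanism."""
    noise = rng.laplace(0.0, 2.0 * sensitivity / epsilon, size=len(scores))
    return int(np.argmax(np.asarray(scores) + noise))

rng = np.random.default_rng(0)
val_accs = [0.71, 0.83, 0.79]                  # hypothetical candidate scores
best = report_noisy_max(val_accs, epsilon=1.0, sensitivity=0.01, rng=rng)
```

Only the winning index is released, never the noisy scores themselves, which is what keeps the selection's privacy cost to a single $\epsilon$.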
Under the evaluated privacy budgets, FETA-Pro achieves on average:
- 25.7% lower FID (higher fidelity)
- 4.1% higher Acc (greater utility)
Selected results (Table 6, left half):
| Dataset | FID (FETA-Pro vs DP-FETA) | Acc (FETA-Pro vs DP-FETA) |
|---|---|---|
| CIFAR-10 | 120.4 vs 139.8 (−13.9%) | 37.9% vs 35.1% (+2.8 pp) |
| CelebA | 48.0 vs 60.2 (−20.2%) | 90.0% vs 82.3% (+7.7 pp) |
| Camelyon | 31.0 vs 52.8 (−41.2%) | 84.0% vs 77.3% (+6.7 pp) |
Convergence rates also improve; FETA-Pro attains minimum FID in fewer epochs during DP-SGD fine-tuning.
6. Ablations, Hyperparameter Sweeps, and Practical Considerations
Ablation studies show that frequency-only warm-up outperforms spatial-only warm-up, but the sequential “spatial→frequency” curriculum of FETA-Pro yields the best results. A GAN auxiliary for frequency-feature inversion outperforms a diffusion auxiliary or no auxiliary model. Optimal privacy budgeting assigns a larger RDP cost to the frequency features than to the spatial features, but less than to DP-SGD fine-tuning. On CIFAR-10, FETA-Pro incurs only 0.06 hours (≈0.3%) of additional runtime and no extra peak GPU memory over DP-FETA.
Principal limitations include sensitivity to the privacy budget allocation and the exclusion of the DP cost of hyperparameter tuning from the accounting (as is standard in DP benchmarks). Future work may investigate alternative training shortcuts beyond spatial and frequency features.
7. Significance and Scope within Differential Privacy
FETA-Pro advances curriculum-based DP image synthesis by (1) introducing frequency features as an intermediate step, (2) employing a GAN auxiliary for frequency inversion, (3) leveraging a pipeline of specialized generative models, and (4) supporting fine privacy accounting through RDP composition. These innovations yield measurable gains in fidelity, utility, and training efficiency, expanding the applicability of DP image synthesis to heterogeneous domains while respecting strict privacy constraints (Gong et al., 10 Jan 2026).