DP-RDM: Adapting Diffusion Models to Private Domains Without Fine-Tuning (2403.14421v3)

Published 21 Mar 2024 in cs.LG, cs.CV, and cs.CR

Abstract: Text-to-image diffusion models have been shown to suffer from sample-level memorization, possibly reproducing near-perfect replicas of images that they are trained on, which may be undesirable. To remedy this issue, we develop the first differentially private (DP) retrieval-augmented generation algorithm that is capable of generating high-quality image samples while providing provable privacy guarantees. Specifically, we assume access to a text-to-image diffusion model trained on a small amount of public data, and design a DP retrieval mechanism to augment the text prompt with samples retrieved from a private retrieval dataset. Our \emph{differentially private retrieval-augmented diffusion model} (DP-RDM) requires no fine-tuning on the retrieval dataset to adapt to another domain, and can use state-of-the-art generative models to generate high-quality image samples while satisfying rigorous DP guarantees. For instance, when evaluated on MS-COCO, our DP-RDM can generate samples with a privacy budget of $\epsilon=10$, while providing a $3.5$ point improvement in FID compared to public-only retrieval for up to $10,000$ queries.
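
The core idea described in the abstract is a retrieval step over a private image collection whose output is privatized before it ever reaches the publicly trained diffusion model. The sketch below is a minimal illustration of that idea, assuming the private index stores bounded-norm image embeddings and that privacy comes from a Gaussian mechanism applied to the averaged top-k neighbors; the function name `dp_retrieve`, the noise calibration, and the sensitivity handling are illustrative assumptions, not the paper's exact algorithm or privacy accounting.

```python
import numpy as np

def dp_retrieve(query_emb, private_embs, k=8, clip_norm=1.0, sigma=0.5, rng=None):
    """Hypothetical DP retrieval step (illustration only).

    Averages the k nearest private embeddings and adds Gaussian noise.
    The noise scale sigma * clip_norm / k is an illustrative calibration;
    the paper's actual sensitivity analysis and accounting differ.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Bound each private item's contribution by clipping its L2 norm.
    norms = np.linalg.norm(private_embs, axis=1, keepdims=True)
    clipped = private_embs * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))

    # Nearest neighbours by inner product (a FAISS index would replace this at scale).
    scores = clipped @ query_emb
    topk = np.argsort(-scores)[:k]

    # Gaussian mechanism on the aggregate: the diffusion model only ever
    # sees this noisy average, never the raw private embeddings.
    avg = clipped[topk].mean(axis=0)
    return avg + rng.normal(scale=sigma * clip_norm / k, size=avg.shape)


# Example usage with random stand-ins for CLIP-style embeddings.
rng = np.random.default_rng(0)
private_embs = rng.normal(size=(1000, 512))
query_emb = rng.normal(size=512)
cond = dp_retrieve(query_emb, private_embs, k=8, sigma=0.5, rng=rng)
print(cond.shape)  # (512,) -- extra conditioning vector for the generator
```

Because each generation query touches the private index, every query consumes privacy budget, which is why the abstract quotes its guarantee for a fixed query volume (up to $10,000$ queries at $\epsilon=10$) rather than an unlimited stream.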
