Training generative models from privatized data (2306.09547v2)
Abstract: Local differential privacy is a powerful method for privacy-preserving data collection. In this paper, we develop a framework for training Generative Adversarial Networks (GANs) on differentially privatized data. We show that entropic regularization of optimal transport - a popular regularization method in the literature that has often been leveraged for its computational benefits - enables the generator to learn the raw (unprivatized) data distribution even though it only has access to privatized samples. We prove that at the same time this leads to fast statistical convergence at the parametric rate. This shows that entropic regularization of optimal transport uniquely enables the mitigation of both the effects of privatization noise and the curse of dimensionality in statistical convergence. We provide experimental evidence to support the efficacy of our framework in practice.
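The two ingredients the abstract combines, locally privatized samples and an entropically regularized optimal-transport loss, can be sketched in a few lines. The Gaussian data, the Laplace-mechanism parameters, and the plain-NumPy Sinkhorn loop below are illustrative assumptions for a toy 1-D setting, not the paper's actual framework or implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Raw samples from the true data distribution (a 1-D Gaussian here,
# purely for illustration).
x = rng.normal(0.0, 1.0, size=(200, 1))

# Local differential privacy: each sample is perturbed with Laplace
# noise before collection (the Laplace mechanism; eps and the
# sensitivity bound are assumed values).
eps, sensitivity = 1.0, 1.0
y = x + rng.laplace(0.0, sensitivity / eps, size=x.shape)

# Stand-in for a batch of generator outputs G(z).
g = rng.normal(0.0, 1.0, size=(200, 1))

def sinkhorn_cost(a_pts, b_pts, reg=0.5, n_iter=200):
    """Entropy-regularized OT cost between two empirical measures,
    computed with plain Sinkhorn iterations on the squared-Euclidean
    ground cost."""
    n, m = len(a_pts), len(b_pts)
    a = np.full(n, 1.0 / n)          # uniform source weights
    b = np.full(m, 1.0 / m)          # uniform target weights
    C = np.sum((a_pts[:, None, :] - b_pts[None, :, :]) ** 2, axis=-1)
    K = np.exp(-C / reg)             # Gibbs kernel
    u = np.ones(n)
    for _ in range(n_iter):          # alternate marginal projections
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]  # entropic optimal coupling
    return float(np.sum(P * C)), P

# Loss the generator would minimize: entropic OT between its samples
# and the privatized (noisy) observations.
cost, P = sinkhorn_cost(g, y)
```

In a full training loop this scalar cost (or a debiased Sinkhorn divergence) would be backpropagated through the generator; the point of the paper's analysis is that, with the regularization level matched to the privatization noise, minimizing this objective recovers the raw rather than the noisy distribution.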