Differentially Private Neural Tangent Kernels for Privacy-Preserving Data Generation (2303.01687v2)

Published 3 Mar 2023 in cs.LG, cs.CR, and cs.CV

Abstract: Maximum mean discrepancy (MMD) is a particularly useful distance metric for differentially private data generation: when used with finite-dimensional features it allows us to summarize and privatize the data distribution once, which we can repeatedly use during generator training without further privacy loss. An important question in this framework is, then, what features are useful to distinguish between real and synthetic data distributions, and whether those enable us to generate quality synthetic data. This work considers using the features of $\textit{neural tangent kernels (NTKs)}$, more precisely $\textit{empirical}$ NTKs (e-NTKs). We find that, perhaps surprisingly, the expressiveness of the untrained e-NTK features is comparable to that of perceptual features taken from networks pre-trained on public data. As a result, our method improves the privacy-accuracy trade-off compared to other state-of-the-art methods, without relying on any public data, as demonstrated on several tabular and image benchmark datasets.
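The pipeline described in the abstract can be sketched concretely: embed the private data with a fixed, finite-dimensional feature map (here, empirical-NTK features of an untrained network), privatize the mean embedding once with the Gaussian mechanism, and then train a generator against that noisy target, which is post-processing and incurs no further privacy loss. The code below is an illustrative sketch in PyTorch, not the authors' implementation: the `FeatureNet` architecture, the unit-norm feature normalization, the `2/n` sensitivity bound, and the noise scale `sigma` are assumptions made for the example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class FeatureNet(nn.Module):
    """Small *untrained* network; its parameter gradients define the e-NTK features."""
    def __init__(self, d_in=10, d_hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_in, d_hidden), nn.ReLU(), nn.Linear(d_hidden, 1)
        )

    def forward(self, x):
        return self.net(x)

def entk_features(model, x, create_graph=False):
    """Per-example e-NTK feature: gradient of the scalar output w.r.t. all parameters."""
    feats = []
    for xi in x:
        out = model(xi.unsqueeze(0)).squeeze()
        grads = torch.autograd.grad(out, list(model.parameters()), create_graph=create_graph)
        phi = torch.cat([g.reshape(-1) for g in grads])
        feats.append(phi / phi.norm())  # unit-norm features give a bounded sensitivity
    return torch.stack(feats)

# Privatize the real-data mean embedding ONCE with the Gaussian mechanism.
n, d_in = 500, 10
real_x = torch.randn(n, d_in)    # stand-in for the private dataset
feat_net = FeatureNet(d_in)      # random initialization, never trained
mu_real = entk_features(feat_net, real_x).mean(0)

sigma = 1.0                      # noise multiplier calibrated to the (eps, delta) budget (assumed)
sensitivity = 2.0 / n            # L2 sensitivity of the mean of unit-norm features (replace-one neighbours)
mu_dp = mu_real + sensitivity * sigma * torch.randn_like(mu_real)

# Generator training only touches mu_dp, so it is post-processing: no further privacy loss.
gen = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, d_in))
opt = torch.optim.Adam(gen.parameters(), lr=1e-3)
for step in range(200):
    z = torch.randn(64, 16)
    mu_fake = entk_features(feat_net, gen(z), create_graph=True).mean(0)
    loss = ((mu_dp - mu_fake) ** 2).sum()   # squared MMD with finite-dimensional features
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because the generator only ever sees the noisy embedding `mu_dp`, every training step consumes no additional privacy budget, which is the property the abstract highlights for finite-dimensional MMD features.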
