Enhancing Learning with Label Differential Privacy by Vector Approximation (2405.15150v1)

Published 24 May 2024 in cs.LG

Abstract: Label differential privacy (DP) is a framework that protects the privacy of labels in training datasets while the feature vectors are public. Existing approaches protect label privacy by flipping labels randomly and then training a model whose output approximates the privatized label. However, as the number of classes $K$ increases, stronger randomization is needed, and the performance of these methods degrades significantly. In this paper, we propose a vector approximation approach that is easy to implement and introduces little additional computational overhead. Instead of flipping each label into a single scalar, our method converts each label into a random vector with $K$ components whose expectations reflect the class conditional probabilities. Intuitively, vector approximation retains more information than scalar labels. A brief theoretical analysis shows that the performance of our method decays only slightly with $K$. Finally, experiments on both synthetic and real datasets validate our theoretical analysis as well as the practical performance of our method.
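The paper's exact randomizer is not reproduced on this page, so the sketch below only illustrates the contrast the abstract draws, under stated assumptions. It shows a classical K-ary randomized response (the "scalar flipping" baseline) next to a hypothetical component-wise randomizer that releases a K-dimensional vector whose debiased expectation equals the one-hot label. The function names, the eps/2-per-changed-component privacy accounting, and the specific flipping probabilities are illustrative assumptions, not the paper's mechanism.

```python
import numpy as np

def scalar_randomized_response(y, K, eps, rng=None):
    # Classical K-ary randomized response (the "scalar flipping" baseline):
    # keep the true label with probability e^eps / (e^eps + K - 1), otherwise
    # output one of the other K - 1 classes uniformly. Satisfies eps-label-DP,
    # but the keep probability shrinks quickly as K grows.
    rng = np.random.default_rng() if rng is None else rng
    p_keep = np.exp(eps) / (np.exp(eps) + K - 1)
    if rng.random() < p_keep:
        return y
    others = [c for c in range(K) if c != y]
    return int(rng.choice(others))

def vector_randomizer(y, K, eps, rng=None):
    # Illustrative vector-valued randomizer (an assumption, not the paper's
    # exact scheme): perturb the one-hot encoding of y component-wise with
    # binary randomized response at level eps/2. Changing the label changes
    # exactly two components, so the released vector satisfies eps-label-DP.
    # The debiasing step makes the expectation of the output equal the
    # one-hot label, matching the abstract's statement that the components'
    # expectations reflect class membership.
    rng = np.random.default_rng() if rng is None else rng
    p = np.exp(eps / 2) / (1 + np.exp(eps / 2))  # prob. a component keeps its bit
    onehot = np.zeros(K)
    onehot[y] = 1.0
    keep = rng.random(K) < p
    z = np.where(keep, onehot, 1.0 - onehot)     # flip the components not kept
    return (z - (1 - p)) / (2 * p - 1)           # debias: E[output] = onehot

# Example with K = 10 classes and eps = 1.
rng = np.random.default_rng(0)
print(scalar_randomized_response(3, K=10, eps=1.0, rng=rng))
print(vector_randomizer(3, K=10, eps=1.0, rng=rng).round(2))
```

Note how the scalar baseline's keep probability shrinks as K grows, while the vector release keeps a per-component signal independent of K, which is the intuition the abstract appeals to.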
