
The Fundamental Limits of Least-Privilege Learning

Published 19 Feb 2024 in cs.LG and cs.CR (arXiv:2402.12235v2)

Abstract: The promise of least-privilege learning -- to find feature representations that are useful for a learning task but prevent inference of any sensitive information unrelated to this task -- is highly appealing. However, so far this concept has only been stated informally. It thus remains an open question whether and how we can achieve this goal. In this work, we provide the first formalisation of the least-privilege principle for machine learning and characterise its feasibility. We prove that there is a fundamental trade-off between a representation's utility for a given task and its leakage beyond the intended task: it is not possible to learn representations that have high utility for the intended task but, at the same time, prevent inference of any attribute other than the task label itself. This trade-off holds under realistic assumptions on the data distribution and regardless of the technique used to learn the feature mappings that produce these representations. We empirically validate this result for a wide range of learning techniques, model architectures, and datasets.
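The intuition behind the trade-off can be illustrated with a toy sketch (not the paper's construction): whenever a sensitive attribute S is statistically correlated with the task label Y, any representation that is useful for predicting Y necessarily carries information about S as well. All names, distributions, and the linear "feature mapping" below are illustrative assumptions, not the authors' method.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Task label Y, and a sensitive attribute S correlated with it:
# here P[S = Y] = 0.8 by construction.
y = rng.integers(0, 2, n)
s = np.where(rng.random(n) < 0.8, y, 1 - y)

# Raw features: one task-relevant direction plus pure noise.
x = np.column_stack([
    y + 0.5 * rng.standard_normal(n),  # driven by the task label
    rng.standard_normal(n),            # independent noise
])

# A learned representation: a 1-D linear projection fit for the
# task via least squares (a stand-in for any feature mapping).
w, *_ = np.linalg.lstsq(x, y.astype(float), rcond=None)
z = x @ w  # the representation that would be released

# Task utility: threshold the representation to predict Y.
task_acc = np.mean((z > 0.5) == y)

# Adversary: applies the *same* decision rule to infer S.
adv_acc = np.mean((z > 0.5) == s)

print(f"task accuracy:        {task_acc:.2f}")
print(f"S-inference accuracy: {adv_acc:.2f}  (chance = 0.50)")
```

Because S and Y are correlated, any representation accurate for Y lets the adversary infer S well above chance; this is the behaviour the paper's formal result shows cannot be avoided for attributes beyond the task label.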

