Secure Summation via Subset Sums: A New Primitive for Privacy-Preserving Distributed Machine Learning (1906.11993v2)
Abstract: For population studies or for the training of complex machine learning models, it is often necessary to gather data from different actors. In these applications, summation is an important primitive: for computing means, counts, or mini-batch gradients. In many cases, the data is privacy-sensitive and therefore cannot be collected on a central server. Hence, the summation needs to be performed in a distributed and privacy-preserving way. Existing solutions for distributed summation with computational privacy guarantees make trust or connection assumptions, e.g., the existence of a trusted server or peer-to-peer connections between clients, that might not be fulfilled in real-world settings. Motivated by these challenges, we propose Secure Summation via Subset Sums (S5), a method for distributed summation that remains private in the presence of a malicious server and only two honest clients, and that does not require peer-to-peer connections between clients. S5 adds zero-sum noise to clients' messages and shuffles them before sending them to the aggregating server. Our main contribution is a proof that this scheme yields a computational privacy guarantee based on the multidimensional subset sum problem. Our analysis of this problem may be of independent interest for other privacy and cryptography applications.
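To make the zero-sum-noise-and-shuffle idea concrete, the following is a minimal illustrative sketch, not the paper's exact construction: it assumes each of n clients holds a d-dimensional vector and splits it into k noisy summands whose noise cancels, after which all summands are shuffled before reaching the aggregating server. All names (make_noisy_summands, the choice of k, the Gaussian noise) are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_noisy_summands(x, k, scale=1.0):
    """Split a client's vector x into k summands with zero-sum noise,
    so that the summands add back up to x exactly (illustrative only)."""
    noise = rng.normal(0.0, scale, size=(k, x.shape[0]))
    noise -= noise.mean(axis=0)            # enforce zero-sum noise across the k summands
    return np.tile(x / k, (k, 1)) + noise  # each summand carries x/k plus noise

# Example: 3 clients, 4-dimensional data, 2 summands per client.
clients = [rng.normal(size=4) for _ in range(3)]
messages = np.concatenate([make_noisy_summands(x, k=2) for x in clients])

# Shuffle the summands before they reach the aggregating server, so the
# server cannot attribute individual summands to specific clients.
rng.shuffle(messages)

# The server only computes the overall sum; the zero-sum noise cancels
# (up to floating-point error), recovering the true sum of client vectors.
server_sum = messages.sum(axis=0)
true_sum = np.sum(clients, axis=0)
assert np.allclose(server_sum, true_sum)
```

In this toy version the server learns only noisy, unattributed summands and their total; the paper's privacy argument reduces recovering an individual client's contribution from such a view to a multidimensional subset sum problem.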