- The paper introduces a novel system in which each client discretizes its model update and adds discrete Gaussian noise before secure aggregation, so that no individual contribution is ever exposed.
- The authors provide a rigorous differential privacy analysis of sums of discrete Gaussians, showing accuracy comparable to the central continuous Gaussian mechanism with fewer than 16 bits per value.
- Experiments on distributed mean estimation and federated learning demonstrate a communication-efficient approach that scales to large datasets while ensuring strong privacy.
Overview of the Distributed Discrete Gaussian Mechanism for Federated Learning with Secure Aggregation
This paper presents an end-to-end system that combines federated learning (FL) with differential privacy (DP) and secure aggregation to protect user data when training models on data distributed across user devices. At its core is the Distributed Discrete Gaussian (DDG) mechanism, which navigates the trade-offs among communication, privacy, and accuracy in federated learning scenarios.
Key Contributions
- Federated Learning System Design: The paper proposes a system in which each client discretizes its model update and adds discrete Gaussian noise before submitting it through secure aggregation. The server therefore only ever sees a noisy modular sum of the updates and cannot inspect any individual client's contribution (a minimal client-side sketch follows this list).
- Privacy Guarantee: Leveraging discrete Gaussian noise, the authors provide a novel privacy analysis for sums of discrete Gaussians. A key subtlety is that the sum of independent discrete Gaussians is not itself a discrete Gaussian, so the analysis must bound how far the aggregate noise deviates from one; it also accounts for the interplay between noise addition and the discretization of client updates (see the formulas after this list). The result is that the system meets stringent differential privacy requirements without leaking sensitive information.
- Communication Efficiency: Because clients communicate integers in a finite ring, the system significantly reduces the number of bits required per value while maintaining high accuracy: a ring of size 2^b costs b bits per coordinate, and the experiments show the solution matches the accuracy of central differential privacy with fewer than 16 bits of precision per value (see the server-side decoding sketch after this list).
- Experimental Validation: Extensive experiments on distributed mean estimation and on federated learning with Federated EMNIST and Stack Overflow demonstrate the system's effectiveness. The results show that the DDG mechanism can nearly match the performance of the continuous central Gaussian mechanism under varying communication and privacy constraints.
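To make the client-side pipeline concrete, here is a minimal sketch in Python/NumPy. The function and parameter names (`gamma` for the quantization granularity, `sigma` for the per-client noise scale) are hypothetical, and the paper's full pipeline also includes a randomized Hadamard rotation and norm clipping, omitted here for brevity; the sampler follows the rejection scheme of Canonne, Kamath, and Steinke (2020).

```python
import numpy as np

def sample_discrete_gaussian(sigma, n, rng):
    """Draw n samples from the discrete Gaussian N_Z(0, sigma^2) by rejection
    sampling from a discrete Laplace proposal (Canonne-Kamath-Steinke, 2020)."""
    t = int(np.floor(sigma)) + 1
    p = 1.0 - np.exp(-1.0 / t)
    out = np.empty(n, dtype=np.int64)
    filled = 0
    while filled < n:
        m = n - filled
        # Discrete Laplace(t) as the difference of two i.i.d. geometrics.
        y = (rng.geometric(p, m) - 1) - (rng.geometric(p, m) - 1)
        # Accept with probability exp(-(|y| - sigma^2/t)^2 / (2 sigma^2)).
        accept = rng.random(m) < np.exp(
            -((np.abs(y) - sigma**2 / t) ** 2) / (2.0 * sigma**2))
        keep = y[accept]
        out[filled:filled + keep.size] = keep
        filled += keep.size
    return out

def ddg_client_encode(update, gamma, sigma, modulus, rng):
    """One client's step: scale by the granularity gamma, apply unbiased
    stochastic rounding, add discrete Gaussian noise (scale sigma/gamma in
    integer units), then reduce into the secure-aggregation ring."""
    scaled = np.ravel(update) / gamma
    lo = np.floor(scaled)
    rounded = (lo + (rng.random(scaled.size) < scaled - lo)).astype(np.int64)
    noised = rounded + sample_discrete_gaussian(sigma / gamma, scaled.size, rng)
    return np.mod(noised, modulus)  # what the client hands to secure aggregation
```

Because each client adds only a fraction of the total noise, the aggregate noise after summation approaches the level a trusted central server would have injected, which is the core idea of distributed DP.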
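For reference, the discrete Gaussian underlying the mechanism (introduced for DP by Canonne, Kamath, and Steinke, 2020) is supported on the integers and enjoys the same concentrated-DP guarantee as its continuous counterpart; the multi-client analysis in this paper adds correction terms because, as noted above, a sum of independent discrete Gaussians is only approximately a discrete Gaussian:

```latex
% PMF of the discrete Gaussian N_Z(mu, sigma^2), supported on the integers:
\Pr[X = x] \;=\; \frac{e^{-(x - \mu)^2 / (2\sigma^2)}}
                      {\sum_{y \in \mathbb{Z}} e^{-(y - \mu)^2 / (2\sigma^2)}},
\qquad x \in \mathbb{Z}.

% Adding N_Z(0, sigma^2) noise to an integer-valued query of sensitivity Delta
% satisfies rho-concentrated differential privacy with
\rho \;=\; \frac{\Delta^2}{2\sigma^2}.
```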
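On the server side, secure aggregation returns the coordinate-wise sum of the clients' ring elements modulo the same modulus; mapping that modular sum back to signed integers and rescaling recovers a noisy estimate of the true sum. A sketch, continuing the hypothetical names above:

```python
def ddg_server_decode(secagg_sum, gamma, modulus):
    """Map the modular SecAgg output back to signed integers centered at
    zero, then rescale by gamma to recover a real-valued noisy sum."""
    centered = np.where(secagg_sum >= modulus // 2, secagg_sum - modulus, secagg_sum)
    return gamma * centered
```

With a b-bit ring (`modulus = 2**b`), each coordinate costs b bits, and the paper's headline result is that b < 16 already matches central-DP accuracy. Note that modular wraparound can clip extreme values, which is one reason the analysis ties the modulus, granularity, and clipping norm together.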
Implications and Future Directions
- Practical Impact on FL: The integration of distributed DP and secure aggregation showcases a path toward implementing privacy-preserving machine learning in real-world applications, especially when data is highly sensitive.
- Scalability and Flexibility: The proposed system, due to its efficient communication and computation requirements, has the potential to scale to large models and datasets while adhering to strong privacy guarantees.
- Open Questions: The paper leaves several avenues open for future research, such as tightening privacy amplification analyses, exploring transforms other than the randomized Hadamard transform (whose power-of-two dimension requirement forces zero-padding), and developing adaptive algorithms that tune system parameters dynamically.
In summary, this work provides a significant contribution to the intersection of federated learning and differential privacy, offering a solution that balances privacy with practical utility and efficiency. The introduction of discrete Gaussian mechanisms within a secure aggregation framework in FL represents an important step toward deploying AI systems that respect user privacy without sacrificing performance.