DP-FL: Federated Learning with Differential Privacy
- Federated Learning with Differential Privacy (DP-FL) is a framework that combines decentralized model training with rigorous privacy guarantees to protect individual data.
- It uses noise injection, gradient clipping, and privacy accounting to bound what model updates can reveal about individual data while limiting the loss in model accuracy.
- DP-FL finds critical applications in sensitive domains like healthcare, finance, and mobile computing where strict data privacy and governance are required.
Federated Learning with Differential Privacy (DP-FL) is a paradigm that fuses large-scale distributed learning with rigorous, mathematically quantifiable privacy guarantees. It enables multiple clients or organizations to collaboratively train machine learning models while bounding how much can be inferred about any individual's data, either within or across participating parties. The core technical mechanism is the integration of differential privacy (DP), through noise injection, clipping, and privacy accounting, at various stages of the federated optimization procedure. DP-FL has become foundational for privacy-conscious ML in healthcare, finance, mobile/on-device intelligence, and any setting with strict data-governance regimes.
1. Formal Foundations: Federated Learning and Differential Privacy
Federated Learning Protocol
Federated learning (FL) coordinates $K$ clients, each with their own private data $D_k$, to jointly optimize a global objective without centralizing data. At every global round $t$, a random subset $S_t$ of clients is selected. Clients download the current global model $w_t$, perform $E$ steps of local SGD on their loss $F_k$, and upload their updates $\Delta_k^t$. The server aggregates:
\[ w_{t+1} = w_t + \sum_{k \in S_t} \frac{n_k}{\sum_{j \in S_t} n_j} \, \Delta_k^t, \qquad n_k = |D_k|. \]
Standard aggregation is weighted averaging (FedAvg).
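The aggregation step can be illustrated with a minimal sketch. It assumes each client reports its update as a flat list of floats together with its local sample count, and that the server weights updates by sample count. The function name `fedavg_aggregate` and the toy values are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence


def fedavg_aggregate(
    global_model: Sequence[float],
    updates: Sequence[Sequence[float]],
    num_samples: Sequence[int],
) -> List[float]:
    """Apply a sample-weighted average of client updates to the global model."""
    if not updates or len(updates) != len(num_samples):
        raise ValueError("need exactly one sample count per client update")
    total = sum(num_samples)
    new_model = list(global_model)
    for update, n_k in zip(updates, num_samples):
        weight = n_k / total  # client's share of the selected data
        for i, delta in enumerate(update):
            new_model[i] += weight * delta
    return new_model


# Toy round: two selected clients holding 10 and 30 samples (assumed values).
w = [0.0, 1.0]
client_updates = [[1.0, -1.0], [3.0, 1.0]]
client_sizes = [10, 30]
w_next = fedavg_aggregate(w, client_updates, client_sizes)
print(w_next)  # [2.5, 1.5]
```

The client with more data dominates the average, which is the behaviour that later motivates clipping: without a bound on each update, a single client can move the global model arbitrarily far.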
Differential Privacy Guarantee
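A randomized mechanism $\mathcal{M}$ is $(\varepsilon, \delta)$-DP if for every pair of neighboring datasets $D, D'$ (differing in at most one sample or user) and all measurable $S$:
\[ \Pr[\mathcal{M}(D) \in S] \le e^{\varepsilon} \, \Pr[\mathcal{M}(D') \in S] + \delta. \]
Here $\varepsilon$ controls the worst-case privacy loss; $\delta$ allows a small failure probability. Sensitivity $\Delta_2 f = \max_{D, D'} \|f(D) - f(D')\|_2$ is key for noise calibration. The Gaussian mechanism achieves $(\varepsilon, \delta)$-DP for output $f(D) + \mathcal{N}(0, \sigma^2 I)$ with $\sigma \ge \Delta_2 f \sqrt{2 \ln(1.25/\delta)} / \varepsilon$, $\varepsilon \in (0, 1)$ (Ren et al., 2024; Sen et al., 2024).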
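As a worked illustration of noise calibration, the sketch below uses the classical Gaussian-mechanism bound, where the noise standard deviation is the L2 sensitivity times sqrt(2 ln(1.25/delta)) divided by epsilon, valid for epsilon below 1. The function names and the example parameters are assumptions made for this sketch, and tighter calibrations exist that it does not cover.

```python
import math
import random
from typing import List, Sequence


def gaussian_sigma(sensitivity: float, epsilon: float, delta: float) -> float:
    """Noise std from the classical Gaussian-mechanism bound (epsilon < 1)."""
    if not 0.0 < epsilon < 1.0:
        raise ValueError("the classical bound assumes 0 < epsilon < 1")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def gaussian_mechanism(
    values: Sequence[float],
    sensitivity: float,
    epsilon: float,
    delta: float,
    rng: random.Random,
) -> List[float]:
    """Add i.i.d. Gaussian noise to every coordinate of a query output."""
    sigma = gaussian_sigma(sensitivity, epsilon, delta)
    return [v + rng.gauss(0.0, sigma) for v in values]


# Assumed example: a query with L2 sensitivity 1 at (0.5, 1e-5)-DP.
sigma = gaussian_sigma(sensitivity=1.0, epsilon=0.5, delta=1e-5)
noisy = gaussian_mechanism([0.2, -0.4, 1.0], 1.0, 0.5, 1e-5, random.Random(0))
print(round(sigma, 2), len(noisy))
```

The noise scale grows linearly with the sensitivity and inversely with epsilon, so bounding sensitivity (in FL, by clipping each update) is what keeps the required noise manageable.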
2. Taxonomy of DP-FL Paradigms
DP-FL can be categorized by where noise is injected, what is protected, and the trust model (Ren et al., 2024, Sen et al., 2024):
| Paradigm | Noise Injection Location | Protection Granularity |
|---|---|---|
| Central DP (Server-level) | Server-side, after aggregation | Entire client/user |
| Local DP (LDP) | Client-side, pre-aggregation | Each client’s update |
| Shuffle Model | Client-side + shuffling proxy | Near-central, removes linkage |
| Secure Aggregation-based | After secure sum over clients | Matches central, server-untrusted |
- Central DP: A trusted server clips client updates and adds noise to the aggregate, hiding each client's full contribution. Random client sampling amplifies the privacy guarantee.
- Local DP: Each client privatizes its update before sending, usually leading to heavy utility loss, especially for high-dimensional models.
- Shuffle Model: Clients use small LDP noise; a shuffler permits privacy amplification by breaking source-linkage.
- Secure Aggregation-based: Clients add distributed noise; the server learns only the (noisy) sum, closely matching central DP accuracy with improved trust assumptions.
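The utility gap between central and local noise injection can be seen in a small sketch. It assumes Gaussian noise with standard deviation equal to a noise multiplier times the clipping norm in both cases, which is a simplification: real local-DP deployments calibrate noise differently, and the two settings do not give the same privacy guarantee at the same multiplier. All function names here are hypothetical.

```python
import math
import random
from typing import List, Sequence


def clip(update: Sequence[float], clip_norm: float) -> List[float]:
    """Scale an update so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def central_dp_mean(updates, clip_norm, noise_multiplier, rng) -> List[float]:
    """Server clips each update, sums them, and adds noise once to the sum."""
    clipped = [clip(u, clip_norm) for u in updates]
    std = noise_multiplier * clip_norm
    dim = len(clipped[0])
    noisy_sum = [
        sum(u[i] for u in clipped) + rng.gauss(0.0, std) for i in range(dim)
    ]
    return [s / len(updates) for s in noisy_sum]


def local_dp_mean(updates, clip_norm, noise_multiplier, rng) -> List[float]:
    """Each client clips and noises its own update; the server only averages."""
    std = noise_multiplier * clip_norm
    dim = len(updates[0])
    noisy = [
        [x + rng.gauss(0.0, std) for x in clip(u, clip_norm)] for u in updates
    ]
    return [sum(u[i] for u in noisy) / len(updates) for i in range(dim)]


rng = random.Random(0)
updates = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(100)]
print(central_dp_mean(updates, clip_norm=1.0, noise_multiplier=1.0, rng=rng))
print(local_dp_mean(updates, clip_norm=1.0, noise_multiplier=1.0, rng=rng))
```

Under these assumptions, with m clients the per-coordinate noise on the averaged update has standard deviation std/m in the central case and std/sqrt(m) in the local case. That factor of sqrt(m) is one way to see why local DP costs more utility, and why shuffling or secure aggregation aim to recover near-central accuracy without a trusted server.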
3. Optimization Principles: Mechanisms, Calibration, and Accounting
DP-FL Core Algorithm
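- DP-SGD for FL: Each client $k$ computes its local gradient $g_t^k$ and clips it to $\ell_2$ norm at most $C$, sending
\[ \bar{g}_t^k = g_t^k \cdot \min\left(1, \frac{C}{\|g_t^k\|_2}\right) \]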
Aggregated update: [ \tilde{g}t = \frac{1}{m} \sum