Towards Achieving Near-optimal Utility for Privacy-Preserving Federated Learning via Data Generation and Parameter Distortion (2305.04288v3)
Abstract: Federated learning (FL) enables participating parties to collaboratively build a global model with boosted utility without disclosing private data. Appropriate protection mechanisms must be adopted to satisfy the requirements of preserving \textit{privacy} while maintaining high model \textit{utility}. The widely adopted protection mechanisms, including the \textit{Randomization Mechanism} and the \textit{Compression Mechanism}, protect privacy by distorting model parameters. We measure utility via the gap between the original and the distorted model parameters, and we identify the general conditions under which privacy-preserving federated learning can achieve near-optimal utility via data generation and parameter distortion. To provide an avenue toward near-optimal utility, we present an upper bound on the utility loss, expressed in terms of two quantities: variance reduction and model parameter discrepancy. Our analysis inspires the design of protection parameters that achieve near-optimal utility while simultaneously meeting the privacy requirements. The main techniques underlying the protection mechanism, parameter distortion and data generation, are generic and can be applied extensively. Furthermore, we provide an upper bound on the trade-off between privacy and utility, which, together with the lower bound given by the no-free-lunch theorem for federated learning (\cite{zhang2022no}), forms the conditions for achieving an optimal trade-off.
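To make the two mechanism families and the utility measure concrete, here is a minimal sketch (not the paper's implementation; function names, the Gaussian noise choice, and the Euclidean gap are illustrative assumptions) of a randomization mechanism, a top-$k$ compression mechanism, and the parameter-gap utility proxy:

```python
import numpy as np

def randomize(params: np.ndarray, sigma: float, seed: int = 0) -> np.ndarray:
    """Randomization mechanism (illustrative): add i.i.d. Gaussian noise
    of scale sigma to every model parameter."""
    rng = np.random.default_rng(seed)
    return params + rng.normal(0.0, sigma, size=params.shape)

def compress(params: np.ndarray, k: int) -> np.ndarray:
    """Compression mechanism (illustrative): keep the k largest-magnitude
    entries and zero out the rest (top-k sparsification)."""
    out = np.zeros_like(params)
    idx = np.argsort(np.abs(params))[-k:]
    out[idx] = params[idx]
    return out

def utility_gap(original: np.ndarray, distorted: np.ndarray) -> float:
    """Utility-loss proxy: Euclidean distance between the original and
    the distorted model parameters."""
    return float(np.linalg.norm(original - distorted))
```

Both mechanisms return a distorted parameter vector, so the same `utility_gap` applies to either; tuning `sigma` or `k` trades privacy (more distortion) against utility (smaller gap), which is the trade-off the bounds in the paper characterize.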
- Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC conference on computer and communications security. ACM, New York, NY, USA, 308–318.
- G.R. Blakley. 1979. Safeguarding cryptographic keys. In Proceedings of the 1979 AFIPS National Computer Conference. AFIPS Press, Monval, NJ, USA, 313–317.
- Practical secure aggregation for privacy-preserving machine learning. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. 1175–1191.
- Mitigating Data Heterogeneity in Federated Learning with Data Augmentation. arXiv preprint arXiv:2206.09979 (2022).
- Flávio du Pin Calmon and Nadia Fawaz. 2012. Privacy against statistical inference. In 2012 50th annual Allerton conference on communication, control, and computing (Allerton). IEEE, 1401–1408.
- Local privacy, data processing inequalities, and minimax rates. arXiv preprint arXiv:1302.3203 (2013).
- Inverting Gradients–How easy is it to break privacy in federated learning? arXiv preprint arXiv:2003.14053 (2020).
- Craig Gentry. 2009. A fully homomorphic encryption scheme. Ph.D. Dissertation, Stanford University.
- Differentially private federated learning: A client level perspective. arXiv preprint arXiv:1712.07557 (2017).
- Federated Deep Learning with Bayesian Privacy. arXiv preprint arXiv:2109.13012 (2021).
- Otkrist Gupta and Ramesh Raskar. 2018. Distributed learning of deep neural network over multiple agents. Journal of Network and Computer Applications 116 (2018), 1–8.
- Model inversion attacks against collaborative inference. In Proceedings of the 35th Annual Computer Security Applications Conference. 148–162.
- Enhanced security and privacy via fragmented federated learning. IEEE Transactions on Neural Networks and Learning Systems (2022).
- Privacy-preserving federated adversarial domain adaption over feature groups for interpretability. arXiv preprint arXiv:2111.10934 (2021).
- Yigitcan Kaya and Tudor Dumitras. 2021. When Does Data Augmentation Help With Membership Inference Attacks?. In International conference on machine learning. PMLR, 5345–5355.
- Federated optimization: Distributed machine learning for on-device intelligence. arXiv preprint arXiv:1610.02527 (2016).
- Federated learning: Strategies for improving communication efficiency. arXiv preprint arXiv:1610.05492 (2016).
- Ali Makhdoumi and Nadia Fawaz. 2013. Privacy-utility tradeoff under statistical uncertainty. In 2013 51st Annual Allerton Conference on Communication, Control, and Computing (Allerton). IEEE, 1627–1634.
- Communication-efficient learning of deep networks from decentralized data. In Artificial Intelligence and Statistics. PMLR, 1273–1282.
- Federated learning of deep networks using model averaging. arXiv preprint arXiv:1602.05629 (2016).
- Fast federated learning by balancing communication trade-offs. IEEE Transactions on Communications 69, 8 (2021), 5168–5182.
- Irving S. Reed. 1973. Information theory and privacy in data banks. In Proceedings of the June 4-8, 1973, National Computer Conference and Exposition. 581–587.
- Sina Sajadmanesh and Daniel Gatica-Perez. 2021. Locally private graph neural networks. In Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security. 2130–2145.
- Utility-privacy tradeoffs in databases: An information-theoretic approach. IEEE Transactions on Information Forensics and Security 8, 6 (2013), 838–852.
- Adi Shamir. 1979. How to Share a Secret. Commun. ACM 22, 11 (Nov. 1979), 612–613. https://doi.org/10.1145/359168.359176
- LDP-Fed: Federated learning with local differential privacy. In Proceedings of the Third ACM International Workshop on Edge Systems, Analytics and Networking. 61–66.
- Hirosuke Yamamoto. 1983. A source coding problem for sources with additional outputs to keep secret from the receiver or wiretappers (corresp.). IEEE Transactions on Information Theory 29, 6 (1983), 918–923.
- Federated machine learning: Concept and applications. ACM Transactions on Intelligent Systems and Technology (TIST) 10, 2 (2019), 1–19.
- Federated learning. Synthesis Lectures on Artificial Intelligence and Machine Learning 13, 3 (2019), 1–207.
- See through Gradients: Image Batch Recovery via GradInversion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 16337–16346.
- BatchCrypt: Efficient Homomorphic Encryption for Cross-Silo Federated Learning. In 2020 USENIX Annual Technical Conference (USENIX ATC 20). USENIX Association, 493–506. https://www.usenix.org/conference/atc20/presentation/zhang-chengliang
- A Game-theoretic Framework for Federated Learning. arXiv preprint arXiv:2304.05836 (2023).
- No free lunch theorem for security and utility in federated learning. arXiv preprint arXiv:2203.05816 (2022).
- Probably approximately correct federated learning. arXiv preprint arXiv:2304.04641 (2023).
- Trading Off Privacy, Utility, and Efficiency in Federated Learning. ACM Transactions on Intelligent Systems and Technology 14, 6 (2023), 1–32.
- Theoretically Principled Federated Learning for Balancing Privacy and Utility. arXiv preprint arXiv:2305.15148 (2023).
- iDLG: Improved Deep Leakage from Gradients. arXiv preprint arXiv:2001.02610 (2020).
- Ligeng Zhu and Song Han. 2020. Deep leakage from gradients. In Federated Learning. Springer, 17–31.
- Ligeng Zhu, Zhijian Liu, and Song Han. 2019. Deep Leakage from Gradients. In Annual Conference on Neural Information Processing Systems (NeurIPS).