Cross-Silo Federated Learning Across Divergent Domains with Iterative Parameter Alignment
Abstract: Learning from the collective knowledge of data dispersed across private sources can provide neural networks with enhanced generalization capabilities. Federated learning, a method for collaboratively training a machine learning model across remote clients, achieves this by combining client models via the orchestration of a central server. However, current approaches face two critical limitations: (i) they struggle to converge when client domains are sufficiently different, and (ii) current aggregation techniques produce an identical global model for each client. In this work, we address these issues by reformulating the typical federated learning setup: rather than learning a single global model, we learn N models, each optimized for a common objective. To achieve this, we apply a weighted distance minimization to model parameters shared in a peer-to-peer topology. The resulting framework, Iterative Parameter Alignment, applies naturally to the cross-silo setting and has the following properties: (i) a unique solution for each participant, with the option to globally converge each model in the federation, and (ii) an optional early-stopping mechanism to elicit fairness among peers in collaborative learning settings. Together, these characteristics provide a flexible new framework for iteratively learning from peer models trained on disparate datasets. We find that the technique achieves competitive results on a variety of data partitions compared to state-of-the-art approaches. Further, we show that the method is robust to divergent domains (i.e., disjoint classes across peers) where existing approaches struggle.
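The core idea described above — each peer optimizing its own task loss plus a weighted distance to the other peers' parameters — can be sketched in a few lines. This is a minimal illustrative example, not the authors' implementation: the linear model, the function names (`alignment_penalty`, `local_step`), the penalty coefficient `lam`, and the uniform peer weights are all assumptions chosen for clarity.

```python
import numpy as np

def alignment_penalty(theta, peer_thetas, weights):
    """Weighted squared L2 distance from theta to each peer's parameters
    (the 'weighted distance minimization' term, per our reading)."""
    return sum(w * np.sum((theta - p) ** 2) for p, w in zip(peer_thetas, weights))

def local_step(theta, X, y, peer_thetas, weights, lr=0.1, lam=0.5):
    """One gradient step on: local MSE loss + lam * alignment penalty."""
    # Gradient of the local task loss (mean squared error of y ~ X @ theta).
    grad_task = 2 * X.T @ (X @ theta - y) / len(y)
    # Gradient of the weighted parameter-alignment term.
    grad_align = 2 * sum(w * (theta - p) for p, w in zip(peer_thetas, weights))
    return theta - lr * (grad_task + lam * grad_align)

# Toy setup: one peer trains on local data while being pulled toward
# two (hypothetical) peer models received over a peer-to-peer topology.
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 3))
true_theta = np.array([1.0, -2.0, 0.5])
y = X @ true_theta

theta = np.zeros(3)
peers = [true_theta + rng.normal(scale=0.1, size=3) for _ in range(2)]
weights = [0.5, 0.5]
for _ in range(200):  # iterative alignment rounds (local view)
    theta = local_step(theta, X, y, peers, weights)
```

Because each peer keeps its own `theta` and only shares parameters, every participant ends with a distinct model; raising `lam` pulls the models toward a common solution, while stopping the loop early (the optional early-stopping mechanism) leaves each peer closer to its local optimum.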