Decoupled Vertical Federated Learning for Practical Training on Vertically Partitioned Data (2403.03871v2)
Abstract: Vertical Federated Learning (VFL) is an emergent distributed machine learning paradigm for collaborative learning between clients who hold disjoint features of common entities. However, standard VFL lacks fault tolerance: each participant and each connection is a single point of failure. Prior attempts to induce fault tolerance in VFL focus on the scenario of "straggling clients", usually assuming that all messages eventually arrive or that there is an upper bound on the number of late messages. To handle the more general problem of arbitrary crashes, we propose Decoupled VFL (DVFL). To train despite faults, DVFL decouples training between communication rounds using local unsupervised objectives. By further decoupling label supervision from aggregation, DVFL also enables redundant aggregators. As secondary benefits, DVFL can enhance data efficiency, and it provides immunity against gradient-based attacks. In this work, we implement DVFL for split neural networks with a self-supervised autoencoder loss. When there are faults, DVFL outperforms the best VFL-based alternative (97.58% vs 96.95% on an MNIST task). Even under perfect conditions, performance is comparable.
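The decoupling idea in the abstract can be illustrated with a toy sketch. It is not the paper's implementation: a tied-weight linear autoencoder with a single hidden unit stands in for the split neural network, and the data, the client names `A` and `B`, the `crashed` fault model, and all function names are hypothetical. Each client minimizes its own reconstruction loss on its own feature slice, so a crash at one client does not change what any other client learns.

```python
import random

def reconstruction_loss(w, rows):
    """Mean squared error of the tied-weight reconstruction x_hat = (w . x) * w."""
    total = 0.0
    for x in rows:
        z = sum(wi * xi for wi, xi in zip(w, x))
        total += sum((z * wi - xi) ** 2 for wi, xi in zip(w, x))
    return total / len(rows)

def local_step(w, rows, lr):
    """One gradient-descent step on the local loss; uses no labels and no remote gradients."""
    grad = [0.0] * len(w)
    for x in rows:
        z = sum(wi * xi for wi, xi in zip(w, x))       # encoding
        r = [z * wi - xi for wi, xi in zip(w, x)]      # reconstruction residual
        rw = sum(ri * wi for ri, wi in zip(r, w))
        for i in range(len(w)):
            grad[i] += 2.0 * (z * r[i] + rw * x[i])
    return [wi - lr * gi / len(rows) for wi, gi in zip(w, grad)]

def train_clients(client_rows, rounds, crashed, lr=0.02):
    """crashed maps a client name to the set of rounds in which it is down (assumed fault model)."""
    weights = {name: [0.1] * len(rows[0]) for name, rows in client_rows.items()}
    for t in range(rounds):
        for name, rows in client_rows.items():
            if t in crashed.get(name, set()):
                continue  # a crashed client misses the round; no other client waits for it
            weights[name] = local_step(weights[name], rows, lr)
    return weights

def make_data(n=50, seed=0):
    """Synthetic entities with four features; purely illustrative."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        t, u = rng.gauss(0, 1), rng.gauss(0, 1)
        rows.append([t, 0.5 * t + 0.1 * rng.gauss(0, 1),
                     u, 0.8 * u + 0.1 * rng.gauss(0, 1)])
    return rows

rows = make_data()
# Vertical partition: client A holds features 0-1, client B holds features 2-3.
clients = {"A": [r[:2] for r in rows], "B": [r[2:] for r in rows]}
no_faults = train_clients(clients, rounds=200, crashed={})
with_faults = train_clients(clients, rounds=200, crashed={"B": set(range(20, 60))})
```

In this sketch, client A ends with the same weights in both runs, because its updates never depend on messages from client B. A full pipeline would additionally send the learned encodings to one or more aggregators and train a labeled head on top of them, which is omitted here.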