Towards more sustainable enterprise data and application management with cross silo Federated Learning and Analytics (2312.14628v1)
Abstract: To comply with new legal requirements and policies committed to privacy protection, more and more companies are starting to deploy cross-silo Federated Learning at global scale, where several clients (silos) collaboratively train a global model under the coordination of a central server. Instead of sharing and transmitting data, clients train models on their private local data and exchange model updates. However, the carbon emission impact of cross-silo Federated Learning is poorly understood because few related works exist. In this study, we first analyze the sustainability of cross-silo Federated Learning across the whole AI product life cycle, rather than focusing only on model training, and compare it with the centralized approach. We propose a more holistic quantitative method for estimating cost and CO2 emissions in real-world cross-silo Federated Learning settings. Second, we propose a novel data and application management system that uses cross-silo Federated Learning and analytics to make IT companies more sustainable and cost-effective.