Federated Data Analytics: A Study on Linear Models (2206.07786v1)
Abstract: As edge devices become increasingly powerful, data analytics are gradually moving from a centralized to a decentralized regime in which edge compute resources are exploited to process more of the data locally. This regime is termed federated data analytics (FDA). Despite the recent success stories of FDA, most of the literature focuses exclusively on deep neural networks. In this work, we take a step back to develop an FDA treatment for one of the most fundamental statistical models: linear regression. Our treatment is built upon hierarchical modeling, which allows borrowing strength across multiple groups. To this end, we propose two federated hierarchical model structures that provide a shared representation across devices to facilitate information sharing. Notably, our proposed frameworks are capable of providing uncertainty quantification, variable selection, hypothesis testing, and fast adaptation to new unseen data. We validate our methods on a range of real-life applications, including condition monitoring for aircraft engines. The results show that our FDA treatment for linear models can serve as a competitive benchmark model for the future development of federated algorithms.
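To make "borrowing strength across multiple groups" concrete, the following is a minimal sketch and not the paper's algorithm. It assumes each device shares only the sufficient statistics X^T X and X^T y, that a server pools them into a shared coefficient vector, and that each device then shrinks its own fit toward that shared vector with a Gaussian prior. The function names, the shrinkage weight `lam`, and the simulated data are all hypothetical.

```python
import numpy as np

def local_stats(X, y):
    """Sufficient statistics a device could share instead of its raw data."""
    return X.T @ X, X.T @ y

def global_estimate(stats, ridge=1e-8):
    """Server side: sum the shared statistics and solve the pooled normal equations."""
    A = sum(s[0] for s in stats)
    b = sum(s[1] for s in stats)
    return np.linalg.solve(A + ridge * np.eye(A.shape[0]), b)

def personalized_estimate(X, y, beta_global, lam):
    """Device side: MAP estimate under the assumed prior
    beta_k ~ N(beta_global, (sigma^2 / lam) * I); larger lam means more shrinkage."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y + lam * beta_global)

# Simulated devices with very different sample sizes (illustrative data only).
rng = np.random.default_rng(0)
beta_shared = np.array([1.0, -2.0, 0.5])
devices = []
for n in (5, 40, 200):
    beta_k = beta_shared + 0.1 * rng.normal(size=3)  # device-specific deviation
    X = rng.normal(size=(n, 3))
    y = X @ beta_k + 0.5 * rng.normal(size=n)
    devices.append((X, y))

stats = [local_stats(X, y) for X, y in devices]
beta_g = global_estimate(stats)
betas = [personalized_estimate(X, y, beta_g, lam=5.0) for X, y in devices]
```

Under these assumptions, a device with few observations ends up close to the shared vector, while a data-rich device stays close to its own least-squares fit. Setting `lam=0` recovers the purely local fit, and a very large `lam` recovers the shared one.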