Energy-Aware Heterogeneous Federated Learning via Approximate DNN Accelerators (2402.18569v2)
Abstract: In Federated Learning (FL), the devices that participate in training usually have heterogeneous resources, e.g., differing energy availability. In current FL deployments, devices that do not fulfill certain hardware requirements are often dropped from the collaborative training. However, dropping devices in FL can degrade training accuracy and introduce bias or unfairness. Several works have tackled this problem at the algorithmic level, e.g., by letting constrained devices train only a subset of the server's neural network (NN) model. However, these techniques have been observed to be ineffective with respect to accuracy. Importantly, they make simplistic assumptions about devices' resources, relying on indirect metrics such as multiply-accumulate (MAC) operations or peak memory requirements. We observe that memory access costs, which such metrics do not capture, have a significant impact on energy consumption. In this work, for the first time, we consider on-device accelerator design for FL with heterogeneous devices. We employ compressed arithmetic formats and approximate computing to satisfy limited energy budgets. Using a hardware-aware energy model, we show that, in contrast to the moderate energy reduction of the state of the art, our technique lowers the energy requirements by 4x while maintaining higher accuracy.
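The abstract's central observation, that MAC-count proxies understate energy because they ignore memory accesses, can be illustrated with a toy per-layer energy model. The Python sketch below is a minimal illustration, not the paper's calibrated hardware-aware model: the `conv_energy` helper, the access-count formulas, and the per-operation energy constants (loosely in the spirit of figures quoted in the accelerator literature, e.g., Eyeriss-style analyses) are all assumptions made for this example.

```python
# Toy comparison of a MAC-only energy proxy vs. an estimate that also
# charges for memory accesses. All constants are illustrative
# assumptions (picojoules per operation), not measured values.
E_MAC = 0.2       # one 8-bit multiply-accumulate
E_SRAM = 1.0      # one on-chip buffer access
E_DRAM = 100.0    # one off-chip DRAM access

def conv_energy(c_in, c_out, k, h_out, w_out):
    """Estimate the energy (pJ) of one conv layer in two ways."""
    macs = c_in * c_out * k * k * h_out * w_out
    # Crude access counts: each weight is fetched from DRAM once;
    # activations are staged through SRAM (no data reuse modeled).
    weight_reads = c_in * c_out * k * k
    act_accesses = 2 * c_in * h_out * w_out  # read input, write output
    mac_only = macs * E_MAC
    with_memory = mac_only + weight_reads * E_DRAM + act_accesses * E_SRAM
    return mac_only, with_memory

mac_only, with_memory = conv_energy(c_in=64, c_out=64, k=3, h_out=32, w_out=32)
print(f"MAC-only proxy:       {mac_only / 1e6:.2f} uJ")
print(f"With memory accesses: {with_memory / 1e6:.2f} uJ")
```

Even this crude model, which ignores data reuse, shows the memory terms adding a large fraction on top of the pure compute energy; a metric based on MAC counts alone would therefore misrank devices' actual energy costs, which is the gap the paper's hardware-aware model targets.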
Authors: Kilian Pfeiffer, Konstantinos Balaskas, Kostas Siozios, Jörg Henkel