FedConv: Enhancing Convolutional Neural Networks for Handling Data Heterogeneity in Federated Learning (2310.04412v1)

Published 6 Oct 2023 in cs.CV

Abstract: Federated learning (FL) is an emerging paradigm in machine learning, where a shared model is collaboratively learned using data from multiple devices to mitigate the risk of data leakage. While recent studies posit that Vision Transformer (ViT) outperforms Convolutional Neural Networks (CNNs) in addressing data heterogeneity in FL, the specific architectural components that underpin this advantage have yet to be elucidated. In this paper, we systematically investigate the impact of different architectural elements, such as activation functions and normalization layers, on the performance within heterogeneous FL. Through rigorous empirical analyses, we are able to offer the first-of-its-kind general guidance on micro-architecture design principles for heterogeneous FL. Intriguingly, our findings indicate that with strategic architectural modifications, pure CNNs can achieve a level of robustness that either matches or even exceeds that of ViTs when handling heterogeneous data clients in FL. Additionally, our approach is compatible with existing FL techniques and delivers state-of-the-art solutions across a broad spectrum of FL benchmarks. The code is publicly available at https://github.com/UCSC-VLAA/FedConv
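To make the idea in the abstract concrete, below is a minimal sketch of a heterogeneous-FL setup in PyTorch: a plain FedAvg round over clients, where the client model uses a CNN block whose normalization and activation have been swapped (GroupNorm and SiLU in place of BatchNorm and ReLU). These specific layer choices, and the helper names ConvBlock, local_update, and fedavg_round, are illustrative assumptions for this sketch, not the paper's actual FedConv architecture; see the linked repository for the authors' implementation.

```python
# Minimal sketch (not the official FedConv code): one FedAvg round where each
# client fine-tunes a CNN whose micro-architecture avoids batch-dependent
# statistics. GroupNorm + SiLU are assumptions standing in for whatever
# normalization/activation the paper ultimately recommends.
import copy
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """Conv block with batch-independent normalization and a smooth activation."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False)
        self.norm = nn.GroupNorm(num_groups=8, num_channels=out_ch)  # no batch statistics
        self.act = nn.SiLU()  # smooth activation instead of ReLU

    def forward(self, x):
        return self.act(self.norm(self.conv(x)))

def local_update(global_model, loader, epochs=1, lr=0.01):
    """One client's local SGD, starting from the current global weights."""
    model = copy.deepcopy(global_model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model.state_dict()

def fedavg_round(global_model, client_loaders):
    """Uniformly average client updates (plain FedAvg) into the global model."""
    client_states = [local_update(global_model, dl) for dl in client_loaders]
    avg_state = copy.deepcopy(client_states[0])
    for key in avg_state:
        avg_state[key] = torch.stack(
            [state[key].float() for state in client_states]
        ).mean(dim=0)
    global_model.load_state_dict(avg_state)
    return global_model
```

Because GroupNorm keeps no running batch statistics, the averaged state dictionary contains only learned parameters, which is one reason batch-independent normalization is commonly preferred when client data distributions differ.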
