DCFL: Non-IID awareness Data Condensation aided Federated Learning (2312.14219v1)

Published 21 Dec 2023 in cs.LG, cs.AI, cs.CR, and cs.DC

Abstract: Federated learning is a decentralized learning paradigm in which a central server iteratively trains a global model by utilizing clients that each hold a private dataset. The challenge is that client-side private data may not be independently and identically distributed (Non-IID), which significantly degrades the accuracy of the global model. Existing methods commonly address the Non-IID challenge through optimization, client selection, and data complementation, but most overlook the perspective of the private data itself due to privacy constraints. Intuitively, the statistical distinctions among clients' private data can help mitigate the degree of Non-IID skew. Moreover, recent advances in dataset condensation have inspired us to investigate its applicability to Non-IID issues while preserving privacy. Motivated by this, we propose DCFL, which divides clients into groups using the Centered Kernel Alignment (CKA) method and then applies Non-IID-aware dataset condensation to complement each client's data. The private data of clients within the same group are complementary, and their condensed data are accessible to all clients in the group. Additionally, a CKA-guided client selection strategy, filtering mechanisms, and data enhancement techniques are incorporated to use the condensed data efficiently and precisely, improve model performance, and reduce communication time. Experimental results demonstrate that DCFL achieves competitive performance on popular federated learning benchmarks, including MNIST, FashionMNIST, SVHN, and CIFAR-10, under existing FL protocols.
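The grouping step above hinges on Centered Kernel Alignment, a similarity index between two sets of representations of the same inputs [19]. As a rough illustration of how per-client representations could be compared, here is a minimal sketch of linear CKA and a pairwise client-similarity matrix; the function names, the shared probe batch, and the use of per-client model features are our assumptions, since the abstract does not spell out how CKA is applied inside DCFL.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment (Kornblith et al. [19]).

    X, Y: (n_samples, n_features) feature matrices extracted from two
    clients' models on the same probe inputs. Returns a score in [0, 1];
    1 means the representations match up to rotation and scale.
    """
    X = X - X.mean(axis=0, keepdims=True)  # center columns
    Y = Y - Y.mean(axis=0, keepdims=True)
    # For linear kernels, CKA reduces to
    # ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F).
    num = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    den = np.linalg.norm(X.T @ X, ord="fro") * np.linalg.norm(Y.T @ Y, ord="fro")
    return num / den

def cka_matrix(client_features):
    """Pairwise CKA scores for a list of per-client feature matrices."""
    n = len(client_features)
    M = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            M[i, j] = M[j, i] = linear_cka(client_features[i], client_features[j])
    return M
```

Any clustering over this matrix (or over 1 - M as a dissimilarity) would then yield the groups; since the abstract states that data within a group are complementary, DCFL's actual grouping rule presumably favors placing statistically distinct clients together.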

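On the condensation side, the abstract builds on dataset condensation methods such as gradient matching [29]. The single-step sketch below shows the core idea of nudging a small learnable synthetic set so that the gradient it induces in the model matches the gradient induced by real client data; the function name, the optimizer wiring, and the cosine-distance matching loss are illustrative assumptions, and the Non-IID awareness DCFL adds on top is not reproduced here.

```python
import torch
import torch.nn.functional as F

def condense_step(model, real_x, real_y, syn_x, syn_y, syn_opt):
    """One gradient-matching update of a synthetic set (after Zhao et al. [29]).

    syn_x is a leaf tensor with requires_grad=True, optimized by syn_opt
    (e.g. torch.optim.SGD([syn_x], lr=0.1)); syn_y holds fixed labels.
    """
    # Gradient of the task loss on real private data (treated as a target).
    loss_real = F.cross_entropy(model(real_x), real_y)
    g_real = [g.detach() for g in torch.autograd.grad(loss_real, model.parameters())]

    # Gradient on the synthetic data, kept differentiable w.r.t. syn_x.
    loss_syn = F.cross_entropy(model(syn_x), syn_y)
    g_syn = torch.autograd.grad(loss_syn, model.parameters(), create_graph=True)

    # Layer-wise cosine-distance matching loss between the two gradients.
    match = sum(1 - F.cosine_similarity(a.flatten(), b.flatten(), dim=0)
                for a, b in zip(g_syn, g_real))

    syn_opt.zero_grad()
    match.backward()                    # gradients reach syn_x through create_graph
    syn_opt.step()                      # update the synthetic images only
    model.zero_grad(set_to_none=True)   # discard stray grads left on the model
    return match.item()
```

In DCFL's setting it is these condensed sets, not the raw samples, that clients within a group share, which is what keeps the private data local.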
References (31)
  1. B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, “Communication-efficient learning of deep networks from decentralized data,” in Artificial Intelligence and Statistics. PMLR, 2017, pp. 1273–1282.
  2. X. Wang, Y. Han, V. C. Leung, D. Niyato, X. Yan, and X. Chen, “Convergence of edge computing and deep learning: A comprehensive survey,” IEEE Communications Surveys & Tutorials, vol. 22, no. 2, pp. 869–904, 2020.
  3. X.-C. Li, J.-L. Tang, S. Song, B. Li, Y. Li, Y. Shao, L. Gan, and D.-C. Zhan, “Avoid overfitting user specific information in federated keyword spotting,” arXiv preprint arXiv:2206.08864, 2022.
  4. K. Doshi and Y. Yilmaz, “Federated learning-based driver activity recognition for edge devices,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 3338–3346.
  5. Q. Yang, Y. Liu, T. Chen, and Y. Tong, “Federated machine learning: Concept and applications,” ACM Transactions on Intelligent Systems and Technology (TIST), vol. 10, no. 2, pp. 1–19, 2019.
  6. J. Xu, B. S. Glicksberg, C. Su, P. Walker, J. Bian, and F. Wang, “Federated learning for healthcare informatics,” Journal of Healthcare Informatics Research, vol. 5, pp. 1–19, 2021.
  7. P. Kairouz, H. B. McMahan, B. Avent, A. Bellet, M. Bennis, A. N. Bhagoji, K. Bonawitz, Z. Charles, G. Cormode, R. Cummings et al., “Advances and open problems in federated learning,” Foundations and Trends® in Machine Learning, vol. 14, no. 1–2, pp. 1–210, 2021.
  8. Q. Li, Y. Diao, Q. Chen, and B. He, “Federated learning on non-iid data silos: An experimental study,” in 2022 IEEE 38th International Conference on Data Engineering (ICDE). IEEE, 2022, pp. 965–978.
  9. K. Hsieh, A. Phanishayee, O. Mutlu, and P. Gibbons, “The non-iid data quagmire of decentralized machine learning,” in International Conference on Machine Learning. PMLR, 2020, pp. 4387–4398.
  10. Y. Zhao, M. Li, L. Lai, N. Suda, D. Civin, and V. Chandra, “Federated learning with non-iid data,” arXiv preprint arXiv:1806.00582, 2018.
  11. Y. LeCun, “The MNIST database of handwritten digits,” http://yann.lecun.com/exdb/mnist/, 1998.
  12. A. Krizhevsky, G. Hinton et al., “Learning multiple layers of features from tiny images,” University of Toronto, 2009.
  13. P. Warden, “Speech commands: A dataset for limited-vocabulary speech recognition,” arXiv preprint arXiv:1804.03209, 2018.
  14. T. Li, A. K. Sahu, A. Talwalkar, and V. Smith, “Federated learning: Challenges, methods, and future directions,” IEEE Signal Processing Magazine, vol. 37, no. 3, pp. 50–60, 2020.
  15. J. Wang, Q. Liu, H. Liang, G. Joshi, and H. V. Poor, “Tackling the objective inconsistency problem in heterogeneous federated optimization,” Advances in Neural Information Processing Systems, vol. 33, pp. 7611–7623, 2020.
  16. S. P. Karimireddy, S. Kale, M. Mohri, S. Reddi, S. Stich, and A. T. Suresh, “Scaffold: Stochastic controlled averaging for federated learning,” in International Conference on Machine Learning. PMLR, 2020, pp. 5132–5143.
  17. B. Soltani, V. Haghighi, A. Mahmood, Q. Z. Sheng, and L. Yao, “A survey on participant selection for federated learning in mobile networks,” in Proceedings of the 17th ACM Workshop on Mobility in the Evolving Internet Architecture, 2022, pp. 19–24.
  18. J. Ma, X. Sun, W. Xia, X. Wang, X. Chen, and H. Zhu, “Client selection based on label quantity information for federated learning,” in 2021 IEEE 32nd Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC). IEEE, 2021, pp. 1–6.
  19. S. Kornblith, M. Norouzi, H. Lee, and G. Hinton, “Similarity of neural network representations revisited,” in International Conference on Machine Learning. PMLR, 2019, pp. 3519–3529.
  20. D. Zeng, S. Liang, X. Hu, H. Wang, and Z. Xu, “Fedlab: A flexible federated learning framework,” Journal of Machine Learning Research, vol. 24, no. 100, pp. 1–7, 2023.
  21. M. Luo, F. Chen, D. Hu, Y. Zhang, J. Liang, and J. Feng, “No fear of heterogeneity: Classifier calibration for federated learning with non-iid data,” Advances in Neural Information Processing Systems, vol. 34, pp. 5972–5984, 2021.
  22. C. Li, X. Zeng, M. Zhang, and Z. Cao, “Pyramidfl: A fine-grained client selection framework for efficient federated learning,” in Proceedings of the 28th Annual International Conference on Mobile Computing And Networking, 2022, pp. 158–171.
  23. H. Xiao, K. Rasul, and R. Vollgraf, “Fashion-mnist: a novel image dataset for benchmarking machine learning algorithms,” arXiv preprint arXiv:1708.07747, 2017.
  24. Y. Netzer, T. Wang, A. Coates, A. Bissacco, B. Wu, and A. Y. Ng, “Reading digits in natural images with unsupervised feature learning,” in NIPS Workshop on Deep Learning and Unsupervised Feature Learning, 2011.
  25. T. Wang, J.-Y. Zhu, A. Torralba, and A. A. Efros, “Dataset distillation,” arXiv preprint arXiv:1811.10959, 2018.
  26. S. Lei and D. Tao, “A comprehensive survey to dataset distillation,” arXiv preprint arXiv:2301.05603, 2023.
  27. T. Nguyen, Z. Chen, and J. Lee, “Dataset meta-learning from kernel ridge-regression,” arXiv preprint arXiv:2011.00050, 2020.
  28. G. Cazenavette, T. Wang, A. Torralba, A. A. Efros, and J.-Y. Zhu, “Dataset distillation by matching training trajectories,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 4750–4759.
  29. B. Zhao, K. R. Mopuri, and H. Bilen, “Dataset condensation with gradient matching,” arXiv preprint arXiv:2006.05929, 2020.
  30. B. Zhao and H. Bilen, “Dataset condensation with distribution matching,” in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023, pp. 6514–6523.
  31. B. Zhao and H. Bilen, “Dataset condensation with differentiable siamese augmentation,” in International Conference on Machine Learning. PMLR, 2021, pp. 12674–12685.
