Federated Learning with Projected Trajectory Regularization (2312.14380v1)
Abstract: Federated learning enables joint training of machine learning models from distributed clients without sharing their local data. One key challenge in federated learning is handling non-identically distributed data across clients, which degrades model training performance. Prior works in this line of research mainly utilize the last-step global model parameters/gradients or linear combinations of past model parameters/gradients, which do not fully exploit the global information contained in the model training trajectory. In this paper, we propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data heterogeneity issue, which better extracts the essential global information from the model training trajectory. Specifically, FedPTR allows local clients or the server to optimize an auxiliary (synthetic) dataset that mimics the learning dynamics of the recent model update, and to use it to project the next-step model trajectory for local training regularization. We conduct rigorous theoretical analysis of our proposed framework under nonconvex stochastic settings to verify its fast convergence under heterogeneous data distributions. Experiments on various benchmark datasets and non-i.i.d. settings validate the effectiveness of our proposed framework.
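The abstract only sketches the mechanism, so below is a minimal, hypothetical illustration of one plausible reading of the idea on a toy least-squares model: a small synthetic dataset is optimized so that a single gradient step on it reproduces the recent global update (trajectory matching), that dataset is then used to project the next-step model, and local client training adds a proximal-style penalty toward the projection. All function names, the one-step projection, the least-squares loss, and the hyperparameters are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the FedPTR idea described in the abstract (not the authors' code).
import torch


def model_grad(w, X, y):
    """Gradient of the least-squares loss 0.5 * ||Xw - y||^2 / n with respect to w."""
    n = X.shape[0]
    return X.T @ (X @ w - y) / n


def fit_synthetic_data(w_prev, w_curr, dim, n_syn=16, steps=200, lr=0.05, eta=0.1):
    """Optimize a small synthetic dataset so that one gradient step on it, taken
    from w_prev, mimics the observed recent global update (w_curr - w_prev)."""
    X_syn = torch.randn(n_syn, dim, requires_grad=True)
    y_syn = torch.randn(n_syn, requires_grad=True)
    target = (w_curr - w_prev).detach()
    opt = torch.optim.Adam([X_syn, y_syn], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        simulated_step = -eta * model_grad(w_prev, X_syn, y_syn)  # update implied by synthetic data
        loss = torch.sum((simulated_step - target) ** 2)          # trajectory-matching objective
        loss.backward()
        opt.step()
    return X_syn.detach(), y_syn.detach()


def local_update(w_global, w_proj, X, y, mu=0.1, lr=0.05, epochs=20):
    """Local training on (X, y), regularized toward the projected next-step model w_proj."""
    w = w_global.clone()
    for _ in range(epochs):
        grad = model_grad(w, X, y) + mu * (w - w_proj)  # proximal-style trajectory regularizer
        w = w - lr * grad
    return w


if __name__ == "__main__":
    torch.manual_seed(0)
    dim, n_clients = 5, 4
    # Toy (heterogeneous) client datasets for illustration only.
    clients = [(torch.randn(50, dim), torch.randn(50)) for _ in range(n_clients)]
    w_prev, w_curr = torch.zeros(dim), 0.1 * torch.randn(dim)
    for rnd in range(3):
        # Fit synthetic data to the recent update, then project one step ahead with it.
        X_syn, y_syn = fit_synthetic_data(w_prev, w_curr, dim)
        w_proj = w_curr - 0.1 * model_grad(w_curr, X_syn, y_syn)
        # Each client trains locally with the projected-trajectory regularizer.
        new_ws = [local_update(w_curr, w_proj, X, y) for X, y in clients]
        # FedAvg-style aggregation on the server.
        w_prev, w_curr = w_curr, torch.stack(new_ws).mean(dim=0)
        print(f"round {rnd}: ||w_curr - w_proj|| = {torch.norm(w_curr - w_proj).item():.4f}")
```

Whether the synthetic data lives on the clients or the server, how many projection steps are taken, and the exact matching objective are design choices the paper specifies; the sketch above fixes them arbitrarily for readability.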
Authors: Tiejin Chen, Yuanpu Cao, Yujia Wang, Cho-Jui Hsieh, Jinghui Chen