Near-Optimal Resilient Aggregation Rules for Distributed Learning Using 1-Center and 1-Mean Clustering with Outliers (2312.12835v2)

Published 20 Dec 2023 in cs.LG and cs.DC

Abstract: Byzantine machine learning has garnered considerable attention in light of the unpredictable faults that can occur in large-scale distributed learning systems. The key to securing resilience against Byzantine machines in distributed learning is resilient aggregation mechanisms. Although abundant resilient aggregation rules have been proposed, they are designed in an ad-hoc manner, which imposes extra barriers to comparing, analyzing, and improving the rules across performance criteria. This paper studies near-optimal aggregation rules using clustering in the presence of outliers. Our outlier-robust clustering approach utilizes geometric properties of the update vectors provided by workers. Our analysis shows that constant approximations to the 1-center and 1-mean clustering problems with outliers provide near-optimal resilient aggregators for metric-based criteria, which have been proven to be crucial in the homogeneous and heterogeneous cases, respectively. In addition, we discuss two conflicting types of attacks under which no single aggregation rule is guaranteed to improve upon the naive average. Based on this discussion, we propose a two-phase resilient aggregation framework. We run experiments on image classification with a non-convex loss function. The proposed algorithms outperform previously known aggregation rules by a large margin under both homogeneous and heterogeneous data distributions among non-faulty workers. Code and appendix are available at https://github.com/jerry907/AAAI24-RASHB.
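To make the clustering-with-outliers idea concrete, here is a minimal sketch (not the authors' exact algorithm) of an aggregator in that spirit: each worker update is tried as a candidate center, the radius needed to cover the n - f nearest updates is computed, the candidate with the smallest such radius is kept (restricting centers to input points is a standard 2-approximation to 1-center with outliers), and the covered updates are averaged. The function name `resilient_aggregate` and the parameter `f` (an assumed upper bound on the number of Byzantine workers) are illustrative, not taken from the paper's code.

```python
import numpy as np

def resilient_aggregate(updates: np.ndarray, f: int) -> np.ndarray:
    """Sketch of an outlier-robust aggregator.

    updates: (n, d) array of worker update vectors.
    f: assumed upper bound on the number of Byzantine workers (f < n).
    Returns the average of the n - f updates inside the best candidate ball.
    """
    n = updates.shape[0]
    k = n - f  # each candidate ball must cover this many updates

    # Pairwise Euclidean distances between update vectors.
    dists = np.linalg.norm(updates[:, None, :] - updates[None, :, :], axis=2)

    # For each candidate center (an input point), the radius needed to cover
    # its k nearest updates, including itself; picking the best input point
    # is a 2-approximation to 1-center with outliers.
    radii = np.sort(dists, axis=1)[:, k - 1]
    best = int(np.argmin(radii))

    # Average the k updates covered by the chosen ball (a 1-mean-style step).
    covered = np.argsort(dists[best])[:k]
    return updates[covered].mean(axis=0)

# Example: 8 honest updates near the true gradient, 2 Byzantine outliers.
rng = np.random.default_rng(0)
honest = rng.normal(loc=1.0, scale=0.1, size=(8, 5))
byzantine = rng.normal(loc=-20.0, scale=0.1, size=(2, 5))
agg = resilient_aggregate(np.vstack([honest, byzantine]), f=2)
print(agg)  # close to the all-ones vector despite the outliers
```

This selection-then-average shape mirrors the paper's two uses of clustering: the ball-selection step targets the 1-center objective (relevant in the homogeneous case), while averaging the covered points targets the 1-mean objective (relevant in the heterogeneous case).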
