
Towards Sybil Resilience in Decentralized Learning (2306.15044v1)

Published 26 Jun 2023 in cs.DC, cs.CR, and cs.LG

Abstract: Federated learning is a privacy-enforcing machine learning technology but suffers from limited scalability. This limitation mostly originates from the internet connection and memory capacity of the central parameter server, and the complexity of the model aggregation function. Decentralized learning has recently been emerging as a promising alternative to federated learning. This novel technology eliminates the need for a central parameter server by decentralizing the model aggregation across all participating nodes. Numerous studies have been conducted on improving the resilience of federated learning against poisoning and Sybil attacks, whereas the resilience of decentralized learning remains largely unstudied. This research gap serves as the main motivator for this study, in which our objective is to improve the Sybil poisoning resilience of decentralized learning. We present SybilWall, an innovative algorithm focused on increasing the resilience of decentralized learning against targeted Sybil poisoning attacks. By combining a Sybil-resistant aggregation function based on similarity between Sybils with a novel probabilistic gossiping mechanism, we establish a new benchmark for scalable, Sybil-resilient decentralized learning. A comprehensive empirical evaluation demonstrated that SybilWall outperforms existing state-of-the-art solutions designed for federated learning scenarios and is the only algorithm to obtain consistent accuracy over a range of adversarial attack scenarios. We also found SybilWall to diminish the utility of creating many Sybils, as our evaluations demonstrate a higher success rate among adversaries employing fewer Sybils. Finally, we suggest a number of possible improvements to SybilWall and highlight promising future research directions.
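The abstract sketches two mechanisms: down-weighting model updates that are suspiciously similar to one another (Sybil replicas tend to push near-identical gradients) and forwarding models to neighbors probabilistically. The paper's actual algorithm is not reproduced here; the following is only a minimal illustrative sketch of those two general ideas, with every function name and parameter hypothetical:

```python
import numpy as np

def similarity_scores(updates):
    """Weight each update by how dissimilar it is from every other update.

    An update's weight is 1 minus its maximum cosine similarity to any
    other update, so near-duplicate (Sybil-like) updates get weight ~0.
    """
    n = len(updates)
    unit = [u / (np.linalg.norm(u) + 1e-12) for u in updates]
    weights = np.ones(n)
    for i in range(n):
        max_sim = max(float(unit[i] @ unit[j]) for j in range(n) if j != i)
        # Only positive similarity counts as suspicion; clip to [0, 1].
        weights[i] = 1.0 - min(max(max_sim, 0.0), 1.0)
    return weights

def aggregate(local, updates):
    """Blend the local model with a similarity-weighted average of updates."""
    w = similarity_scores(updates)
    if w.sum() == 0:
        return local  # every neighbor looked Sybil-like; keep the local model
    blended = sum(wi * u for wi, u in zip(w, updates)) / w.sum()
    return 0.5 * local + 0.5 * blended

def gossip_targets(neighbors, p, rng):
    """Probabilistic gossip: forward the model to each neighbor with prob. p."""
    return [nb for nb in neighbors if rng.random() < p]
```

In this toy setup, two identical updates from colluding nodes receive weight zero while a distinct honest update keeps full weight, which captures the abstract's claim that creating many near-identical Sybils yields diminishing returns.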

