Non-Coherent Over-the-Air Decentralized Gradient Descent (2211.10777v4)

Published 19 Nov 2022 in eess.SP, cs.IT, cs.LG, and math.IT

Abstract: Implementing Decentralized Gradient Descent (DGD) in wireless systems is challenging due to noise, fading, and limited bandwidth, necessitating topology awareness, transmission scheduling, and the acquisition of channel state information (CSI) to mitigate interference and maintain reliable communications. These operations may result in substantial signaling overhead and scalability challenges in large networks lacking central coordination. This paper introduces a scalable DGD algorithm that eliminates the need for scheduling, topology information, or CSI (both average and instantaneous). At its core is a Non-Coherent Over-The-Air (NCOTA) consensus scheme that exploits a noisy energy-superposition property of wireless channels. Nodes encode their local optimization signals into energy levels within an OFDM frame and transmit simultaneously, without coordination. The key insight is that the received energy equals, on average, the sum of the energies of the transmitted signals, scaled by their respective average channel gains, akin to a consensus step. This property enables unbiased consensus estimation, using the average channel gains as mixing weights, thereby removing the need for their explicit design or for CSI. Introducing a consensus stepsize mitigates consensus estimation errors caused by energy fluctuations around their expected values. For strongly convex problems, the expected squared distance between the local models and the global optimum is shown to vanish at a rate of O(1/√k) after k iterations, under suitably decreasing learning and consensus stepsizes. Extensions accommodate a broad class of fading models and frequency-selective channels. Numerical experiments on image classification demonstrate faster convergence in running time compared to state-of-the-art schemes, especially in dense network scenarios.
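To make the role of the two stepsizes concrete, a generic DGD-with-noisy-consensus recursion (a hedged template consistent with the abstract, not necessarily the paper's exact update) for node i reads

θ_i^{k+1} = θ_i^k + γ_k (ĉ_i^k − θ_i^k) − η_k ∇f_i(θ_i^k),

where ĉ_i^k is the NCOTA consensus estimate, γ_k the decreasing consensus stepsize that damps its fluctuations, and η_k the decreasing learning stepsize.

The energy-superposition property that makes such an estimate unbiased can be checked numerically. The following is a minimal Monte-Carlo sketch, assuming i.i.d. Rayleigh fading with per-link average gain gamma_j and one received sample per simultaneous transmission; all variable names (gamma, energy, sigma2) are illustrative and not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
num_nodes, num_trials = 5, 200_000
sigma2 = 0.1                                   # receiver noise power
gamma = rng.uniform(0.5, 2.0, num_nodes)       # average channel gains (implicit mixing weights)
energy = rng.uniform(0.0, 1.0, num_nodes)      # energy levels encoding the local signals

# Rayleigh fading h[t, j] ~ CN(0, gamma_j): all nodes transmit simultaneously, non-coherently
h = (rng.standard_normal((num_trials, num_nodes))
     + 1j * rng.standard_normal((num_trials, num_nodes))) * np.sqrt(gamma / 2)
# Transmitted symbols carry the prescribed energies with arbitrary (unknown) phases
x = np.sqrt(energy) * np.exp(2j * np.pi * rng.random((num_trials, num_nodes)))
# Additive receiver noise n ~ CN(0, sigma2)
n = (rng.standard_normal(num_trials)
     + 1j * rng.standard_normal(num_trials)) * np.sqrt(sigma2 / 2)

# Received energy per transmission; its expectation is sum_j gamma_j * energy_j + sigma2,
# since the zero-mean cross terms between independent channels (and noise) vanish on average
rx_energy = np.abs((h * x).sum(axis=1) + n) ** 2

print("empirical mean received energy:", rx_energy.mean())
print("sum_j gamma_j*energy_j + sigma2:", gamma @ energy + sigma2)
```

The two printed values agree up to Monte-Carlo error, illustrating why the average channel gains can serve directly as mixing weights without being designed or estimated; the per-trial spread of rx_energy around its mean is precisely the consensus-estimation noise that the decreasing consensus stepsize is introduced to absorb.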
