DASA: Delay-Adaptive Multi-Agent Stochastic Approximation (2403.17247v3)
Abstract: We consider a setting in which $N$ agents aim to speed up a common Stochastic Approximation (SA) problem by acting in parallel and communicating with a central server. We assume that the uplink transmissions to the server are subject to asynchronous and potentially unbounded time-varying delays. To mitigate the effect of delays and stragglers while reaping the benefits of distributed computation, we propose \texttt{DASA}, a Delay-Adaptive algorithm for multi-agent Stochastic Approximation. We provide a finite-time analysis of \texttt{DASA} under the assumption that the agents' stochastic observation processes are independent Markov chains. Significantly advancing existing results, \texttt{DASA} is the first algorithm whose convergence rate depends only on the mixing time $\tau_{mix}$ and on the average delay $\tau_{avg}$ while jointly achieving an $N$-fold convergence speedup under Markovian sampling. Our work is relevant for various SA applications, including multi-agent and distributed temporal difference (TD) learning, Q-learning, and stochastic optimization with correlated data.
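Since the abstract describes the setting only at a high level, the Python sketch below illustrates the general shape of a delay-adaptive multi-agent SA loop: $N$ agents compute noisy update directions on stale parameter copies, send them to the server over links with random delays, and the server aggregates whatever has arrived before taking a step. This is a minimal sketch, not the paper's actual \texttt{DASA} update rule; the delay model, the averaging rule, the step size, and all constants are assumptions chosen purely for illustration.

```python
import numpy as np

# Illustrative sketch of a delay-adaptive multi-agent SA loop.
# NOTE: this is NOT the DASA update rule from the paper; the aggregation
# strategy (apply a step once delayed agent reports arrive), the delay
# model, and all constants are assumptions made for demonstration only.

rng = np.random.default_rng(0)

d, N = 4, 8                                 # parameter dimension, number of agents
A = np.eye(d) * 2.0                         # toy linear SA target: solve A @ theta = b
b = np.ones(d)
theta_star = np.linalg.solve(A, b)

theta = np.zeros(d)                         # server iterate
stale = [theta.copy() for _ in range(N)]    # parameter copy each agent last received
inbox = []                                  # (arrival_time, agent_id, direction) in flight
alpha = 0.05                                # constant step size (assumed)

def agent_direction(agent_theta):
    """Noisy SA direction g(theta) = b - A @ theta plus toy observation noise."""
    noise = rng.normal(scale=0.5, size=d)
    return b - A @ agent_theta + noise

for t in range(2000):
    # Each agent computes a direction on its stale copy and sends it
    # with an asynchronous, potentially large random delay (toy model).
    for i in range(N):
        delay = rng.geometric(p=0.3)
        inbox.append((t + delay, i, agent_direction(stale[i])))

    # Server collects everything that has arrived by time t.
    arrived = [(ti, i, g) for (ti, i, g) in inbox if ti <= t]
    inbox = [(ti, i, g) for (ti, i, g) in inbox if ti > t]

    if arrived:
        # Average the arrived directions and take one SA step.
        avg_dir = np.mean([g for (_, _, g) in arrived], axis=0)
        theta = theta + alpha * avg_dir
        # Agents whose reports were consumed receive the fresh iterate.
        for (_, i, _) in arrived:
            stale[i] = theta.copy()

print("distance to fixed point:", np.linalg.norm(theta - theta_star))
```

Running the sketch shows the server iterate drifting toward the fixed point despite stale, delayed reports, which is the qualitative behavior the abstract's finite-time analysis quantifies in terms of $\tau_{mix}$ and $\tau_{avg}$.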
- Arman Adibi
- H. Vincent Poor
- Sanjeev R. Kulkarni
- Aritra Mitra
- George J. Pappas
- Nicolò Dal Fabbro