A Note on Stability in Asynchronous Stochastic Approximation without Communication Delays (2312.15091v2)
Published 22 Dec 2023 in cs.LG and math.OC
Abstract: In this paper, we study asynchronous stochastic approximation algorithms without communication delays. Our main contribution is a stability proof for these algorithms that extends a method of Borkar and Meyn by accommodating more general noise conditions. We also derive convergence results from this stability result and discuss their application in important average-reward reinforcement learning problems.
References:
- V. S. Borkar. Asynchronous stochastic approximations. SIAM J. Control Optim., 36(3):840–851, 1998.
- V. S. Borkar. Erratum: Asynchronous stochastic approximations. SIAM J. Control Optim., 38(2):662–663, 2000.
- V. S. Borkar and S. Meyn. The o.d.e. method for convergence of stochastic approximation and reinforcement learning. SIAM J. Control Optim., 38(2):447–469, 2000.
- V. S. Borkar. Stochastic Approximations: A Dynamical Systems Viewpoint. Springer, New York, 2009.
- Stochastic Approximation and Recursive Algorithms and Applications. Springer, New York, 2nd edition, 2003.
- Learning algorithms for Markov decision processes with average cost. SIAM Journal on Control and Optimization, 40(3):681–698, 2001.
- Learning and planning in average-reward Markov decision processes. In Proc. Int. Conf. Machine Learning (ICML), pages 10653–10662, 2021.
- Average-reward learning and planning with options. In Proc. Advances in Neural Information Processing Systems (NeurIPS), pages 22758–22769, 2021.
- On convergence of average-reward off-policy algorithms in weakly communicating MDPs. arXiv Preprint, forthcoming (2024).
- J. Tsitsiklis. Asynchronous stochastic approximation and Q-learning. Mach. Learning, 16:195–202, 1994.
- H. Yu and D. P. Bertsekas. On boundedness of Q-learning iterates for stochastic shortest path problems. Math. Oper. Res., 38:209–227, 2013.
- R. M. Dudley. Real Analysis and Probability. Cambridge University Press, Cambridge, 2002.
- J. Doob. Stochastic Processes. Wiley and Sons, New York, 1953.
- J. Neveu. Discrete Parameter Martingales. North-Holland, Amsterdam, 1975.
- S. Bhatnagar. The Borkar–Meyn theorem for asynchronous stochastic approximations. Systems Control Lett., 60:472–478, 2011.