Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
166 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
42 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Quantum Computing Provides Exponential Regret Improvement in Episodic Reinforcement Learning (2302.08617v1)

Published 16 Feb 2023 in cs.LG, cs.AI, cs.SY, eess.SY, quant-ph, and stat.ML

Abstract: In this paper, we investigate the problem of \textit{episodic reinforcement learning} with quantum oracles for state evolution. To this end, we propose an \textit{Upper Confidence Bound} (UCB) based quantum algorithmic framework to facilitate learning of a finite-horizon MDP. Our quantum algorithm achieves an exponential improvement in regret as compared to the classical counterparts, achieving a regret of $\Tilde{\mathcal{O}}(1)$ as compared to $\Tilde{\mathcal{O}}(\sqrt{K})$ \footnote{$\Tilde{\mathcal{O}}(\cdot)$ hides logarithmic terms.}, $K$ being the number of training episodes. In order to achieve this advantage, we exploit efficient quantum mean estimation technique that provides quadratic improvement in the number of i.i.d. samples needed to estimate the mean of sub-Gaussian random variables as compared to classical mean estimation. This improvement is a key to the significant regret improvement in quantum reinforcement learning. We provide proof-of-concept experiments on various RL environments that in turn demonstrate performance gains of the proposed algorithmic framework.

Citations (5)

Summary

We haven't generated a summary for this paper yet.