Feature-Based Q-Learning for Two-Player Stochastic Games (1906.00423v1)

Published 2 Jun 2019 in cs.LG, cs.GT, and stat.ML

Abstract: Consider a two-player zero-sum stochastic game where the transition function can be embedded in a given feature space. We propose a two-player Q-learning algorithm for approximating the Nash equilibrium strategy via sampling. The algorithm is shown to find an $\epsilon$-optimal strategy using a sample size linear in the number of features. To further improve its sample efficiency, we develop an accelerated algorithm by adopting techniques such as variance reduction, monotonicity preservation and two-sided strategy approximation. We prove that the algorithm is guaranteed to find an $\epsilon$-optimal strategy using no more than $\tilde{\mathcal{O}}(K/(\epsilon^{2}(1-\gamma)^{4}))$ samples with high probability, where $K$ is the number of features and $\gamma$ is a discount factor. The sample, time and space complexities of the algorithm are independent of the original dimensions of the game.
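To get a feel for how the stated bound scales, the following minimal Python sketch evaluates only the leading term $K/(\epsilon^{2}(1-\gamma)^{4})$. The function name `sample_bound_scale` is hypothetical and not from the paper, and the constants and logarithmic factors hidden by $\tilde{\mathcal{O}}$ are omitted, so the returned number is a relative scale rather than an actual sample count.

```python
def sample_bound_scale(num_features: int, epsilon: float, gamma: float) -> float:
    """Leading-order scale K / (epsilon^2 * (1 - gamma)^4).

    Illustrative only: constants and log factors hidden by the
    O-tilde notation are NOT included (assumption of this sketch).
    """
    if num_features <= 0:
        raise ValueError("num_features (K) must be positive")
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if not 0 <= gamma < 1:
        raise ValueError("gamma must lie in [0, 1)")
    return num_features / (epsilon ** 2 * (1.0 - gamma) ** 4)


if __name__ == "__main__":
    # Compare how the scale reacts to each parameter.
    base = sample_bound_scale(num_features=10, epsilon=0.1, gamma=0.5)
    print("baseline scale:", base)
    print("halving epsilon multiplies by:",
          sample_bound_scale(10, 0.05, 0.5) / base)
    print("doubling K multiplies by:",
          sample_bound_scale(20, 0.1, 0.5) / base)
    print("moving gamma from 0.5 to 0.75 multiplies by:",
          sample_bound_scale(10, 0.1, 0.75) / base)
```

The scale depends on the game only through $K$, $\epsilon$, and $\gamma$, which is consistent with the abstract's statement that the complexity is independent of the original dimensions of the game. The fourth-power dependence on $1-\gamma$ means the discount factor dominates the cost as $\gamma$ approaches 1.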

Citations (45)
