Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent (2402.10228v5)

Published 5 Feb 2024 in cs.LG, cs.AI, and stat.ML

Abstract: We propose HyperAgent, a reinforcement learning (RL) algorithm based on the hypermodel framework for exploration in RL. HyperAgent allows for the efficient incremental approximation of posteriors associated with an optimal action-value function ($Q^\star$) without the need for conjugacy and follows the greedy policies w.r.t. these approximate posterior samples. We demonstrate that HyperAgent offers robust performance in large-scale deep RL benchmarks. It can solve Deep Sea hard exploration problems with episodes that optimally scale with problem size and exhibits significant efficiency gains in the Atari suite. Implementing HyperAgent requires minimal code addition to well-established deep RL frameworks like DQN. We theoretically prove that, under tabular assumptions, HyperAgent achieves logarithmic per-step computational complexity while attaining sublinear regret, matching the best known randomized tabular RL algorithm.
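The abstract describes the core mechanism: a hypermodel maps a random index to a plausible Q-function, and the agent acts greedily with respect to that sampled Q-function, giving Thompson-sampling-style deep exploration without conjugate posteriors. Below is a minimal illustrative sketch of that idea, not the authors' implementation: it assumes a linear-in-features hypermodel and per-episode index resampling, and all names (HyperQ, feature_dim, index_dim) are hypothetical.

```python
# Minimal sketch of hypermodel-based posterior sampling for Q-values.
# Assumption: Q(s, a | z) = phi(s, a) @ (mu + A @ z), where z is a random
# "index" drawn once per episode; different z give different plausible
# Q-functions, i.e. approximate posterior samples.

import numpy as np


class HyperQ:
    def __init__(self, feature_dim: int, num_actions: int, index_dim: int = 8, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.num_actions = num_actions
        self.index_dim = index_dim
        # Mean parameters and the hypermodel tensor mapping index -> parameter perturbation.
        self.mu = np.zeros((num_actions, feature_dim))
        self.A = 0.1 * rng.standard_normal((num_actions, feature_dim, index_dim))
        self.rng = rng

    def sample_index(self) -> np.ndarray:
        # Drawing one index per episode emulates Thompson-sampling-style exploration.
        return self.rng.standard_normal(self.index_dim)

    def q_values(self, features: np.ndarray, z: np.ndarray) -> np.ndarray:
        # Q(s, a | z) for all actions; features has shape (feature_dim,).
        theta = self.mu + self.A @ z            # (num_actions, feature_dim)
        return theta @ features                  # (num_actions,)

    def act_greedy(self, features: np.ndarray, z: np.ndarray) -> int:
        # Greedy policy w.r.t. the sampled Q-function.
        return int(np.argmax(self.q_values(features, z)))


if __name__ == "__main__":
    agent = HyperQ(feature_dim=4, num_actions=3)
    z = agent.sample_index()                     # held fixed for the whole episode
    state_features = np.array([1.0, 0.5, -0.2, 0.3])
    print(agent.act_greedy(state_features, z))
```

In the deep RL setting the paper targets, the linear map above would be replaced by a hypermodel head attached to a DQN-style network and trained incrementally from replayed transitions; this sketch only shows the sample-then-act-greedily loop.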
