Limited depth bandit-based strategy for Monte Carlo planning in continuous action spaces (2106.15594v1)
Published 29 Jun 2021 in math.OC and cs.LG
Abstract: This paper addresses the problem of optimal control using search trees. We start by considering multi-armed bandit problems with continuous action spaces and propose LD-HOO, a limited-depth variant of the hierarchical optimistic optimization (HOO) algorithm. We provide a regret analysis for LD-HOO and show that, asymptotically, our algorithm exhibits the same cumulative regret as the original HOO while being faster and more memory efficient. We then propose a Monte Carlo tree search algorithm based on LD-HOO for optimal control and illustrate the application of the resulting approach in several optimal control problems.
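To make the idea behind a depth-limited HOO-style bandit concrete, here is a minimal Python sketch of the general mechanism: a binary tree partitions a one-dimensional action interval, each cell keeps an optimistic B-value, selection descends greedily on B-values, and the descent is capped at a maximum depth. This is a simplified illustration under assumed parameters (rho, nu, max_depth) and a simplified B-value, not the authors' exact LD-HOO algorithm or its regret-matching construction.

```python
import math
import random


class Node:
    """One cell of the hierarchical partition over the action interval [lo, hi]."""
    def __init__(self, lo, hi, depth):
        self.lo, self.hi, self.depth = lo, hi, depth
        self.count = 0          # number of times this cell was selected
        self.mean = 0.0         # empirical mean reward observed in this cell
        self.children = None    # lazily created left/right halves


def b_value(node, t, rho=0.5, nu=1.0):
    """Optimistic upper bound on the best reward in the cell (HOO-style, simplified)."""
    if node.count == 0:
        return float("inf")
    confidence = math.sqrt(2.0 * math.log(t) / node.count)
    return node.mean + confidence + nu * rho ** node.depth


def select_action(root, t, max_depth, rho=0.5, nu=1.0):
    """Descend greedily on B-values, stopping at max_depth (the 'limited depth')."""
    path, node = [root], root
    while node.depth < max_depth and node.count > 0:
        if node.children is None:
            mid = 0.5 * (node.lo + node.hi)
            node.children = (Node(node.lo, mid, node.depth + 1),
                             Node(mid, node.hi, node.depth + 1))
        node = max(node.children, key=lambda c: b_value(c, t, rho, nu))
        path.append(node)
    # Play an arbitrary action inside the selected cell.
    action = random.uniform(node.lo, node.hi)
    return action, path


def update(path, reward):
    """Propagate the observed reward to every cell on the selected path."""
    for node in path:
        node.count += 1
        node.mean += (reward - node.mean) / node.count
```

Capping the tree at max_depth bounds both the per-round selection cost and the memory footprint, which is the practical motivation the abstract gives for the limited-depth variant; a planner can then call such a bandit at each node of a Monte Carlo search tree to choose continuous actions.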