Hierarchical model-based policy optimization: from actions to action sequences and back (1912.01448v2)

Published 28 Nov 2019 in cs.LG, cs.AI, math.OC, and stat.ML

Abstract: We develop a normative framework for hierarchical model-based policy optimization based on applying second-order methods in the space of all possible state-action paths. The resulting natural path gradient performs policy updates in a manner that is sensitive to the long-range correlational structure of the induced stationary state-action densities. We demonstrate that the natural path gradient can be computed exactly given an environment dynamics model and depends on expressions akin to higher-order successor representations. In simulation, we show that the prioritization of local policy updates in the resulting policy flow indeed reflects the intuitive state-space hierarchy in several toy problems.
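
For readers unfamiliar with the successor representation mentioned in the abstract, the following is a minimal sketch of the standard (first-order) quantity only, not the paper's higher-order expressions or its natural path gradient. It assumes a small tabular problem in which the policy and dynamics model have already been combined into a row-stochastic state-to-state matrix `P`; the function name, the three-state chain, and the discount `gamma` are illustrative assumptions, not taken from the paper.

```python
def successor_representation(P, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Approximate M = sum_t gamma^t P^t by iterating M <- I + gamma * P @ M.

    P is a row-stochastic list of lists (policy-induced state transitions).
    This is the textbook discounted successor representation, used here
    purely as an illustration.
    """
    n = len(P)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    M = [row[:] for row in identity]
    for _ in range(max_iter):
        # One step of the fixed-point iteration M <- I + gamma * P M.
        M_next = [
            [identity[i][j] + gamma * sum(P[i][k] * M[k][j] for k in range(n))
             for j in range(n)]
            for i in range(n)
        ]
        delta = max(abs(M_next[i][j] - M[i][j])
                    for i in range(n) for j in range(n))
        M = M_next
        if delta < tol:
            break
    return M


# Hypothetical three-state chain under some fixed policy (assumed values).
P = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]
M = successor_representation(P, gamma=0.9)
```

Entry `M[i][j]` is the expected discounted number of future visits to state `j` when starting from state `i`. Because `P` is row-stochastic, each row of `M` sums to `1 / (1 - gamma)`, which gives a quick sanity check on the result.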

Citations (1)
