
Towards an Unified Structure for Reinforcement Learning: an Optimization Approach (2002.06883v3)

Published 17 Feb 2020 in eess.SY and cs.SY

Abstract: Both the optimal value function and the optimal policy can be used to model an optimal controller, based on the duality established by the Bellman equation. Even with this duality, no parametric model has been able to output both the policy and the value function from a common parameter set. In this paper, a unified structure is proposed based on a parametric optimization problem. The policy and the value function modelled by this structure share all parameters, which enables seamless switching among reinforcement learning algorithms while continuing to learn. Q-learning and policy gradient methods based on the proposed structure are detailed. An actor-critic algorithm based on this structure, whose actor and critic are modelled by the same parameters, is validated on both linear and nonlinear control tasks.
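The abstract's central idea, a single parameter set from which both the policy (via argmax) and the value function (via max) are read off, can be illustrated with a minimal tabular sketch. Note this is an illustrative assumption, not the paper's actual parametric optimization problem: the variable names, the tabular setting, and the greedy readout are all simplifications for exposition.

```python
import numpy as np

rng = np.random.default_rng(0)

N_STATES, N_ACTIONS = 4, 3
# Shared parameters theta define Q(s, a; theta); here a simple table.
theta = rng.normal(size=(N_STATES, N_ACTIONS))

def value(s):
    """Critic readout: V(s) = max_a Q(s, a; theta)."""
    return theta[s].max()

def policy(s):
    """Actor readout: pi(s) = argmax_a Q(s, a; theta)."""
    return int(theta[s].argmax())

# Both outputs come from the same theta, so updating theta (whether by
# Q-learning or a policy-gradient step) changes actor and critic at once.
for s in range(N_STATES):
    assert theta[s, policy(s)] == value(s)
```

In this toy setting the duality is immediate: any update rule that improves the shared table improves both readouts, which is the property that lets the paper's structure switch between algorithm families without relearning separate models.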
