Papers
Topics
Authors
Recent
Search
2000 character limit reached

SIBRE: Self Improvement Based REwards for Adaptive Feedback in Reinforcement Learning

Published 21 Apr 2020 in cs.LG, cs.AI, and stat.ML | (2004.09846v3)

Abstract: We propose a generic reward shaping approach for improving the rate of convergence in reinforcement learning (RL), called Self Improvement Based REwards, or SIBRE. The approach is designed for use in conjunction with any existing RL algorithm, and consists of rewarding improvement over the agent's own past performance. We prove that SIBRE converges in expectation under the same conditions as the original RL algorithm. The reshaped rewards help discriminate between policies when the original rewards are weakly discriminated or sparse. Experiments on several well-known benchmark environments with different RL algorithms show that SIBRE converges to the optimal policy faster and more stably. We also perform sensitivity analysis with respect to hyper-parameters, in comparison with baseline RL algorithms.

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.