Reinforcement with Fading Memories

Published 29 Jul 2019 in math.PR and cs.LG (arXiv:1907.12227v2)

Abstract: We study the effect of imperfect memory on decision making in the context of a stochastic sequential action-reward problem. An agent chooses a sequence of actions which generate discrete rewards at different rates. She is allowed to make new choices at rate $\beta$, while past rewards disappear from her memory at rate $\mu$. We focus on a family of decision rules where the agent makes a new choice by randomly selecting an action with a probability approximately proportional to the amount of past rewards associated with each action in her memory. We provide closed-form formulae for the agent's steady-state choice distribution in the regime where the memory span is large ($\mu \to 0$), and show that the agent's success critically depends on how quickly she updates her choices relative to the speed of memory decay. If $\beta \gg \mu$, the agent almost always chooses the best action, i.e., the one with the highest reward rate. Conversely, if $\beta \ll \mu$, the agent chooses an action with a probability roughly proportional to its reward rate.
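The dynamics described in the abstract can be explored numerically. The following is a minimal simulation sketch, not the authors' method, and it rests on several assumptions that the abstract does not confirm: rewards arrive as a Poisson process at the rate of the currently chosen action only, each remembered reward is forgotten independently at rate $\mu$, choice updates occur at exponential times with rate $\beta$, and "approximately proportional" is modeled by adding a hypothetical smoothing constant `eps` to each action's remembered reward count. The function name `simulate` and all parameter names are illustrative.

```python
import random


def simulate(reward_rates, beta, mu, t_end, eps=1.0, seed=0):
    """Return the fraction of time spent on each action up to time t_end.

    Assumptions (not taken from the paper): Poisson rewards for the current
    action only, independent exponential forgetting of each remembered
    reward, and a smoothing constant eps in the choice rule.
    """
    rng = random.Random(seed)
    k = len(reward_rates)
    memory = [0] * k            # remembered rewards per action
    time_on = [0.0] * k         # time spent holding each action
    current = rng.randrange(k)  # arbitrary initial action
    t = 0.0

    while True:
        reward_rate = reward_rates[current]
        forget_rate = mu * sum(memory)
        total_rate = reward_rate + forget_rate + beta

        # Time until the next event of any kind (Gillespie-style step).
        dt = rng.expovariate(total_rate)
        if t + dt >= t_end:
            time_on[current] += t_end - t
            break
        time_on[current] += dt
        t += dt

        # Pick which event fired, with probability proportional to its rate.
        u = rng.random() * total_rate
        if u < reward_rate:
            memory[current] += 1
        elif u < reward_rate + forget_rate:
            # Every remembered reward is equally likely to be forgotten.
            lost = rng.choices(range(k), weights=memory)[0]
            memory[lost] -= 1
        else:
            # New choice, roughly proportional to remembered rewards.
            weights = [m + eps for m in memory]
            current = rng.choices(range(k), weights=weights)[0]

    return [x / t_end for x in time_on]


if __name__ == "__main__":
    # Two actions with reward rates 1 and 2; compare the two regimes.
    print(simulate([1.0, 2.0], beta=10.0, mu=0.01, t_end=2000.0))
    print(simulate([1.0, 2.0], beta=0.01, mu=1.0, t_end=2000.0))
```

Comparing runs with $\beta \gg \mu$ and $\beta \ll \mu$ gives a rough empirical check against the two regimes stated in the abstract. Any agreement is only suggestive, since the paper's results hold in the limit $\mu \to 0$ and the smoothing term `eps` is an assumption of this sketch.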

Citations (10)

Authors (2)
