
On the Expressivity of Markov Reward (2111.00876v2)

Published 1 Nov 2021 in cs.LG and cs.AI

Abstract: Reward is the driving force for reinforcement-learning agents. This paper is dedicated to understanding the expressivity of reward as a way to capture tasks that we would want an agent to perform. We frame this study around three new abstract notions of "task" that might be desirable: (1) a set of acceptable behaviors, (2) a partial ordering over behaviors, or (3) a partial ordering over trajectories. Our main results prove that while reward can express many of these tasks, there exist instances of each task type that no Markov reward function can capture. We then provide a set of polynomial-time algorithms that construct a Markov reward function that allows an agent to optimize tasks of each of these three types, and correctly determine when no such reward function exists. We conclude with an empirical study that corroborates and illustrates our theoretical findings.
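The inexpressibility result for task type (1) can be illustrated with a toy example. The sketch below is not the paper's algorithm (the paper gives polynomial-time, linear-programming-based constructions); it is a brute-force check, with all names and the grid of reward values chosen for illustration. In a two-state MDP where each state self-loops under every action, the set of optimal deterministic policies under any Markov reward is a product of per-state argmax sets, so the "use the same action in both states" task can never be the exact optimal set:

```python
from itertools import product

# Two-state MDP in which every action self-loops, so a policy's value at a
# state depends only on the reward it collects there.
STATES = ["s1", "s2"]
ACTIONS = ["a", "b"]

def optimal_policies(reward):
    """Set of optimal deterministic policies: any per-state argmax choice."""
    argmax = {}
    for s in STATES:
        best = max(reward[(s, a)] for a in ACTIONS)
        argmax[s] = [a for a in ACTIONS if reward[(s, a)] == best]
    # Optimality decomposes state-wise, so the optimal set is a product set.
    return {choice for choice in product(argmax["s1"], argmax["s2"])}

# Target task: "pick the same action in both states" -- a set of two
# acceptable behaviors, excluding the two mixed policies.
TARGET = {("a", "a"), ("b", "b")}

def search(values=(0.0, 0.5, 1.0)):
    """Brute-force a grid of reward values; return a reward function whose
    optimal-policy set equals TARGET, or None if no such reward exists."""
    for vals in product(values, repeat=4):
        reward = dict(zip(product(STATES, ACTIONS), vals))
        if optimal_policies(reward) == TARGET:
            return reward
    return None

print(search())  # prints None: no reward on this grid expresses the task
```

Any reward that makes both "always a" and "always b" optimal also makes the mixed policies ("a" in one state, "b" in the other) optimal, so the diagonal set is unreachable regardless of the reward grid searched.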

Authors (7)
  1. David Abel (29 papers)
  2. Will Dabney (53 papers)
  3. Anna Harutyunyan (20 papers)
  4. Mark K. Ho (21 papers)
  5. Michael L. Littman (50 papers)
  6. Doina Precup (206 papers)
  7. Satinder Singh (80 papers)
Citations (76)
