Specifying Non-Markovian Rewards in MDPs Using LDL on Finite Traces (Preliminary Version) (1706.08100v1)

Published 25 Jun 2017 in cs.AI

Abstract: In Markov Decision Processes (MDPs), the reward obtained in a state depends on the properties of the last state and action. This state dependency makes it difficult to reward more interesting long-term behaviors, such as always closing a door after it has been opened, or providing coffee only following a request. Extending MDPs to handle such non-Markovian reward functions was the subject of two previous lines of work, both using variants of LTL to specify the reward function and then compiling the new model back into a Markovian model. Building upon recent progress in the theories of temporal logics over finite traces, we adopt LDLf for specifying non-Markovian rewards and provide an elegant automata construction for building a Markovian model, which extends that of previous work and offers strong minimality and compositionality guarantees.
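The core idea sketched in the abstract, tracking a temporally extended reward with an automaton and taking its product with the MDP so the reward becomes Markovian over the augmented state, can be illustrated with a toy example. The sketch below is not the paper's LDLf construction; it hand-codes a small deterministic automaton for the abstract's "coffee only following a request" example, with made-up propositions `req` and `cof`, to show why a reward on the product state suffices.

```python
# Hedged illustration, not the paper's construction: a non-Markovian reward
# ("serve coffee only after a request") made Markovian by pairing each MDP
# observation with the state of a small tracking automaton.

# Automaton states: 0 = no request pending, 1 = request pending.
def dfa_step(q, obs):
    """Advance the reward-tracking automaton on one observation."""
    if q == 0:
        return 1 if obs.get("req", False) else 0
    # q == 1: a pending request is discharged by serving coffee.
    return 0 if obs.get("cof", False) else 1

def product_reward(q, obs):
    """Markovian reward over the product state (q, obs):
    +1 for coffee served while a request is pending,
    -1 for unrequested coffee, 0 otherwise."""
    if obs.get("cof", False):
        return 1 if q == 1 else -1
    return 0

def total_reward(trace):
    """Fold a finite trace through the automaton, summing product rewards."""
    q, total = 0, 0
    for obs in trace:
        total += product_reward(q, obs)
        q = dfa_step(q, obs)
    return total
```

On the trace request-then-coffee the accumulated reward is +1, while serving coffee with no prior request yields -1; the reward at each step depends only on the current product state, which is the point of the compilation.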

Authors (3)
  1. Ronen Brafman (3 papers)
  2. Giuseppe De Giacomo (41 papers)
  3. Fabio Patrizi (13 papers)
Citations (4)