Contributions on complexity bounds for Deterministic Partially Observed Markov Decision Process (2301.08567v1)

Published 20 Jan 2023 in math.OC

Abstract: Markov Decision Processes (Mdps) form a versatile framework used to model a wide range of optimization problems. The Mdp model consists of sets of states, actions, time steps, rewards, and probability transitions. When in a given state and at a given time, the decision maker's action generates a reward and determines the state at the next time step according to the probability transition function. However, Mdps assume that the decision maker knows the state of the controlled dynamical system. Hence, when one needs to optimize controlled dynamical systems under partial observation, one often turns to the formalism of Partially Observed Markov Decision Processes (Pomdps). Pomdps are often intractable in the general case, as Dynamic Programming suffers from the curse of dimensionality. Instead of focusing on general Pomdps, we present a subclass where the transition and observation mappings are deterministic: Deterministic Partially Observed Markov Decision Processes (Det-Pomdps). That subclass of problems has been studied by Littman (1996) and Bonet (2009). It was first considered by Littman as a limit case of Pomdps, mainly used to illustrate the complexity of Pomdps when considering as few sources of uncertainty as possible. In this paper, we improve on Littman's complexity bounds. We then introduce and study an even simpler class, Separated Det-Pomdps, and give some new complexity bounds for this class. This new class of problems uses a property of the dynamics and observations to push back the curse of dimensionality.
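To make the Det-Pomdp setting described in the abstract concrete, here is a minimal sketch of one belief update when both the transition and observation mappings are deterministic. It is an illustration under stated assumptions, not the paper's algorithm: the function names (`belief_update`, `transition`, `observe`), the four-state toy system, and the convention that the observation depends only on the new state are all hypothetical choices made for this example.

```python
from collections import defaultdict


def belief_update(belief, action, observation, transition, observe):
    """One belief update for a deterministic POMDP (illustrative sketch).

    belief: dict mapping state -> probability (only states with positive mass).
    transition(state, action) -> next state (deterministic, assumed).
    observe(state) -> observation (deterministic, assumed to depend on the
    new state only).
    """
    # Push the belief forward through the deterministic dynamics.
    # Several states may map to the same successor, so masses add up.
    pushed = defaultdict(float)
    for state, prob in belief.items():
        pushed[transition(state, action)] += prob

    # Keep only the successor states consistent with the observation.
    consistent = {s: p for s, p in pushed.items() if observe(s) == observation}

    total = sum(consistent.values())
    if total == 0:
        raise ValueError("observation has zero probability under this belief")

    # Renormalize so the result is again a probability distribution.
    return {s: p / total for s, p in consistent.items()}


# Hypothetical toy system: four states on a ring; the action is a shift,
# and the observation reveals only the parity of the state.
def transition(state, action):
    return (state + action) % 4


def observe(state):
    return state % 2


b0 = {0: 0.5, 1: 0.25, 2: 0.25}
b1 = belief_update(b0, 1, 1, transition, observe)
print(b1)
```

In this sketch the number of states carrying positive probability cannot grow from one step to the next: a deterministic map sends each state to exactly one successor, and conditioning on the observation can only discard states. In the toy run, the support shrinks from three states to two.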