Recursively-Constrained Partially Observable Markov Decision Processes (2310.09688v3)
Abstract: Many sequential decision problems involve optimizing one objective function while imposing constraints on other objectives. Constrained Partially Observable Markov Decision Processes (C-POMDP) model this case with transition uncertainty and partial observability. In this work, we first show that C-POMDPs violate the optimal substructure property over successive decision steps and thus may exhibit behaviors that are undesirable for some (e.g., safety critical) applications. Additionally, online re-planning in C-POMDPs is often ineffective due to the inconsistency resulting from this violation. To address these drawbacks, we introduce the Recursively-Constrained POMDP (RC-POMDP), which imposes additional history-dependent cost constraints on the C-POMDP. We show that, unlike C-POMDPs, RC-POMDPs always have deterministic optimal policies and that optimal policies obey Bellman's principle of optimality. We also present a point-based dynamic programming algorithm for RC-POMDPs. Evaluations on benchmark problems demonstrate the efficacy of our algorithm and show that policies for RC-POMDPs produce more desirable behaviors than policies for C-POMDPs.
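As a rough schematic of the distinction the abstract draws (this is an illustrative rendering, not the paper's exact notation): a C-POMDP enforces a single expected-cost constraint at the initial belief $b_0$, whereas an RC-POMDP additionally requires the expected cost-to-go to respect an admissible bound at every reachable history $h_t$.

```latex
% C-POMDP (schematic): one expected-cost constraint at the initial belief b_0
\max_{\pi}\;
\mathbb{E}^{\pi}\!\Big[\textstyle\sum_{t=0}^{\infty}\gamma^{t}\, r(s_t,a_t)\;\Big|\;b_0\Big]
\quad\text{s.t.}\quad
\mathbb{E}^{\pi}\!\Big[\textstyle\sum_{t=0}^{\infty}\gamma^{t}\, c(s_t,a_t)\;\Big|\;b_0\Big]\le \hat{c}

% RC-POMDP (schematic): a history-dependent constraint on the cost-to-go,
% imposed recursively at every history h_t reachable under the policy
\mathbb{E}^{\pi}\!\Big[\textstyle\sum_{k=t}^{\infty}\gamma^{k-t}\, c(s_k,a_k)\;\Big|\;h_t\Big]\le \hat{c}_{t}
\qquad \forall\, t \ge 0,\ \forall\, h_t \text{ reachable under } \pi
```

Here $r$ and $c$ are the reward and cost functions, $\gamma$ the discount factor, and $\hat{c}_t$ a history-dependent admissible cost bound; the paper defines how these bounds are propagated, and the symbols above are placeholders for that construction.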
- Qi Heng Ho
- Tyler Becker
- Benjamin Kraske
- Zakariya Laouar
- Martin S. Feather
- Federico Rossi
- Morteza Lahijanian
- Zachary N. Sunberg