
Preference-Based Planning in Stochastic Environments: From Partially-Ordered Temporal Goals to Most Preferred Policies (2403.18212v2)

Published 27 Mar 2024 in cs.RO, cs.AI, cs.FL, and cs.LO

Abstract: Human preferences are not always represented via complete linear orders: it is natural to employ partially ordered preferences for expressing incomparable outcomes. In this work, we consider decision-making and probabilistic planning in stochastic systems modeled as Markov decision processes (MDPs), given a partially ordered preference over a set of temporally extended goals. Specifically, each temporally extended goal is expressed using a formula in Linear Temporal Logic on Finite Traces (LTL$_f$). To plan with the partially ordered preference, we introduce order theory to map a preference over temporal goals to a preference over policies for the MDP. Accordingly, a most preferred policy under a stochastic ordering induces a stochastic nondominated probability distribution over the finite paths in the MDP. To synthesize a most preferred policy, our technical approach includes two key steps. In the first step, we develop a procedure to transform a partially ordered preference over temporal goals into a computational model, called a preference automaton, which is a semi-automaton with a partial order over acceptance conditions. In the second step, we prove that finding a most preferred policy is equivalent to computing a Pareto-optimal policy in a multi-objective MDP that is constructed from the original MDP, the preference automaton, and the chosen stochastic ordering relation. Throughout the paper, we employ running examples to illustrate the proposed preference specification and solution approaches. We demonstrate the efficacy of our algorithm using these examples, providing detailed analysis, and then discuss several potential future directions.
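
The notion of Pareto optimality used in the second step can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the policy names, the two-objective value vectors, and the helper functions `dominates` and `pareto_front` are all hypothetical, and the sketch only filters a finite set of candidate value vectors rather than solving a multi-objective MDP.

```python
from typing import Dict, List, Tuple


def dominates(u: Tuple[float, ...], v: Tuple[float, ...]) -> bool:
    """Return True if u is at least as good as v in every objective
    and strictly better in at least one (higher is better)."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def pareto_front(values: Dict[str, Tuple[float, ...]]) -> List[str]:
    """Return the names of candidates whose value vector is not dominated
    by any other candidate's value vector, sorted alphabetically."""
    return sorted(
        name
        for name, u in values.items()
        if not any(dominates(v, u) for other, v in values.items() if other != name)
    )


# Hypothetical value vectors: each entry is an assumed probability of
# satisfying one objective under that policy (illustrative numbers only).
candidates = {
    "pi_a": (0.9, 0.2),
    "pi_b": (0.5, 0.7),
    "pi_c": (0.4, 0.6),  # dominated by pi_b
    "pi_d": (0.9, 0.1),  # dominated by pi_a
}

front = pareto_front(candidates)
print(front)
```

In this toy data, `pi_a` and `pi_b` are incomparable with each other, which mirrors how a partially ordered preference can leave several policies equally admissible as "most preferred".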
