Safe Explicable Planning (2304.03773v4)
Abstract: Human expectations arise from their understanding of others and the world. In the context of human-AI interaction, this understanding may not align with reality, leading the AI agent to fail to meet expectations and compromising team performance. Explicable planning, introduced as a method to bridge this gap, aims to reconcile human expectations with the agent's optimal behavior, facilitating interpretable decision-making. However, a critical issue remains unresolved: explicable planning can produce behaviors that conform to human expectations yet are unsafe. To address this, we propose Safe Explicable Planning (SEP), which extends prior work to support the specification of a safety bound. The goal of SEP is to find behaviors that align with human expectations while adhering to the specified safety criterion. Our approach generalizes multi-objective planning by considering objectives that stem from multiple models rather than a single model, yielding a Pareto set of safe explicable policies. We present both an exact method, guaranteed to find the Pareto set, and a more efficient greedy method that finds a single policy in the Pareto set. Additionally, we offer approximate solutions based on state aggregation to improve scalability. We provide formal proofs of the theoretical properties of these methods. Evaluation through simulations and physical robot experiments confirms the effectiveness of our approach for safe explicable planning.
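To make the abstract's core idea concrete, below is a minimal Python sketch of safety-bounded Pareto filtering over candidate policies. It assumes each policy has already been evaluated under two models: the agent's own model (governing safety) and the human's expectation model (governing explicability). The names `PolicyEval`, `safe_explicable_pareto`, `agent_value`, and `human_value` are hypothetical illustrations, and the toy enumeration here does not reproduce the paper's exact or greedy MDP algorithms.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PolicyEval:
    """Hypothetical container: a policy's value under the agent's own
    model and under the human's expectation model."""
    name: str
    agent_value: float   # value under the agent's model (safety objective)
    human_value: float   # value under the human's model (explicability objective)

def safe_explicable_pareto(evals, safety_bound):
    """Keep policies whose agent-model value meets the safety bound,
    then Pareto-filter on the two objectives. Illustrative only."""
    safe = [e for e in evals if e.agent_value >= safety_bound]
    pareto = []
    for e in safe:
        # e is dominated if some other safe policy is at least as good
        # on both objectives and strictly better on one.
        dominated = any(
            o.agent_value >= e.agent_value and o.human_value >= e.human_value
            and (o.agent_value > e.agent_value or o.human_value > e.human_value)
            for o in safe
        )
        if not dominated:
            pareto.append(e)
    return pareto

if __name__ == "__main__":
    candidates = [
        PolicyEval("optimal", agent_value=10.0, human_value=2.0),
        PolicyEval("explicable", agent_value=6.0, human_value=9.0),
        PolicyEval("unsafe_explicable", agent_value=3.0, human_value=10.0),
    ]
    # With a safety bound of 5, the most explicable policy is pruned
    # because it violates the bound; the remaining two are incomparable.
    for e in safe_explicable_pareto(candidates, safety_bound=5.0):
        print(e.name, e.agent_value, e.human_value)
```

The example illustrates the abstract's central trade-off: the unsafe-but-explicable policy is excluded by the safety bound, while the optimal and explicable policies survive as incomparable members of the Pareto set.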