Formal Ethical Obligations in Reinforcement Learning Agents: Verification and Policy Updates (2408.00147v1)
Abstract: When designing agents for operation in uncertain environments, designers need tools to automatically reason about what agents ought to do, how that conflicts with what is actually happening, and how a policy might be modified to remove the conflict. These obligations include ethical and social obligations, permissions, and prohibitions, which constrain how the agent achieves its mission and executes its policy. We propose a new deontic logic, Expected Act Utilitarian deontic logic, to enable this reasoning at design time: it supports specifying and verifying the agent's strategic obligations, and then modifying its policy from a reference policy to meet those obligations. Unlike approaches that work at the reward level, working at the logical level increases the transparency of the trade-offs. We introduce two algorithms: one for model-checking whether an RL agent has the right strategic obligations, and one for modifying a reference decision policy so that it meets the obligations expressed in our logic. We illustrate our algorithms on DAC-MDPs, which accurately abstract neural decision policies, and on toy gridworld environments.
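The abstract's verify-then-repair workflow can be illustrated with a minimal sketch. This is not the paper's actual algorithm or logic: it only shows the general shape of the idea on a toy chain MDP, where a prohibition is checked against a reference policy obtained by value iteration, and the policy is then repaired by masking the prohibited state-action pairs and re-solving. All names (`solve`, `prohibited`, the MDP itself) are hypothetical.

```python
# Hedged sketch (not the paper's algorithms): check a simple prohibition
# against a reference policy on a toy chain MDP, then repair the policy
# by masking the prohibited state-action pairs and re-solving.
import numpy as np

n_states, n_actions, gamma = 5, 2, 0.9
# Deterministic chain: action 0 moves right, action 1 stays put.
P = np.zeros((n_states, n_actions, n_states))
for s in range(n_states):
    P[s, 0, min(s + 1, n_states - 1)] = 1.0
    P[s, 1, s] = 1.0
R = np.zeros((n_states, n_actions))
R[n_states - 2, 0] = 1.0  # reward for stepping into the last state

def solve(P, R, mask, iters=500):
    """Value iteration restricted to the actions allowed by `mask`."""
    Q = np.zeros((n_states, n_actions))
    for _ in range(iters):
        V = np.where(mask, Q, -np.inf).max(axis=1)
        Q = R + gamma * P @ V
    return np.where(mask, Q, -np.inf).argmax(axis=1)

allow_all = np.ones((n_states, n_actions), dtype=bool)
reference = solve(P, R, allow_all)

# A toy prohibition: the agent must never take action 0 in state 2.
prohibited = {(2, 0)}
violations = [(s, a) for s, a in enumerate(reference) if (s, a) in prohibited]

# Repair: forbid the prohibited pairs and re-solve for a compliant policy.
mask = allow_all.copy()
for s, a in prohibited:
    mask[s, a] = False
repaired = solve(P, R, mask)
```

Here the reference policy violates the prohibition (moving right through state 2 is reward-optimal), and the repaired policy trades expected utility for compliance, which is the kind of trade-off the paper argues is made transparent at the logical rather than the reward level.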