
Numeric Reward Machines (2404.19370v1)

Published 30 Apr 2024 in cs.AI and cs.LG

Abstract: Reward machines inform reinforcement learning agents about the reward structure of the environment and often drastically speed up the learning process. However, reward machines only accept Boolean features such as robot-reached-gold. Consequently, many inherently numeric tasks cannot profit from the guidance offered by reward machines. To address this gap, we aim to extend reward machines with numeric features such as distance-to-gold. For this, we present two types of reward machines: numeric-Boolean and numeric. In a numeric-Boolean reward machine, distance-to-gold is emulated by two Boolean features distance-to-gold-decreased and robot-reached-gold. In a numeric reward machine, distance-to-gold is used directly alongside the Boolean feature robot-reached-gold. We compare our new approaches to a baseline reward machine in the Craft domain, where the numeric feature is the agent-to-target distance. We use cross-product Q-learning, Q-learning with counter-factual experiences, and the options framework for learning. Our experimental results show that our new approaches significantly outperform the baseline approach. Extending reward machines with numeric features opens up new possibilities of using reward machines in inherently numeric tasks.
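As a rough illustration of the distinction the abstract draws, the sketch below derives the two Boolean features from a numeric distance and contrasts a reward based on those features with a reward that uses the distance directly. It is not the paper's implementation: the grid coordinates, the Manhattan distance, the function names, and every reward value are assumptions chosen only to make the example concrete.

```python
# Illustrative sketch only. Function names, the distance metric, and all
# reward magnitudes are hypothetical and are not taken from the paper.

def manhattan(a, b):
    """Agent-to-target distance on a grid (assumed metric)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def boolean_features(prev_dist, dist):
    """Emulate the numeric feature distance-to-gold with two Boolean features."""
    return {
        "distance-to-gold-decreased": dist < prev_dist,
        "robot-reached-gold": dist == 0,
    }


def numeric_boolean_reward(prev_dist, dist):
    """Reward that sees only the Boolean features (hypothetical values)."""
    features = boolean_features(prev_dist, dist)
    if features["robot-reached-gold"]:
        return 1.0
    return 0.1 if features["distance-to-gold-decreased"] else 0.0


def numeric_reward(dist):
    """Reward that uses distance-to-gold directly (hypothetical values)."""
    if dist == 0:  # Boolean feature robot-reached-gold
        return 1.0
    return -0.01 * dist


if __name__ == "__main__":
    gold = (3, 0)
    path = [(0, 0), (1, 0), (1, 1), (2, 0), (3, 0)]
    for prev, cur in zip(path, path[1:]):
        d_prev, d_cur = manhattan(prev, gold), manhattan(cur, gold)
        print(cur, numeric_boolean_reward(d_prev, d_cur), numeric_reward(d_cur))
```

In this sketch the numeric-Boolean variant only learns whether a step brought the agent closer, while the numeric variant also sees how far away the target still is.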

