Event Tables for Efficient Experience Replay (2211.00576v2)

Published 1 Nov 2022 in cs.LG and cs.AI

Abstract: Experience replay (ER) is a crucial component of many deep reinforcement learning (RL) systems. However, uniform sampling from an ER buffer can lead to slow convergence and unstable asymptotic behaviors. This paper introduces Stratified Sampling from Event Tables (SSET), which partitions an ER buffer into Event Tables, each capturing important subsequences of optimal behavior. We prove a theoretical advantage over the traditional monolithic buffer approach and combine SSET with an existing prioritized sampling strategy to further improve learning speed and stability. Empirical results in challenging MiniGrid domains, benchmark RL environments, and a high-fidelity car racing simulator demonstrate the advantages and versatility of SSET over existing ER buffer sampling approaches.
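The abstract outlines the core mechanism: the ER buffer is partitioned into a default table plus event tables, each storing the subsequence of transitions that led up to a designated event, and mini-batches are drawn in fixed proportions from the tables rather than uniformly from one monolithic buffer. The sketch below illustrates that idea only; the class name, the history-window length, the sampling split, and the FIFO tables are illustrative assumptions, not the authors' implementation.

# Minimal sketch of Stratified Sampling from Event Tables (SSET), based only on
# the abstract. All parameters and names below are illustrative assumptions.
import random
from collections import deque


class SSETBuffer:
    def __init__(self, capacity, event_conditions, history_len=10, event_fraction=0.5):
        # One FIFO table per event condition, plus a default table for all transitions.
        self.default = deque(maxlen=capacity)
        self.tables = [deque(maxlen=capacity) for _ in event_conditions]
        self.conditions = event_conditions          # predicates on a transition
        self.history = deque(maxlen=history_len)    # trailing window of recent transitions
        self.event_fraction = event_fraction        # share of a batch drawn from event tables

    def add(self, transition):
        self.history.append(transition)
        self.default.append(transition)
        # When a transition triggers an event, copy the recent subsequence
        # leading up to it into that event's table.
        for cond, table in zip(self.conditions, self.tables):
            if cond(transition):
                table.extend(self.history)

    def sample(self, batch_size):
        # Stratified sampling: a fixed fraction of the batch comes from the
        # non-empty event tables (split evenly), the remainder from the default table.
        active = [t for t in self.tables if t]
        batch = []
        if active:
            per_table = int(batch_size * self.event_fraction) // len(active)
            for t in active:
                batch += random.choices(list(t), k=per_table)
        batch += random.choices(list(self.default), k=batch_size - len(batch))
        return batch


if __name__ == "__main__":
    # Toy usage: transitions are (state, action, reward, next_state, done) tuples;
    # the single event condition marks goal-reaching steps (reward > 0).
    buf = SSETBuffer(capacity=1000, event_conditions=[lambda tr: tr[2] > 0])
    for step in range(200):
        reward = 1.0 if step % 50 == 49 else 0.0
        buf.add((step, 0, reward, step + 1, False))
    print(len(buf.sample(32)), "transitions sampled")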
