
PTDRL: Parameter Tuning using Deep Reinforcement Learning (2306.10833v1)

Published 19 Jun 2023 in cs.RO

Abstract: A variety of autonomous navigation algorithms exist that allow robots to move around in a safe and fast manner. However, many of these algorithms require parameter re-tuning when facing new environments. In this paper, we propose PTDRL, a parameter-tuning strategy that adaptively selects from a fixed set of parameters those that maximize the expected reward for a given navigation system. Our learning strategy can be used for different environments, different platforms, and different user preferences. Specifically, we attend to the problem of social navigation in indoor spaces, using a classical motion planning algorithm as our navigation system and training its parameters to optimize its behavior. Experimental results show that PTDRL can outperform other online parameter-tuning strategies.
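The core idea described in the abstract, a deep RL agent that, at each decision step, selects one configuration from a fixed menu of planner parameter sets so as to maximize expected reward, can be illustrated with a small discrete-action Q-learning sketch. The snippet below is a minimal illustration under stated assumptions, not the authors' implementation: the candidate parameter sets, the observation size, and the placeholder observation are hypothetical stand-ins for the robot's real navigation stack and sensor pipeline.

```python
# Minimal sketch of the PTDRL selection idea: a Q-network whose discrete
# actions index a fixed menu of planner parameter sets. PARAM_SETS, OBS_DIM,
# and the random observation below are hypothetical placeholders; this is
# NOT the paper's implementation.
import random
import torch
import torch.nn as nn

# Illustrative DWA-style parameter sets the agent chooses among.
PARAM_SETS = [
    {"max_vel_x": 0.3, "inflation_radius": 0.6},  # cautious
    {"max_vel_x": 0.6, "inflation_radius": 0.4},  # balanced
    {"max_vel_x": 1.0, "inflation_radius": 0.2},  # aggressive
]

OBS_DIM = 32  # assumed size of the environment feature vector


class QNet(nn.Module):
    """Maps an observation to one Q-value per candidate parameter set."""

    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)


def select_param_set(qnet: QNet, obs: torch.Tensor, epsilon: float) -> int:
    """Epsilon-greedy choice over the fixed menu of parameter sets."""
    if random.random() < epsilon:
        return random.randrange(len(PARAM_SETS))
    with torch.no_grad():
        return int(qnet(obs).argmax().item())


if __name__ == "__main__":
    qnet = QNet(OBS_DIM, len(PARAM_SETS))
    obs = torch.randn(OBS_DIM)  # placeholder for a real sensor-derived observation
    action = select_param_set(qnet, obs, epsilon=0.1)
    print("chosen parameter set:", PARAM_SETS[action])
    # In training, the chosen set would be pushed to the planner, a reward
    # observed, and qnet updated with a standard DQN-style TD loss.
```

Treating each parameter configuration as a single discrete action keeps the action space small and lets an off-the-shelf value-based method handle the tuning loop; the navigation system itself stays a black box that is only reconfigured, never retrained.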

