
Path Following and Stabilisation of a Bicycle Model using a Reinforcement Learning Approach (2407.17156v1)

Published 24 Jul 2024 in cs.LG and cs.RO

Abstract: Over the years, complex control approaches have been developed to control the motion of a bicycle. Reinforcement Learning (RL), a branch of machine learning, promises easy deployment of so-called agents. Deployed agents are increasingly considered as an alternative to controllers for mechanical systems. The present work introduces an RL approach to do path following with a virtual bicycle model while simultaneously stabilising it laterally. The bicycle, modelled as the Whipple benchmark model and using multibody system dynamics, has no stabilisation aids. The agent succeeds in both path following and stabilisation of the bicycle model exclusively by outputting steering angles, which are converted into steering torques via a PD controller. Curriculum learning is applied as a state-of-the-art training strategy. Different settings for the implemented RL framework are investigated and compared to each other. The performance of the deployed agents is evaluated using different types of paths and measurements. The ability of the deployed agents to do path following and stabilisation of the bicycle model travelling between 2 m/s and 7 m/s along complex paths including full circles, slalom manoeuvres, and lane changes is demonstrated. Explanatory methods for machine learning are used to analyse the functionality of a deployed agent and link the introduced RL approach with research in the field of bicycle dynamics.
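The abstract describes the agent's action interface: the policy outputs a steering angle, and a PD controller converts it into the steering torque actually applied to the bicycle model. A minimal sketch of such a conversion is shown below; the function name, gain values, and signal names are illustrative assumptions, not the paper's actual implementation.

```python
def pd_steer_torque(delta_cmd: float, delta: float, delta_dot: float,
                    kp: float = 10.0, kd: float = 1.0) -> float:
    """Convert a commanded steering angle into a steering torque.

    delta_cmd : steering angle commanded by the RL agent [rad]
    delta     : current steering angle of the bicycle model [rad]
    delta_dot : current steering rate [rad/s]
    kp, kd    : proportional and derivative gains (illustrative values)
    """
    # Proportional term drives the steering angle toward the command;
    # derivative term damps the steering rate.
    return kp * (delta_cmd - delta) - kd * delta_dot
```

In such a setup the RL agent only has to learn a kinematic quantity (a target angle), while the low-level torque dynamics are handled by the fixed PD loop, which typically eases training compared to direct torque output.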

