Reinforcement Learning Controllers for Soft Robots using Learned Environments (2410.18519v2)
Abstract: Soft robotic manipulators offer operational advantages due to their compliant and deformable structures. However, their inherently nonlinear dynamics present substantial challenges: traditional analytical methods often depend on simplifying assumptions, while learning-based techniques can be computationally demanding and limit control policies to existing data. This paper introduces a novel approach to soft robotic control, leveraging state-of-the-art policy gradient methods within parallelizable synthetic environments learned from data. We also propose a safety-oriented actuation-space exploration protocol via cascaded updates and weighted randomness. Specifically, our recurrent forward dynamics model is learned from a training dataset generated by a physically safe mean-reverting random walk in actuation space, which explores the partially observed state space. We demonstrate a reinforcement learning approach to closed-loop control through state-of-the-art actor-critic methods, which efficiently learn high-performance behaviour over long horizons. This approach removes the need for any prior knowledge of the robot's operation or capabilities and sets the stage for a comprehensive benchmarking tool in soft robotics control.
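The safe exploration protocol described above can be illustrated with a minimal sketch of a mean-reverting (Ornstein-Uhlenbeck-style) random walk in actuation space. This is a generic reconstruction, not the paper's implementation: the function name and all parameters (`mu`, `theta`, `sigma`, the actuator bounds) are illustrative assumptions.

```python
import numpy as np

def mean_reverting_walk(n_steps, n_actuators, mu=0.5, theta=0.1,
                        sigma=0.05, low=0.0, high=1.0, seed=0):
    """Generate an actuation sequence via a mean-reverting random walk.

    The drift term pulls each actuator back toward a safe mean value,
    while clipping keeps commands inside the actuation limits, so the
    robot never receives a sustained extreme input. All parameter
    values here are illustrative placeholders.
    """
    rng = np.random.default_rng(seed)
    a = np.full(n_actuators, mu)            # start at the mean actuation
    traj = np.empty((n_steps, n_actuators))
    for t in range(n_steps):
        drift = theta * (mu - a)            # pull back toward the mean
        noise = sigma * rng.standard_normal(n_actuators)
        a = np.clip(a + drift + noise, low, high)  # respect limits
        traj[t] = a
    return traj

# Example: 1000 exploration steps for a 3-actuator segment.
traj = mean_reverting_walk(1000, 3)
```

Because the drift scales with the distance from the mean, long excursions toward the actuation limits are statistically discouraged, which is what makes this kind of walk attractive for collecting training data on physical hardware.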