Navigation of micro-robot swarms for targeted delivery using reinforcement learning (2306.17598v1)
Abstract: Micro-robotics is quickly emerging as a promising technological solution for many medical treatments, with a focus on targeted drug delivery. Micro-robots are effective when working in swarms, but controlling each robot individually is mostly infeasible owing to their minute size. Controlling many robots with a single controller is therefore important, and artificial intelligence can help perform this task. In this work, we use the reinforcement learning (RL) algorithms Proximal Policy Optimization (PPO) and Robust Policy Optimization (RPO) to navigate swarms of 4, 9 and 16 microswimmers under hydrodynamic effects, controlled by their orientation, towards a circular absorbing target. We evaluate PPO and RPO under limited state information and test their robustness to random target location and size. We use curriculum learning to improve performance, and demonstrate this by learning to navigate a swarm of 25 swimmers and by steering the swarm to exemplify the manoeuvring capabilities of the RL model.
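To make the control setting in the abstract concrete, the following is a minimal Python sketch, not the paper's environment. It assumes each swimmer moves at constant speed along its orientation and is removed once it enters a circular absorbing target. Hydrodynamic interactions are omitted, and a naive "point at the target" rule stands in for the learned PPO/RPO policy. All function names and parameter values (`step_swarm`, `run_episode`, `speed`, `dt`, the grid layout) are hypothetical.

```python
import math


def step_swarm(positions, headings, target, radius, speed=1.0, dt=0.1):
    """Advance every swimmer one time step along its heading.

    Returns the new positions and a flag per swimmer saying whether it
    now lies inside the circular absorbing target.
    Assumption: no hydrodynamic coupling between swimmers.
    """
    tx, ty = target
    new_positions, absorbed = [], []
    for (x, y), theta in zip(positions, headings):
        x += speed * math.cos(theta) * dt
        y += speed * math.sin(theta) * dt
        new_positions.append((x, y))
        absorbed.append(math.hypot(x - tx, y - ty) <= radius)
    return new_positions, absorbed


def heading_to_target(position, target):
    """Orientation (radians) that points straight at the target centre."""
    return math.atan2(target[1] - position[1], target[0] - position[0])


def run_episode(positions, target, radius, max_steps=1000, speed=1.0, dt=0.1):
    """Steer all swimmers greedily; return the step at which the last
    swimmer is absorbed, or None if some remain after max_steps."""
    active = list(positions)
    for step in range(1, max_steps + 1):
        # Placeholder controller: an RL policy would choose these headings.
        headings = [heading_to_target(p, target) for p in active]
        moved, absorbed = step_swarm(active, headings, target, radius, speed, dt)
        # Absorbed swimmers leave the simulation.
        active = [p for p, hit in zip(moved, absorbed) if not hit]
        if not active:
            return step
    return None


# A 2 x 2 grid of four swimmers, the smallest swarm size in the abstract.
swarm = [(float(i), float(j)) for i in range(2) for j in range(2)]
steps = run_episode(swarm, target=(5.0, 5.0), radius=0.5)
print("steps until all swimmers absorbed:", steps)
```

In an RL setup, the greedy heading rule would be replaced by a policy network that maps the (possibly limited) swarm state to orientations. A curriculum could then grow the swarm size or randomise `target` and `radius` across training stages.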