Navigation of micro-robot swarms for targeted delivery using reinforcement learning (2306.17598v1)

Published 30 Jun 2023 in cs.RO, cs.AI, and cs.LG

Abstract: Microrobotics is quickly emerging as a promising technological solution for many medical treatments, with a focus on targeted drug delivery. Microrobots are most effective when working in swarms, since individual control is largely infeasible owing to their minute size. Controlling many robots with a single controller is therefore essential, and artificial intelligence can help perform this task successfully. In this work, we use the reinforcement learning (RL) algorithms Proximal Policy Optimization (PPO) and Robust Policy Optimization (RPO) to navigate swarms of 4, 9, and 16 microswimmers under hydrodynamic effects, controlled through their orientation, towards a circular absorbing target. We compare PPO and RPO performance under limited state information and test their robustness to random target locations and sizes. We use curriculum learning to improve performance, demonstrating it by learning to navigate a swarm of 25 swimmers and by steering the swarm to exemplify the manoeuvring capabilities of the RL model.
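
The abstract names PPO and RPO as the two policy-gradient algorithms used to steer the swarm. The sketch below illustrates, in PyTorch, how the two relate: RPO keeps PPO's clipped surrogate objective but perturbs the mean of the Gaussian action distribution with uniform noise during the policy update. This is a minimal illustrative sketch, not the paper's actual code; the observation/action dimensions, the network sizes, and the tensors `obs`, `actions`, `advantages`, and `old_log_probs` (assumed to come from some swarm-navigation rollout) are all hypothetical.

```python
import torch
import torch.nn as nn
from torch.distributions import Normal


class GaussianActor(nn.Module):
    """Maps a swarm observation to a Gaussian over the swimmers' steering angles."""

    def __init__(self, obs_dim: int, act_dim: int, rpo_alpha: float = 0.0):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.Tanh(),
            nn.Linear(64, 64), nn.Tanh(),
            nn.Linear(64, act_dim),
        )
        self.log_std = nn.Parameter(torch.zeros(act_dim))
        self.rpo_alpha = rpo_alpha  # alpha = 0 recovers plain PPO

    def dist(self, obs: torch.Tensor, perturb: bool) -> Normal:
        mean = self.net(obs)
        if perturb and self.rpo_alpha > 0:
            # RPO's modification: jitter the Gaussian mean with uniform
            # noise in [-alpha, alpha] when re-evaluating the policy.
            mean = mean + torch.empty_like(mean).uniform_(
                -self.rpo_alpha, self.rpo_alpha
            )
        return Normal(mean, self.log_std.exp())


def actor_loss(actor, obs, actions, advantages, old_log_probs, clip=0.2):
    """Clipped PPO surrogate; with perturb=True and alpha > 0 it becomes RPO."""
    dist = actor.dist(obs, perturb=True)
    log_probs = dist.log_prob(actions).sum(-1)
    ratio = (log_probs - old_log_probs).exp()
    surrogate = torch.min(
        ratio * advantages,
        torch.clamp(ratio, 1.0 - clip, 1.0 + clip) * advantages,
    )
    return -surrogate.mean()
```

In the RPO formulation, actions are sampled at rollout time from the unperturbed distribution (perturb=False); the mean perturbation is applied only when log-probabilities are recomputed during the update, which keeps the effective entropy of the policy from collapsing as training progresses.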

