Emergence of Chemotactic Strategies with Multi-Agent Reinforcement Learning (2404.01999v1)
Abstract: Reinforcement learning (RL) is a flexible and efficient method for programming micro-robots in complex environments. Here we investigate whether reinforcement learning can provide insights into biological systems when trained to perform chemotaxis, namely, whether we can learn how intelligent agents process the information available to them in order to swim towards a target. We run simulations covering a range of agent shapes, sizes, and swim speeds to determine whether the physical constraints on biological swimmers, namely Brownian motion, lead to regions where the reinforcement learners' training fails. We find that the RL agents can perform chemotaxis as soon as it is physically possible and, in some cases, even before active swimming overpowers the stochastic environment. We study the efficiency of the emergent policy and identify convergence in agent size and swim speed. Finally, we study the strategy adopted by the reinforcement learning algorithm to explain how the agents perform their tasks. To this end, we identify three dominant emergent strategies and several rare approaches. These strategies, whilst producing almost identical trajectories in simulation, are distinct and give insight into the possible mechanisms by which biological agents explore their environment and respond to changing conditions.
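The competition between active swimming and Brownian motion mentioned in the abstract can be illustrated with a rough back-of-the-envelope sketch. The code below is not the paper's method. It assumes a spherical agent in water at 300 K, uses the standard Stokes-Einstein and Stokes-Einstein-Debye relations for the diffusion coefficients, and adopts one common convention for the Péclet number, Pe = v r / D_t. The function names, default parameter values, and the choice of Péclet convention are all assumptions made for illustration; the paper may define its threshold differently.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def translational_diffusion(radius, temperature=300.0, viscosity=1.0e-3):
    """Stokes-Einstein translational diffusion coefficient of a sphere (m^2/s).

    radius in m, temperature in K, viscosity in Pa*s (default roughly water).
    """
    return K_B * temperature / (6.0 * math.pi * viscosity * radius)


def rotational_diffusion(radius, temperature=300.0, viscosity=1.0e-3):
    """Stokes-Einstein-Debye rotational diffusion coefficient of a sphere (1/s)."""
    return K_B * temperature / (8.0 * math.pi * viscosity * radius**3)


def peclet_number(radius, swim_speed, temperature=300.0, viscosity=1.0e-3):
    """Ratio of advective to diffusive transport, Pe = v * r / D_t.

    This is one common convention (an assumption here); values well above 1
    suggest that directed swimming dominates over translational diffusion.
    """
    d_t = translational_diffusion(radius, temperature, viscosity)
    return swim_speed * radius / d_t


# Hypothetical sweep over agent radii at a fixed, illustrative swim speed.
swim_speed = 1.0e-6  # m/s, an arbitrary example value
for radius_um in (0.1, 0.5, 1.0, 2.0):
    radius = radius_um * 1.0e-6  # convert micrometres to metres
    pe = peclet_number(radius, swim_speed)
    d_r = rotational_diffusion(radius)
    print(f"r = {radius_um:4.1f} um  Pe = {pe:8.3f}  D_r = {d_r:10.4f} 1/s")
```

Under these assumptions the Péclet number grows with the square of the radius at fixed swim speed, while the rotational diffusion coefficient falls with the cube of the radius, so smaller agents lose their heading much faster. This is only a qualitative guide to why small, slow swimmers are the hardest regime for chemotaxis.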