
Biological Neurons Compete with Deep Reinforcement Learning in Sample Efficiency in a Simulated Gameworld (2405.16946v1)

Published 27 May 2024 in q-bio.NC and cs.AI

Abstract: How do biological systems and machine learning algorithms compare in the number of samples required to show significant improvements in completing a task? We compared the learning efficiency of in vitro biological neural networks to the state-of-the-art deep reinforcement learning (RL) algorithms in a simplified simulation of the game `Pong'. Using DishBrain, a system that embodies in vitro neural networks with in silico computation using a high-density multi-electrode array, we contrasted the learning rate and the performance of these biological systems against time-matched learning from three state-of-the-art deep RL algorithms (i.e., DQN, A2C, and PPO) in the same game environment. This allowed a meaningful comparison between biological neural systems and deep RL. We find that when samples are limited to a real-world time course, even these very simple biological cultures outperformed deep RL algorithms across various game performance characteristics, implying a higher sample efficiency. Ultimately, even when tested across multiple types of information input to assess the impact of higher dimensional data input, biological neurons showcased faster learning than all deep reinforcement learning agents.


Summary

  • The paper finds that biological neuron cultures outperform state-of-the-art deep reinforcement learning algorithms like DQN, A2C, and PPO in sample efficiency within a simulated gameworld.
  • Utilizing the DishBrain system, the researchers showed that biological neurons, including human and mouse cortical cells, learned faster and required fewer environmental interactions than the tested RL models.
  • The study suggests that the intrinsic plasticity and adaptive traits of biological systems offer insights for developing more sample-efficient and adaptable artificial learning algorithms, potentially bridging neuroscience and AI.

An Examination of Biological Neurons and Deep Reinforcement Learning in Sample Efficiency

The paper "Biological Neurons Compete with Deep Reinforcement Learning in Sample Efficiency in a Simulated Gameworld" presents a comparative analysis of in vitro biological neural networks and state-of-the-art deep reinforcement learning (RL) algorithms. Using a simplified simulation of the classic game 'Pong', the researchers employed DishBrain, a system that integrates biological neurons with in silico computation through a high-density multi-electrode array. The comparison focused on the learning rate and sample efficiency of these biological systems versus three widely used deep RL algorithms: DQN, A2C, and PPO.
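To make the comparison concrete, the sketch below shows one way such a time-matched baseline could be set up with the stable-baselines3 implementations of DQN, A2C, and PPO. The environment ID "SimplePong-v0" and the sample budget are illustrative placeholders, not details taken from the paper.

```python
# Hypothetical sketch of time-matched training for the three deep RL baselines.
# "SimplePong-v0" is a placeholder for whatever environment reproduces the
# paper's simplified Pong setup; SAMPLE_BUDGET is illustrative, not the
# paper's exact figure.
import gymnasium as gym
from stable_baselines3 import DQN, A2C, PPO

SAMPLE_BUDGET = 20_000  # environment steps allowed within the real-time course


def train_time_matched(algo_cls, env_id="SimplePong-v0"):
    env = gym.make(env_id)
    model = algo_cls("MlpPolicy", env, verbose=0)
    # Cap learning at the same number of environment interactions the
    # biological cultures experienced in wall-clock time.
    model.learn(total_timesteps=SAMPLE_BUDGET)
    return model


agents = {name: train_time_matched(cls)
          for name, cls in [("DQN", DQN), ("A2C", A2C), ("PPO", PPO)]}
```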

Key Findings

When samples are constrained to a real-world time course, the paper reports that biological neural cultures outperform conventional deep RL algorithms across various gameplay metrics. The biological systems not only learned faster but were also more sample efficient, requiring fewer interactions with the environment to reach comparable performance. Notably, both human cortical cells (HCCs) and mouse cortical cells (MCCs) achieved a higher average number of hits per rally and reduced initial faults such as 'aces' (rallies in which the ball is missed without a single paddle hit) more effectively than their RL counterparts.
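A minimal sketch of how the two gameplay metrics mentioned above could be computed from logged rally data follows; the dictionary-based log format is an assumption for illustration, not the paper's actual logging scheme.

```python
# Hypothetical helper for the two gameplay metrics: average hits-per-rally and
# the fraction of rallies ending as an "ace" (no paddle hit at all).
from statistics import mean


def summarise_rallies(rallies):
    """rallies: iterable of {"hits": int} records, one per rally."""
    hits = [r["hits"] for r in rallies]
    if not hits:
        return {"mean_hits_per_rally": 0.0, "ace_rate": 0.0}
    return {
        "mean_hits_per_rally": mean(hits),
        "ace_rate": sum(h == 0 for h in hits) / len(hits),
    }


# Example: three rallies with 4, 0 and 2 paddle hits.
print(summarise_rallies([{"hits": 4}, {"hits": 0}, {"hits": 2}]))
```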

Methodological Insights

The research introduces several input designs for the RL algorithms to approximate the conditions faced by the biological neurons. These include the traditional Image Input as well as control conditions mimicking the biological setup, such as Paddle and Ball Position Inputs. Even though reducing the dimensionality of the input data should, in principle, improve the RL algorithms' sample efficiency, the biological systems still maintained an edge in performance. This superior learning is attributed to the intrinsic plasticity and adaptive characteristics of neuronal systems.
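The difference between the two observation styles can be illustrated with Gymnasium observation spaces; the frame size, normalisation, and exact vector layout below are assumptions, not the paper's specification.

```python
# Hypothetical illustration of the two observation styles: a full game-frame
# "Image Input" versus a low-dimensional vector of paddle and ball positions.
import numpy as np
from gymnasium import spaces

# Image Input: the rendered frame of the simplified Pong field.
image_obs_space = spaces.Box(low=0, high=255, shape=(84, 84, 1), dtype=np.uint8)

# Paddle and Ball Position Input: normalised coordinates only, roughly
# analogous to the place-coded stimulation delivered to the cultures.
vector_obs_space = spaces.Box(
    low=0.0, high=1.0, shape=(3,), dtype=np.float32
)  # [paddle_y, ball_x, ball_y]
```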

The paper also highlights the disproportionate computational power demands of deep RL algorithms compared to biological systems, emphasizing the efficiency of the latter not only in learning speed but also in energy usage.

Implications and Future Directions

The results underscore the potential of biological neural networks as viable learning machines, offering insights that could be translated into more efficient artificial learning systems. While current RL models can achieve high performance in static environments over extended training periods, this paper highlights the need for models with improved sample efficiency and adaptability to dynamic contexts.

Biological systems inherently possess mechanisms for rapid adaptability and learning, as evidenced by their neuroplasticity. Future research may explore bio-inspired algorithms that mirror these traits, potentially revolutionizing the AI field. Techniques such as synaptic plasticity, predictive coding, and active inference could offer frameworks for the next generation of RL algorithms, bridging the gap between biological learning and artificial intelligence.
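As a toy illustration of the predictive-coding idea mentioned above (not something implemented in the paper), the following sketch shows a purely local, prediction-error-driven weight update, in contrast to end-to-end backpropagation. All dimensions, the learning rate, and the update rule itself are illustrative assumptions.

```python
# Toy predictive-coding-style update: predict the input from a latent code,
# then adjust generative weights with a local, Hebbian-like error signal.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(4, 8))   # generative weights: latent -> observation
LR = 0.01


def predictive_coding_step(x, z):
    """One local update: predict x (size 8) from latent z (size 4)."""
    global W
    prediction = z @ W                   # top-down prediction of the input
    error = x - prediction               # local prediction error
    W += LR * np.outer(z, error)         # purely local weight change
    return float(np.mean(error ** 2))    # squared prediction error


# Example: a random latent code and input vector.
print(predictive_coding_step(rng.normal(size=8), rng.normal(size=4)))
```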

In conclusion, while deep RL algorithms have shown prowess in controlled settings, the adaptability and efficiency inherent in biological neurons present an exciting avenue for further exploration. The intersection of neuroscience and machine learning, as highlighted by this paper, holds significant promise for advancing both theoretical understanding and practical implementations in AI.
