Continuous Time Continuous Space Homeostatic Reinforcement Learning (CTCS-HRRL): Towards Biological Self-Autonomous Agent (2401.08999v1)
Abstract: Homeostasis is a biological process by which living beings maintain their internal balance. Previous research suggests that homeostasis is a learned behaviour. The recently introduced Homeostatic Regulated Reinforcement Learning (HRRL) framework attempts to explain this learned homeostatic behaviour by linking Drive Reduction Theory and Reinforcement Learning. This linkage has been proven in discrete time and space, but not in continuous time and space. In this work, we advance the HRRL framework to a continuous time-space environment and validate the CTCS-HRRL (Continuous Time Continuous Space HRRL) framework. We achieve this by designing a model that mimics the homeostatic mechanisms in a real-world biological agent. This model uses the Hamilton-Jacobi-Bellman equation together with neural-network function approximation and Reinforcement Learning. Through a simulation-based experiment, we demonstrate the efficacy of this model and present evidence of the agent's ability to dynamically choose policies that favour homeostasis in a continuously changing internal-state milieu. Our experimental results demonstrate that the agent learns homeostatic behaviour in a CTCS environment, making CTCS-HRRL a promising framework for modelling animal dynamics and decision-making.
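The link between Drive Reduction Theory and Reinforcement Learning mentioned in the abstract can be illustrated with a minimal sketch. It assumes the drive-function form commonly used in the HRRL literature, where drive is a distance between the internal state and a setpoint and the reward is the reduction in drive between two successive states. The function names, the exponents `n` and `m`, and the example values are illustrative assumptions, not taken from this paper.

```python
def drive(state, setpoint, n=2.0, m=2.0):
    """Drive as a distance of the internal state from its setpoint.

    Assumed form: (sum_i |h*_i - h_i|^n)^(1/m). The exponents n and m
    are hypothetical defaults chosen for illustration.
    """
    total = sum(abs(h_star - h) ** n for h, h_star in zip(state, setpoint))
    return total ** (1.0 / m)


def drive_reduction_reward(state, next_state, setpoint, n=2.0, m=2.0):
    """Reward as the decrease in drive when moving from state to next_state."""
    return drive(state, setpoint, n, m) - drive(next_state, setpoint, n, m)


# Hypothetical two-variable internal state (for example, energy and hydration).
setpoint = (1.0, 1.0)
depleted = (0.0, 0.0)
after_eating = (0.5, 0.5)

reward_towards = drive_reduction_reward(depleted, after_eating, setpoint)
reward_away = drive_reduction_reward(after_eating, depleted, setpoint)
```

Under these assumptions, a transition that moves the internal state towards the setpoint yields a positive reward and the reverse transition yields a negative one, which is the sense in which a reward-maximising agent is driven towards homeostasis.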