Environmental variables driving latent dynamics and value estimation

Identify and quantify the specific environmental variables in the ViZDoom-based foraging task that drive the latent neural activity (i.e., the agent’s internal state representation) and the learned value function of the trained agents’ neural networks, using methods from neural population dynamics to isolate these drivers of representation and valuation.

Background

Within the presented deep reinforcement learning framework, agents perceive a 3‑D ViZDoom environment through a biologically inspired vision model and act to maximize survival on a foraging task. The agents’ internal latent states (from feedforward or recurrent architectures) and their learned value functions exhibit structured dynamics that relate to the environment and task demands.

Analyses in the paper show that satiety and time to next consumption explain a substantial portion of the variance in the estimated value for some architectures, whereas recurrent agents appear to rely on additional, task-relevant latent variables. The authors explicitly pose as an open question the need to isolate which environmental variables drive these latent dynamics and value estimates, proposing methods from neural population dynamics as a path forward.
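The variance-explained analysis described above can be sketched as an ordinary least-squares regression of the agent's value estimates onto candidate environmental variables, reporting R². The arrays below (satiety, time to next consumption, value) are hypothetical stand-ins for logged agent data, not the paper's dataset:

```python
import numpy as np

def variance_explained(value, predictors):
    """R^2 of an OLS fit of value estimates onto environmental variables."""
    # Design matrix with an intercept column.
    X = np.column_stack([np.ones(len(value)), *predictors])
    coef, *_ = np.linalg.lstsq(X, value, rcond=None)
    residuals = value - X @ coef
    return 1.0 - residuals.var() / value.var()

# Hypothetical stand-ins for variables logged during a foraging episode.
rng = np.random.default_rng(0)
satiety = rng.uniform(0.0, 1.0, 1000)
time_to_next = rng.exponential(5.0, 1000)
value = 2.0 * satiety - 0.3 * time_to_next + rng.normal(0.0, 0.1, 1000)

r2 = variance_explained(value, [satiety, time_to_next])
print(f"R^2 = {r2:.3f}")
```

The same function applied to recurrent agents would presumably leave more residual variance, consistent with those agents relying on additional latent variables.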

References

In future work we aim to address at least two open questions from this study. Firstly, we aim to better isolate the environmental variables that drive the latent activity and value functions of our agents using methods for analyzing neural population dynamics~\citep{whiteway_quest_2019}.

A computational approach to visual ecology with deep reinforcement learning (2402.05266 - Sokoloski et al., 7 Feb 2024) in Discussion
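Methods for analyzing neural population dynamics, as cited above, typically begin with dimensionality reduction of the latent activity; one can then color the low-dimensional trajectories by environmental variables to see which of them the dynamics track. A minimal PCA sketch, with hypothetical array shapes standing in for the agents' recorded latent states:

```python
import numpy as np

def top_pcs(latents, k=2):
    """Project latent states onto their top-k principal components."""
    centered = latents - latents.mean(axis=0)
    # SVD of the centered data matrix yields the principal axes in vt.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T

# Hypothetical latent trajectory: 500 time steps of a 32-dim latent state.
rng = np.random.default_rng(1)
latents = rng.normal(size=(500, 32))
proj = top_pcs(latents, k=2)
print(proj.shape)  # (500, 2)
```

Each row of `proj` is one time step; plotting these points against satiety or time to next consumption would be one way to isolate the environmental drivers of the latent dynamics.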