Automated mapping of virtual environments with visual predictive coding (2308.10913v2)
Abstract: Humans construct internal cognitive maps of their environment directly from sensory inputs without access to a system of explicit coordinates or distance measurements. While machine learning algorithms like SLAM utilize specialized visual inference procedures to identify visual features and construct spatial maps from visual and odometry data, the general nature of cognitive maps in the brain suggests a unified algorithmic mapping strategy that can generalize to auditory, tactile, and linguistic inputs. Here, we demonstrate that predictive coding provides a natural and versatile neural network algorithm for constructing spatial maps using sensory data. We introduce a framework in which an agent navigates a virtual environment while engaging in visual predictive coding using a self-attention-equipped convolutional neural network. While learning a next-image prediction task, the agent automatically constructs an internal representation of the environment that quantitatively reflects distances. The internal map enables the agent to pinpoint its location relative to landmarks using only visual information. The predictive coding network generates a vectorized encoding of the environment that supports vector navigation, where individual latent space units delineate localized, overlapping neighborhoods in the environment. Broadly, our work introduces predictive coding as a unified algorithmic framework for constructing cognitive maps that can naturally extend to the mapping of auditory, sensorimotor, and linguistic inputs.
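One way to read the claim that the internal representation "quantitatively reflects distances" is that pairwise distances between latent vectors should track pairwise physical distances between the positions where the images were taken. The following is a minimal sketch of such a check, not the paper's method: `toy_encoder` is a hypothetical stand-in for a trained predictive coding network (a fixed linear map plus noise), and the Pearson correlation is an assumed metric chosen for illustration.

```python
import math
import random


def pairwise_distances(points):
    """Euclidean distance for every unordered pair of points."""
    return [
        math.dist(points[i], points[j])
        for i in range(len(points))
        for j in range(i + 1, len(points))
    ]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def toy_encoder(position, rng, noise=0.05):
    """Hypothetical encoder: maps an (x, y) position to a noisy 4-D latent.

    This is an assumption for illustration only; a real agent would encode
    images, not coordinates.
    """
    x, y = position
    clean = (x + y, x - y, 2.0 * x, 2.0 * y)
    return tuple(c + rng.gauss(0.0, noise) for c in clean)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Sample 50 positions in a 10 x 10 arena and encode each one.
positions = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(50)]
latents = [toy_encoder(p, rng) for p in positions]

# Compare physical and latent distances over all pairs of positions.
physical = pairwise_distances(positions)
latent = pairwise_distances(latents)
r = pearson(physical, latent)
print(f"correlation between physical and latent distances: {r:.3f}")
```

A correlation close to 1 would indicate that the latent geometry preserves the environment's metric structure up to scale; with a real network, the latents would come from the encoder's output on images captured at each position.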
Authors: James Gornet, Matthew Thomson