Retinotopic Mapping Enhances the Robustness of Convolutional Neural Networks (2402.15480v2)
Abstract: Foveated vision, a trait shared by many animals, including humans, has not been fully utilized in machine learning applications, despite its significant contributions to biological visual function. This study investigates whether retinotopic mapping, a critical component of foveated vision, can enhance image categorization and localization performance when integrated into deep convolutional neural networks (CNNs). Retinotopic mapping was applied to the inputs of standard off-the-shelf CNNs, which were then retrained on the ImageNet task. As expected, the log-polar mapping improved the networks' ability to handle arbitrary image zooms and rotations, particularly for isolated objects. Surprisingly, the retinotopically mapped networks achieved classification performance comparable to that of the unmodified networks. Furthermore, they showed improved classification localization when the foveated center of the transform was shifted. This replicates a crucial ability of the human visual system that is absent in typical CNNs. These findings suggest that retinotopic mapping may be fundamental to significant preattentive visual processes.
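The link between log-polar mapping and robustness to zooms and rotations can be illustrated with a small coordinate-level sketch. It is not the paper's implementation, which resamples whole images before they reach the CNN. The function names `to_log_polar` and `zoom_and_rotate` are hypothetical and chosen only for this illustration. The sketch shows that zooming and rotating a point about the fixation center amounts to a pure translation in log-polar coordinates, which is the kind of change a convolutional layer tolerates well.

```python
import math


def to_log_polar(x, y, center=(0.0, 0.0)):
    """Map a Cartesian point to log-polar coordinates (rho, theta) around `center`.

    rho is the natural log of the distance to the center (eccentricity),
    theta is the polar angle in radians, in (-pi, pi].
    """
    dx, dy = x - center[0], y - center[1]
    r = math.hypot(dx, dy)
    if r == 0.0:
        # log(0) is undefined: the fixation point itself has no log-polar image.
        raise ValueError("point coincides with the center of the transform")
    return math.log(r), math.atan2(dy, dx)


def zoom_and_rotate(x, y, scale, angle, center=(0.0, 0.0)):
    """Zoom by `scale` and rotate by `angle` (radians) about `center`."""
    dx, dy = x - center[0], y - center[1]
    c, s = math.cos(angle), math.sin(angle)
    return (center[0] + scale * (c * dx - s * dy),
            center[1] + scale * (s * dx + c * dy))


# Illustrative values (assumptions, not taken from the paper).
scale, angle = 2.0, math.pi / 6
rho0, theta0 = to_log_polar(3.0, 1.0)
x1, y1 = zoom_and_rotate(3.0, 1.0, scale, angle)
rho1, theta1 = to_log_polar(x1, y1)

# In log-polar space the zoom shows up as a shift of log(scale) along rho,
# and the rotation as a shift of `angle` along theta.
shift_rho = rho1 - rho0
shift_theta = theta1 - theta0
```

In this example `shift_rho` equals `math.log(scale)` and `shift_theta` equals `angle` up to floating-point error. The angular shift wraps around at ±π in general, so a full image-level version would treat the theta axis as circular.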