
Covariance properties under natural image transformations for the generalized Gaussian derivative model for visual receptive fields (2303.09803v4)

Published 17 Mar 2023 in q-bio.NC, cs.CV, and eess.IV

Abstract: This paper presents a theory for how geometric image transformations can be handled by a first layer of linear receptive fields, in terms of true covariance properties, which, in turn, enable geometric invariance properties at higher levels in the visual hierarchy. Specifically, we develop this theory for a generalized Gaussian derivative model for visual receptive fields, which is derived in an axiomatic manner from first principles that reflect symmetry properties of the environment, complemented by structural assumptions to guarantee internally consistent treatment of image structures over multiple spatio-temporal scales. It is shown how the studied generalized Gaussian derivative model for visual receptive fields obeys true covariance properties under spatial scaling transformations, spatial affine transformations, Galilean transformations and temporal scaling transformations, implying that a vision system, based on image and video measurements in terms of the receptive fields according to this model, can, to first order of approximation, handle the image and video deformations between multiple views of objects delimited by smooth surfaces, as well as between multiple views of spatio-temporal events, under varying relative motions between the objects and events in the world and the observer. We conclude by describing implications of the presented theory for biological vision, regarding connections between the variabilities of the shapes of biological visual receptive fields and the variabilities of spatial and spatio-temporal image structures under natural image transformations.
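As a rough illustration of what covariance under spatial scaling means, the sketch below checks the simplest case numerically: plain Gaussian smoothing of a 1-D signal. It is not taken from the paper; the test signal, grid spacing, and the function names `gaussian_kernel`, `smooth`, and `scale_covariance_error` are assumptions made for this example. If a signal is stretched by a factor S and the standard deviation of the smoothing kernel is stretched by the same factor, the smoothed responses should agree at corresponding points, L'(S x; S σ) = L(x; σ).

```python
import numpy as np

def gaussian_kernel(sigma, dx):
    """Sampled 1-D Gaussian (std sigma), truncated at 6 sigma, normalized to sum 1."""
    radius = int(np.ceil(6.0 * sigma / dx))
    u = np.arange(-radius, radius + 1) * dx
    g = np.exp(-u**2 / (2.0 * sigma**2))
    return g / g.sum()

def smooth(values, sigma, dx):
    """Gaussian smoothing by direct convolution; output is aligned with the input grid."""
    return np.convolve(values, gaussian_kernel(sigma, dx), mode="same")

def scale_covariance_error(S=2.0, sigma=1.5, dx=0.01):
    """Max deviation between L(x; sigma) and L'(S*x; S*sigma) on a central interval."""
    x = np.arange(-40.0, 40.0 + dx / 2, dx)

    # Hypothetical test signal: two Gaussian blobs, effectively zero at the grid borders.
    def f(t):
        return np.exp(-t**2 / 8.0) + 0.5 * np.exp(-(t - 3.0)**2 / 2.0)

    L = smooth(f(x), sigma, dx)                 # original signal, scale sigma
    L_scaled = smooth(f(x / S), S * sigma, dx)  # stretched signal, scale S * sigma

    # Compare at matching points: x in the original domain, S * x in the stretched one.
    mask = np.abs(x) <= 10.0
    L_scaled_at_matching_points = np.interp(S * x[mask], x, L_scaled)
    return float(np.max(np.abs(L_scaled_at_matching_points - L[mask])))

if __name__ == "__main__":
    print("max deviation:", scale_covariance_error())
```

The deviation should be limited by discretization and kernel truncation only. The example covers zero-order smoothing under uniform scaling; the affine, Galilean, and temporal scaling cases named in the abstract, as well as derivative responses, need the matching transformation of the full filter parameters and are not covered here.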
