Latent Noise Segmentation: How Neural Noise Leads to the Emergence of Segmentation and Grouping (2309.16515v3)

Published 28 Sep 2023 in cs.CV

Abstract: Humans are able to segment images effortlessly without supervision using perceptual grouping. Here, we propose a counter-intuitive computational approach to solving unsupervised perceptual grouping and segmentation: that they arise because of neural noise, rather than in spite of it. We (1) mathematically demonstrate that, under realistic assumptions, neural noise can be used to separate objects from each other; (2) show that adding noise in a DNN enables the network to segment images even though it was never trained on any segmentation labels; and (3) show that segmenting objects using noise results in segmentation performance that aligns with the perceptual grouping phenomena observed in humans and is sample-efficient. We introduce the Good Gestalt (GG) datasets -- six datasets designed specifically to test perceptual grouping -- and show that our DNN models reproduce many important phenomena in human perception, such as illusory contours, closure, continuity, proximity, and occlusion. Finally, we (4) show that our model improves performance on our GG datasets compared to other tested unsupervised models by $24.9\%$. Together, our results suggest a novel unsupervised segmentation method requiring few assumptions, a new explanation for the formation of perceptual grouping, and a novel potential benefit of neural noise.
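The idea in point (2) of the abstract can be illustrated with a toy sketch. This is not the paper's model or code. It rests on several assumptions made only for illustration: a linear autoencoder (PCA computed with NumPy) stands in for the DNN, the "scene" is a 4x4 image whose left and right halves act as two objects with independently varying brightness, and pixels are grouped by how strongly their responses to latent noise correlate. All names (`segment_with_latent_noise`, `mask_a`, `sigma`, and so on) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy scene (assumption): left and right halves of a 4x4 image are two
# "objects" whose brightness varies independently across training images.
H, W = 4, 4
mask_a = np.zeros((H, W))
mask_a[:, :2] = 1.0
mask_b = 1.0 - mask_a
true_labels = mask_b.ravel().astype(int)  # 0 = left object, 1 = right object

# Unlabelled training images: per-object brightness plus small pixel noise.
n_train = 500
a = rng.uniform(0.2, 1.0, size=(n_train, 1))
b = rng.uniform(0.2, 1.0, size=(n_train, 1))
X = a * mask_a.ravel() + b * mask_b.ravel()
X += 0.01 * rng.standard_normal(X.shape)

# Stand-in for a trained autoencoder: a 2-D linear code obtained by PCA.
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
basis = Vt[:2]  # shape (2, n_pixels)


def encode(x):
    return (x - mean) @ basis.T


def decode(z):
    return z @ basis + mean


def segment_with_latent_noise(x, n_samples=200, sigma=0.1):
    """Group pixels by how their reconstructions co-vary under latent noise."""
    z = encode(x)
    clean = decode(z)
    noise = sigma * rng.standard_normal((n_samples, z.shape[-1]))
    deltas = decode(z + noise) - clean   # (n_samples, n_pixels)
    corr = np.corrcoef(deltas.T)         # pixel-by-pixel correlation
    # Pixels that move together with pixel 0 get label 0, the rest label 1.
    return (corr[0] < 0.5).astype(int)


labels = segment_with_latent_noise(X[0])
print(labels.reshape(H, W))
```

In this toy setting, no segmentation labels are used at any point: the grouping comes only from which pixels change together when the latent code is perturbed. The correlation threshold of 0.5 and the two-cluster rule are simplifications chosen for the sketch; a realistic setup would need a learned nonlinear model and a proper clustering step.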
