
The Psychophysics of Human Three-Dimensional Active Visuospatial Problem-Solving (2306.11756v1)

Published 19 Jun 2023 in q-bio.NC and cs.CV

Abstract: Our understanding of how visual systems detect, analyze and interpret visual stimuli has advanced greatly. However, the visual systems of all animals do much more; they enable visual behaviours. How well the visual system performs while interacting with the visual environment and how vision is used in the real world have not been well studied, especially in humans. It has been suggested that comparison is the most primitive of psychophysical tasks. Thus, as a probe into these active visual behaviours, we use a same-different task: are two physical 3D objects visually the same? This task seems to be a fundamental cognitive ability. We pose this question to human subjects who are free to move about and examine two real objects in an actual 3D space. Past work has dealt solely with a 2D static version of this problem. We have collected detailed, first-of-its-kind data of humans performing a visuospatial task in hundreds of trials. Strikingly, humans are remarkably good at this task without any training, with a mean accuracy of 93.82%. No learning effect was observed on accuracy after many trials, but some effect was seen for response time, number of fixations and extent of head movement. Subjects demonstrated a variety of complex strategies involving a range of movement and eye fixation changes, suggesting that solutions were developed dynamically and tailored to the specific task.

