Scaling the dual-stream model to naturalistic images
Determine whether the dual-stream recurrent neural network, which integrates foveated glimpse contents (ventral stream) with gaze positions (dorsal stream), learns a spatial target map, and reads out numerosity, can be used to help count objects in naturalistic images such as 3D tabletop scenes.
References
One key outstanding question, which we leave for future work, is whether the approach described here could be used to help counting in naturalistic images (e.g. 3D tabletop scenes [69]).
— Zero-shot counting with a dual-stream neural network model
(2405.09953 - Thompson et al., 16 May 2024) in Discussion, final paragraph