Beyond Average: Individualized Visual Scanpath Prediction
Abstract: Understanding how attention varies across individuals has significant scientific and societal impact. However, existing visual scanpath models treat attention uniformly, neglecting individual differences. To bridge this gap, this paper focuses on individualized scanpath prediction (ISP), a new attention modeling task that aims to accurately predict how different individuals shift their attention in diverse visual tasks. We propose an ISP method featuring three novel technical components: (1) an observer encoder to characterize and integrate an observer's unique attention traits, (2) an observer-centric feature integration approach that holistically combines visual features, task guidance, and observer-specific characteristics, and (3) an adaptive fixation prioritization mechanism that refines scanpath predictions by dynamically prioritizing semantic feature maps based on individual observers' attention traits. These components allow scanpath models to effectively address attention variations across different observers. Our method is generally applicable to different datasets, model architectures, and visual tasks, offering a comprehensive tool for transforming general scanpath models into individualized ones. Comprehensive evaluations using value-based and ranking-based metrics verify the method's effectiveness and generalizability.
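The idea behind component (3), prioritizing semantic feature maps according to an observer's traits, can be illustrated with a toy sketch. This is not the paper's implementation: the function names (`prioritize`, `most_likely_fixation`), the softmax weighting, the linear projection, and all numeric values are assumptions chosen only to show how an observer-specific vector could reweight a shared set of semantic maps and shift the predicted fixation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def prioritize(semantic_maps, observer_embedding, projection):
    """Blend K semantic maps into one priority map with observer-specific weights.

    semantic_maps: K maps, each an H x W list of lists (hypothetical inputs).
    observer_embedding: D floats standing in for a learned observer vector (assumed).
    projection: K x D matrix giving one score per semantic map (assumed).
    """
    # One scalar score per semantic map: dot product of a projection row and the embedding.
    scores = [sum(w * e for w, e in zip(row, observer_embedding)) for row in projection]
    weights = softmax(scores)
    height, width = len(semantic_maps[0]), len(semantic_maps[0][0])
    # Weighted sum of the maps at every location.
    priority = [
        [sum(weights[k] * semantic_maps[k][y][x] for k in range(len(semantic_maps)))
         for x in range(width)]
        for y in range(height)
    ]
    return weights, priority


def most_likely_fixation(priority):
    """Return the (row, col) of the highest-priority location."""
    positions = ((y, x) for y in range(len(priority)) for x in range(len(priority[0])))
    return max(positions, key=lambda pos: priority[pos[0]][pos[1]])


# Toy data (all values invented for illustration).
FACE_MAP = [[0.9, 0.1], [0.0, 0.0]]   # a "faces" semantic map
TEXT_MAP = [[0.0, 0.1], [0.0, 0.8]]   # a "text" semantic map
PROJECTION = [[2.0, 0.0], [0.0, 2.0]]
OBSERVER_A = [1.0, 0.0]               # hypothetical observer who favors faces
OBSERVER_B = [0.0, 1.0]               # hypothetical observer who favors text

if __name__ == "__main__":
    for name, observer in (("A", OBSERVER_A), ("B", OBSERVER_B)):
        _, priority = prioritize([FACE_MAP, TEXT_MAP], observer, PROJECTION)
        print(name, most_likely_fixation(priority))
```

With the same image-level maps, the two hypothetical observers end up with different top-priority locations, which is the behavior an individualized model is meant to capture and a uniform model cannot.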