Learning Gaussian Representation for Eye Fixation Prediction (2403.14821v1)

Published 21 Mar 2024 in cs.CV

Abstract: Existing eye fixation prediction methods perform the mapping from input images to the corresponding dense fixation maps generated from raw fixation points. However, due to the stochastic nature of human fixation, the generated dense fixation maps may be a less-than-ideal representation of human fixation. To provide a robust fixation model, we introduce Gaussian Representation for eye fixation modeling. Specifically, we propose to model the eye fixation map as a mixture of probability distributions, namely a Gaussian Mixture Model. In this new representation, we use several Gaussian distribution components as an alternative to the provided fixation map, which makes the model more robust to the randomness of fixation. Meanwhile, we design our framework upon some lightweight backbones to achieve real-time fixation prediction. Experimental results on three public fixation prediction datasets (SALICON, MIT1003, TORONTO) demonstrate that our method is fast and effective.
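The core idea — replacing a dense fixation map (usually produced by blurring raw fixation points) with a small set of Gaussian mixture components — can be sketched as follows. This is an illustrative rendering of a fixation map from isotropic Gaussian components, not the authors' implementation; the component parameterization (means, scales, weights) and any normalization are assumptions for the example.

```python
import numpy as np

def gaussian_mixture_map(means, sigmas, weights, height, width):
    """Render a dense fixation map as an isotropic Gaussian mixture.

    Each component k contributes weights[k] * N((x, y); means[k], sigmas[k]^2 * I).
    A sketch of representing fixations with a few Gaussian components instead
    of a blurred point map; the paper's exact parameterization may differ.
    """
    ys, xs = np.mgrid[0:height, 0:width].astype(float)
    fix_map = np.zeros((height, width))
    for (mx, my), s, w in zip(means, sigmas, weights):
        norm = 1.0 / (2.0 * np.pi * s * s)  # 2D isotropic Gaussian normalizer
        fix_map += w * norm * np.exp(
            -((xs - mx) ** 2 + (ys - my) ** 2) / (2.0 * s * s)
        )
    return fix_map

# Example: two hypothetical components on a 64x64 grid.
m = gaussian_mixture_map(
    means=[(20.0, 30.0), (45.0, 15.0)],  # (x, y) centers
    sigmas=[4.0, 6.0],                    # per-component spread
    weights=[0.6, 0.4],                   # mixture weights (sum to 1)
    height=64, width=64,
)
```

Because the weights sum to one and the components lie well inside the grid, the rendered map integrates to approximately 1, giving a proper probability-distribution view of the fixation map that is less sensitive to the stochastic placement of individual fixation points.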

