Driver Attention Tracking and Analysis (2404.07122v2)

Published 10 Apr 2024 in cs.CV

Abstract: We propose a novel method to estimate a driver's points-of-gaze using a pair of ordinary cameras mounted on the windshield and dashboard of a car. This is a challenging problem due to the dynamics of traffic environments with 3D scenes of unknown depths, and it is further complicated by the varying distance between the driver and the camera system. To tackle these challenges, we develop a novel convolutional network that simultaneously analyzes the image of the scene and the image of the driver's face. This network has a camera calibration module that computes an embedding vector representing the spatial configuration between the driver and the camera system. The calibration module improves the overall network's performance, and the entire network can be jointly trained end to end. We also address the lack of annotated data for training and evaluation by introducing a large-scale driving dataset with point-of-gaze annotations. This is an in situ dataset of real driving sessions in an urban environment, containing synchronized images of the driving scene as well as the face and gaze of the driver. Experiments on this dataset show that the proposed method outperforms various baseline methods, achieving a mean prediction error of 29.69 pixels, which is relatively small compared to the $1280{\times}720$ resolution of the scene camera.
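The evaluation metric quoted in the abstract (mean prediction error in pixels) is not spelled out, but it is conventionally the mean Euclidean distance between predicted and annotated points-of-gaze on the scene image. A minimal sketch of that computation, assuming 2D pixel coordinates on the $1280{\times}720$ scene camera (the point values below are synthetic, purely for illustration):

```python
import math
import random

def mean_gaze_error(preds, gts):
    """Mean Euclidean distance (in pixels) between predicted and
    ground-truth points-of-gaze over a set of frames."""
    assert len(preds) == len(gts) and preds, "need paired, non-empty point lists"
    return sum(
        math.hypot(px - gx, py - gy)
        for (px, py), (gx, gy) in zip(preds, gts)
    ) / len(preds)

# Toy check on a 1280x720 scene image: predictions offset from the
# ground truth by a fixed (3, 4) pixels yield a mean error of 5 pixels.
random.seed(0)
gts = [(random.uniform(0, 1280), random.uniform(0, 720)) for _ in range(100)]
preds = [(x + 3.0, y + 4.0) for x, y in gts]
print(round(mean_gaze_error(preds, gts), 2))  # → 5.0
```

Against a 1280-pixel-wide frame, the paper's reported 29.69-pixel mean error corresponds to roughly 2% of the image width.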

