- The paper presents a method for estimating a driver's gaze region from facial features and head pose, achieving 91.4% average accuracy with user-specific models.
- The technique combines facial landmark analysis with a random forest classifier trained on naturalistic driving data to classify gaze into six distinct regions.
- The approach offers a practical way to improve driver assistance systems and vehicular safety by reliably inferring gaze direction without direct eye tracking.
Driver Gaze Region Estimation Without Using Eye Movement
This paper, authored by Lex Fridman et al., presents a method for estimating the region of a driver's gaze without relying on direct eye movement tracking. The approach is especially pertinent given the challenges eye trackers face in vehicles: occlusion by sunglasses, varying lighting conditions, and motion blur. The authors argue that although head pose estimation lacks the granularity of eye tracking, it provides sufficient information for regional gaze estimation, which can directly benefit driver assistance systems and vehicular safety.
Methodology
The authors propose a gaze classification system that extracts facial features, encodes their spatial configuration, and maps it to one of six distinct gaze regions. Evaluated on a dataset of 50 drivers, the classifier achieves an average accuracy of 91.4% at an average decision rate of 11 Hz. Key aspects of the methodology include (a minimal implementation sketch follows the list):
- Face Detection: Employing histogram of oriented gradients (HOG) features with linear SVM classification to detect faces within video frames.
- Feature Extraction and Classification: Using 56 facial landmarks, selected through recursive feature elimination, to encode the spatial configuration of the face; the resulting feature vector is normalized and classified with a random forest classifier.
- Training and Evaluation: The system was trained and evaluated on data from naturalistic driving studies comprising several million annotated images of drivers.
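The paper does not publish reference code, so the following is a minimal sketch of such a pipeline, assuming dlib's HOG face detector and pretrained 68-point landmark model in place of the paper's exact detection and landmark stages, with scikit-learn supplying recursive feature elimination and the random forest. The normalization scheme, model file path, and the 112-coordinate RFE target (loosely mirroring the paper's 56 selected landmarks) are illustrative choices, not the authors' exact configuration.

```python
# Sketch of a head-pose-based gaze-region pipeline (not the authors' code).
import numpy as np
import dlib
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

detector = dlib.get_frontal_face_detector()  # HOG features + linear SVM
# Pretrained 68-point landmark model; the path is illustrative and the
# file must be downloaded separately.
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def landmark_features(frame):
    """Detect the largest face in a uint8 image and return a normalized
    landmark feature vector, or None if no face is found."""
    faces = detector(frame)
    if not faces:
        return None
    face = max(faces, key=lambda r: r.area())
    shape = predictor(frame, face)
    pts = np.array([(p.x, p.y) for p in shape.parts()], dtype=float)
    # Remove translation and scale so the vector reflects head pose and
    # facial configuration rather than where the face sits in the frame.
    pts -= pts.mean(axis=0)
    pts /= np.linalg.norm(pts)
    return pts.ravel()  # 136 values: (x, y) for each of 68 landmarks

def train_gaze_classifier(X, y):
    """Fit a random forest on RFE-selected landmark coordinates.
    y holds gaze-region labels (e.g. road, center stack, instrument
    cluster, rearview mirror, left, right)."""
    forest = RandomForestClassifier(n_estimators=100, random_state=0)
    selector = RFE(forest, n_features_to_select=112, step=8)
    selector.fit(X, y)
    return selector  # selector.predict(features) yields a gaze region
```

Because each landmark contributes an (x, y) pair, eliminating coordinates rather than whole landmarks is a simplification; the spirit of the paper's selection step is preserved, but the granularity differs.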
Results
The research demonstrates substantial improvement in gaze region classification when using driver-specific models and applying confidence-based decision pruning. Specifically, moving from a generic to a user-specific model raised classification accuracy from 65% to 91.4%, with further gains obtained by acting only on high-confidence classification decisions. This suggests that, once the model is calibrated to an individual driver, facial-landmark configurations separate cleanly into their respective gaze regions.
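The paper's exact pruning rule is not reproduced here; a plausible sketch, assuming a scikit-learn-style classifier with predict_proba and an illustrative cutoff of 0.6, is to emit a decision only when the top class probability clears the threshold:

```python
import numpy as np

def pruned_decisions(clf, X, threshold=0.6):
    """Emit a gaze-region decision only for frames where the classifier's
    top-class probability clears the threshold; uncertain frames produce
    no decision, trading decision rate for accuracy."""
    proba = clf.predict_proba(X)        # shape: (n_frames, n_regions)
    top = proba.max(axis=1)             # confidence of the best guess
    keep = np.flatnonzero(top >= threshold)
    labels = clf.classes_[proba[keep].argmax(axis=1)]
    return keep, labels
```

Raising the threshold discards more frames, lowering the effective decision rate (the paper reports an 11 Hz average) in exchange for higher accuracy on the decisions that remain.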
Practical and Theoretical Implications
The implications of the paper are both practical and theoretical. Practically, driver assistance systems that reliably infer gaze direction can enhance safety by detecting distraction and redirecting attention to critical driving-related areas. Theoretically, the method sheds light on the correlation between head pose and gaze direction, encouraging further exploration of how facial landmarks relate to ocular movement in dynamic environments.
Future Directions
The paper acknowledges variability in classification accuracy across subjects and gaze regions and suggests further investigation into inter-personal and intra-personal variation in the relationship between head and eye movement. Additionally, adaptive models that tailor themselves to a driver's individual head-eye coordination could enhance system performance; one possible scheme is sketched below.
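The paper does not prescribe such an adaptation scheme; as one hypothetical illustration, a global model trained across all drivers could be blended with a per-driver model whose weight grows as driver-specific data accumulates:

```python
import numpy as np

def blended_gaze_proba(global_clf, user_clf, X, user_weight=0.7):
    """Hypothetical adaptation scheme (not from the paper): mix a global
    model trained across drivers with a per-driver model trained on that
    driver's own frames. Both classifiers must share the same class
    ordering (clf.classes_); user_weight could increase as the driver's
    annotated data accumulates."""
    p_global = global_clf.predict_proba(X)
    p_user = user_clf.predict_proba(X)
    return user_weight * p_user + (1.0 - user_weight) * p_global
```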
The development of a compact, efficient pipeline suitable for real-time operation positions this research as a critical step toward practical deployment in commercial vehicles. Driver state monitoring that leverages existing camera hardware promises substantial benefits for vehicle safety systems.