Overview of "UMDFaces: An Annotated Face Dataset for Training Deep Networks"
The paper "UMDFaces: An Annotated Face Dataset for Training Deep Networks" addresses the significant gap in publicly available, large-scale, annotated face datasets necessary for advancing deep learning models in face detection and recognition. This issue has been particularly challenging for the academic community, given that large datasets held by corporations like Facebook and Google are not publicly available. In response, the authors introduce UMDFaces, a substantial dataset encompassing 367,888 faces across 8,277 subjects, with annotations including bounding boxes, 3D pose estimates, keypoint locations, and gender labels.
Dataset Composition and Annotation
UMDFaces sets itself apart by providing a comprehensive set of annotations verified through a semi-automatic process that combines deep learning models with human review. The authors used a pre-trained All-in-One CNN to generate candidate annotations and human annotators to verify them, ensuring high annotation quality. This hybrid methodology captures not only bounding boxes and key facial landmarks but also 3D head pose (yaw, pitch, roll), covering diverse real-world scenarios across a wide range of poses and expressions.
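To make the annotation types concrete, here is a minimal sketch of a per-face record carrying the kinds of labels the paper describes (bounding box, 3D pose, keypoints, gender). The field names, units, and keypoint count are illustrative assumptions, not the dataset's actual file schema.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class FaceAnnotation:
    """Hypothetical record mirroring the annotation types in UMDFaces;
    the schema here is a sketch, not the released format."""
    subject_id: int                           # one of the 8,277 subjects
    bbox: Tuple[float, float, float, float]   # (x, y, width, height) in pixels
    yaw: float                                # 3D head pose, in degrees
    pitch: float
    roll: float
    keypoints: List[Tuple[float, float]]      # (x, y) facial landmark locations
    gender: str                               # "male" or "female"

ann = FaceAnnotation(
    subject_id=42,
    bbox=(110.0, 85.0, 96.0, 96.0),
    yaw=-23.5, pitch=4.1, roll=-1.8,
    keypoints=[(130.2, 110.7), (175.9, 108.3)],
    gender="female",
)
print(ann.yaw)  # -23.5
```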
Comparison and Benchmarking
The dataset is positioned against existing face datasets, like CASIA WebFace and VGG Face, emphasizing its larger number of subjects. It also offers greater pose variability than these datasets, making it a strong asset for developing and evaluating face recognition algorithms that must perform well across a wider array of conditions. A new face recognition evaluation protocol accompanies the dataset: test pairs are grouped into easy, moderate, and difficult tracks by the yaw difference between the two faces, so the pose robustness of models trained on UMDFaces can be measured directly.
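The track assignment described above can be sketched as a simple bucketing on absolute yaw difference. The bucketing idea follows the paper's protocol, but the specific thresholds below are illustrative assumptions, not the paper's actual values.

```python
def pose_track(yaw_a: float, yaw_b: float) -> str:
    """Assign a verification pair to a difficulty track by the absolute
    yaw difference (degrees) between the two faces. Thresholds are
    hypothetical; the paper defines its own cutoffs."""
    diff = abs(yaw_a - yaw_b)
    if diff < 10.0:
        return "easy"
    elif diff < 30.0:
        return "moderate"
    return "difficult"

print(pose_track(5.0, -2.0))    # easy (difference of 7 degrees)
print(pose_track(20.0, -25.0))  # difficult (difference of 45 degrees)
```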
Experimental Validation
To validate UMDFaces as a training resource, the authors conduct several key experiments. They demonstrate that a deep convolutional neural network trained on UMDFaces achieves better face verification results than networks trained on alternative datasets like CASIA WebFace or the pre-trained VGG Face model. The experiments show that UMDFaces trains models that perform markedly well, particularly at low false acceptance rates, indicating its utility for training more generalized and robust face recognition models.
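The "performance at low false acceptance rates" metric mentioned above is typically reported as the true accept rate at a fixed false accept rate (TAR@FAR). The following is a minimal sketch of that computation on synthetic similarity scores; the data and score distributions are made up for illustration.

```python
import numpy as np

def tar_at_far(scores_genuine, scores_impostor, far=1e-2):
    """True accept rate at a fixed false accept rate: choose the
    similarity threshold that exactly a fraction `far` of impostor
    scores exceed, then report the fraction of genuine scores above
    it. Standard verification metric; inputs here are synthetic."""
    impostor = np.sort(np.asarray(scores_impostor))
    k = int(np.ceil((1.0 - far) * len(impostor))) - 1
    threshold = impostor[min(max(k, 0), len(impostor) - 1)]
    return float(np.mean(np.asarray(scores_genuine) > threshold))

# Synthetic scores: genuine (same-subject) pairs score higher on average.
rng = np.random.default_rng(0)
gen = rng.normal(0.7, 0.1, 10_000)
imp = rng.normal(0.3, 0.1, 10_000)
tar = tar_at_far(gen, imp, far=0.01)
print(round(tar, 3))
```

A stricter operating point (smaller `far`) forces a higher threshold, so TAR drops; the paper's claim is that UMDFaces-trained models degrade less at these strict settings.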
In addition to the verification experiments, the paper highlights the dataset's utility for keypoint detection. A simple neural network trained on UMDFaces annotations achieved competitive results compared to more complex, state-of-the-art methods, reinforcing the quality of the annotations and the dataset's potential for improving keypoint detection accuracy.
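Keypoint detectors of this kind are commonly scored with a normalized mean error: the average point-to-point distance between predicted and ground-truth landmarks, divided by a normalizing length such as the inter-ocular distance. A minimal sketch on synthetic arrays (the landmarks and normalizer below are invented for illustration):

```python
import numpy as np

def normalized_mean_error(pred, gt, norm):
    """Mean Euclidean distance between predicted and ground-truth
    keypoints, divided by a normalizing length (commonly the
    inter-ocular distance). A standard keypoint-detection metric;
    the arrays fed in here are synthetic."""
    pred, gt = np.asarray(pred), np.asarray(gt)
    per_point = np.linalg.norm(pred - gt, axis=-1)
    return float(per_point.mean() / norm)

gt = np.array([[30.0, 40.0], [70.0, 40.0], [50.0, 60.0]])
pred = gt + 1.0  # every prediction off by (1, 1) pixels
print(round(normalized_mean_error(pred, gt, norm=40.0), 4))  # sqrt(2)/40 ≈ 0.0354
```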
Implications and Future Prospects
The introduction of UMDFaces signifies a considerable advancement in publicly available data for the computer vision community, equipping researchers with a more diverse and rich dataset to develop more accurate face recognition and detection algorithms. The paper also underscores the potential extension of this work, including refining face recognition algorithms that account for extensive variability in subject pose and expression. Given the scalability and high annotation quality of UMDFaces, future work could further leverage this dataset within multi-task learning frameworks to optimize face-related predictions across various tasks simultaneously.
In conclusion, UMDFaces represents a pivotal resource facilitating the training of state-of-the-art models in face detection and recognition, promising to drive progress in these domains and offering a robust benchmarking platform against which future methods can be evaluated.