- The paper presents VGGFace2, a dataset with 3.31M images and 9131 subjects, capturing extensive pose, age, and ethnicity variations.
- It details a rigorous data collection and filtering methodology combining automated processes and human verification for minimal label noise.
- Models trained on VGGFace2 outperform those using other datasets, achieving superior face recognition on IJB-A, IJB-B, and IJB-C benchmarks.
An Overview of the VGGFace2 Dataset for Recognizing Faces Across Pose and Age
The paper presents VGGFace2, a comprehensive large-scale dataset designed to aid research in facial recognition. VGGFace2 encompasses 3.31 million images of 9131 subjects, providing substantial intra-class variation to better capture the nuances of face recognition across different poses, ages, and other demographic variables. This dataset is poised to advance the performance and generalization capability of convolutional neural networks (CNNs) for face recognition tasks.
Composition and Goals
The VGGFace2 dataset was curated with several specific goals:
- Large number of identities alongside a substantial number of images per identity.
- Extensive coverage of pose, age, and ethnicity variations.
- Minimal label noise to ensure reliability.
To achieve these aims, the dataset underwent rigorous collection and filtering stages, combining automated processes and manual verification to maintain high accuracy and diversity.
Methodology
Data Collection and Filtering
The data collection process involved an initial list of 500,000 names from the Freebase knowledge graph, narrowed down to 9244 names through human verification. Approximately 1400 images per identity were fetched from Google Image Search, incorporating keywords to capture diverse pose and age variations. Further filtering stages included face detection using joint face detection and alignment frameworks, classification-based filtering to remove outliers, and near-duplicate removal.
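The classification-based filtering stage can be illustrated with a minimal sketch: images whose classifier confidence for their assigned identity falls below a cutoff are set aside as likely outliers. The score field, threshold value, and data layout below are hypothetical illustrations, not the paper's actual pipeline:

```python
def filter_by_classifier_score(images, threshold=0.5):
    """Split images into kept/removed by the (hypothetical) classifier
    score for their labeled identity. The threshold is illustrative,
    not the paper's actual value."""
    kept, removed = [], []
    for img in images:
        (kept if img["score"] >= threshold else removed).append(img)
    return kept, removed

# Hypothetical scored images: each dict holds the classifier's
# confidence that the image shows its labeled identity.
batch = [
    {"path": "a.jpg", "score": 0.92},
    {"path": "b.jpg", "score": 0.31},  # likely mislabeled / outlier
    {"path": "c.jpg", "score": 0.77},
]
kept, removed = filter_by_classifier_score(batch)
print(len(kept), len(removed))  # → 2 1
```

In the paper's pipeline, this kind of automatic filtering is followed by near-duplicate removal and human spot-checks, which is what keeps label noise low at this scale.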
Pose and Age Annotation
To facilitate research on recognizing faces across different poses and ages, subsets of evaluation data were annotated with specific templates. These templates allowed for the assessment of face recognition models under varying conditions, such as frontal, three-quarter, and profile poses, as well as different age groups.
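Template-based evaluation typically pools the per-image features of a template into a single descriptor before comparison. A minimal sketch using average pooling, with toy 2-D vectors standing in for the high-dimensional CNN embeddings the paper actually uses:

```python
def pool_template(features):
    """Average a list of per-image feature vectors into one template
    descriptor (a common pooling choice; the toy 2-D vectors below
    stand in for real CNN embeddings)."""
    n = len(features)
    dim = len(features[0])
    return [sum(f[i] for f in features) / n for i in range(dim)]

# Two images belonging to one (hypothetical) frontal-pose template.
frontal_images = [[1.0, 0.0], [0.5, 1.0]]
descriptor = pool_template(frontal_images)
print(descriptor)  # → [0.75, 0.5]
```

Pooling per template, rather than matching single images, is what lets the benchmarks measure recognition across pose and age sets rather than individual photos.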
Experimental Results
The paper delineates an extensive set of experiments to benchmark the performance of models trained on VGGFace2 against those trained on other datasets such as VGGFace and MS-Celeb-1M.
Face Identification
One of the primary evaluations involved face identification, where ResNet-50 models were trained on VGGFace, MS1M, and VGGFace2. Performance metrics showed that models trained on VGGFace2 significantly outperformed those trained on the other datasets, indicating the benefits of the high intra-class variability in VGGFace2.
Pose and Age Variation
Further experiments assessed the ability of models to recognize faces across various poses and ages. Similarity matrices and score histograms showed that VGGFace2-trained models consistently produced higher similarity scores across different poses and age groups, demonstrating that VGGFace2 better equips models to handle intra-class variation.
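The cross-pose comparison behind such similarity matrices can be sketched as pairwise cosine similarity between pooled template descriptors; the 2-D vectors below are toy stand-ins for real embeddings:

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def similarity_matrix(templates_a, templates_b):
    """Pairwise cosine similarities between two lists of templates,
    e.g. frontal vs. profile templates of the same subjects."""
    return [[cosine(a, b) for b in templates_b] for a in templates_a]

# Toy 2-D descriptors standing in for frontal and profile templates.
frontal = [[1.0, 0.0]]
profile = [[0.6, 0.8]]
m = similarity_matrix(frontal, profile)
print(round(m[0][0], 2))  # → 0.6
```

A model robust to pose would keep same-subject entries of this matrix high even when the two templates come from very different viewpoints.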
Benchmark Performance on IJB-A, IJB-B, and IJB-C
The paper presents state-of-the-art performance results on public benchmarks:
- IJB-A Dataset: Models trained on VGGFace2 achieved superior TAR (true accept rate) and TPIR (true positive identification rate) at multiple operating points compared to those trained on MS1M and to other results reported in the literature.
- IJB-B Dataset: VGGFace2-trained models demonstrated significant gains in 1:1 verification TAR and 1:N identification TPIR.
- IJB-C Dataset: Similar trends held, with VGGFace2-trained models substantially outperforming prior results on this larger and more challenging benchmark.
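TAR at a fixed FAR, the metric behind the 1:1 verification numbers above, can be sketched as follows; the scores and the simple thresholding rule are illustrative, not the benchmarks' official evaluation code:

```python
def tar_at_far(genuine, impostor, far_target):
    """True Accept Rate at a given False Accept Rate: choose the score
    threshold so that at most `far_target` of impostor pairs pass,
    then measure the fraction of genuine pairs that pass it."""
    scores = sorted(impostor, reverse=True)
    k = int(far_target * len(impostor))  # impostors allowed through
    threshold = scores[k] if k < len(scores) else scores[-1]
    return sum(s > threshold for s in genuine) / len(genuine)

# Hypothetical similarity scores for matched / mismatched pairs.
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.6, 0.5, 0.3, 0.2, 0.1]
print(tar_at_far(genuine, impostor, far_target=0.2))  # → 0.75
```

Reporting TAR at several FAR operating points (e.g. 1e-3, 1e-2) is what allows fair comparison between models trained on different datasets.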
Implications and Future Directions
The release of VGGFace2 marks a meaningful contribution to the domain of face recognition, offering a robust dataset that promises to enhance the development of more accurate and generalizable models. The strong numerical results observed in benchmarks highlight the practical benefits of diversifying training data to include various poses and age ranges.
Looking forward, this dataset can benefit from continual updates and expansions to include more nuanced demographic variations. Additionally, the advances made with VGGFace2 can inform the design and collection methods for other large-scale datasets, optimizing them for specific recognition challenges in computer vision and AI research.
Conclusion
The VGGFace2 dataset represents a significant step in creating comprehensive and diverse benchmarks for face recognition research. The extensive experimental validation demonstrates its value in surpassing existing state-of-the-art models, thereby setting a new precedent for future advancements in the domain. Researchers are encouraged to leverage VGGFace2 to build more robust and versatile facial recognition systems.