- The paper proposes Partial FC, which uses a selective softmax approximation to reduce computation while maintaining high recognition accuracy on massive datasets.
- It relies on the Positive Plus Randomly Negative (PPRN) strategy, which always retains positive class centers while randomly sampling a subset of negatives, enabling scalable training across distributed GPUs.
- Empirical results on benchmarks demonstrate negligible accuracy loss, underscoring the method's robust performance and potential for large-scale applications.
Partial FC: Training 10 Million Identities on a Single Machine
The paper "Partial FC" presents a novel approach to training face recognition models on datasets comprising up to ten million identities with limited computational resources. The challenge it addresses is that the fully connected classification layer of a softmax-based loss grows linearly with the number of identities, so training on ever-larger identity sets quickly exceeds GPU memory limits.
Key Contributions and Methodology
The authors introduce a softmax-based loss approximation strategy, called Partial FC, which enables efficient training on massive datasets by selectively sampling class centers. Their experiments show that keeping all positive class centers plus only a subset of negative class centers maintains model accuracy, challenging the prior assumption that every negative class must participate in training.
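In standard softmax notation, the approximation can be sketched as follows (a sketch only: margin terms from losses such as ArcFace are omitted for clarity, and the symbols $x_i$, $w_j$, and $S$ are notation introduced here rather than quoted from the paper):

$$
\mathcal{L} = -\log \frac{e^{w_{y_i}^{\top} x_i}}{e^{w_{y_i}^{\top} x_i} + \sum_{j \in S,\; j \neq y_i} e^{w_j^{\top} x_i}}
$$

where $x_i$ is the embedding of sample $i$, $w_{y_i}$ its always-retained positive class center, and $S$ a randomly drawn subset of the remaining negative centers.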
The critical facets of this methodology are:
- Positive Plus Randomly Negative (PPRN) Strategy: Instead of computing the softmax over all classes, PPRN always includes the class centers of the identities present in the batch (the positives) and adds only a small, random subset of negative centers. This cuts the cost of the classification layer roughly in proportion to the sampling rate, without significant accuracy loss (see the first sketch after this list).
- Scalability with Distributed Training: The linear transformation matrix (the class centers) is partitioned across GPUs, so each GPU stores and updates only its own slice. This sharply reduces per-GPU memory and computation and enables efficient model parallelism; the authors train ten million identities on a single machine with eight NVIDIA RTX 2080 Ti GPUs (see the second sketch after this list).
- Theoretical and Empirical Validation: Comprehensive experiments on multiple benchmark datasets show accuracy comparable to state-of-the-art methods even when only 10% of the class centers are sampled.
- Release of the Glint360K Dataset: The authors release a cleaned and aggregated dataset containing roughly 17 million images of 360,000 identities, providing a robust resource for future research.
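A minimal PyTorch sketch of PPRN sampling, referenced from the list above. The function name, tensor shapes, and the 10% sampling rate are illustrative assumptions, not the authors' exact implementation:

```python
import torch
import torch.nn.functional as F

def pprn_sample(weight, labels, sample_rate=0.1):
    """Positive Plus Randomly Negative (PPRN) sampling, sketched.

    Always keeps the class centers of the labels seen in the batch
    (the positives) and fills the remaining budget with randomly
    chosen negative centers. `weight` is the (num_classes, dim)
    matrix of class centers.
    """
    num_classes = weight.size(0)
    num_sample = max(int(num_classes * sample_rate), labels.numel())

    scores = torch.rand(num_classes, device=weight.device)  # random scores for negatives
    scores[torch.unique(labels)] = 2.0                      # positives always rank first
    index = torch.topk(scores, k=num_sample)[1].sort()[0]   # sampled class ids, sorted

    # Remap each original label to its position inside the sampled subset.
    new_labels = torch.searchsorted(index, labels)
    return weight[index], new_labels

# Illustrative usage: 512-d embeddings, 1M classes, 10% sampling.
embeddings = F.normalize(torch.randn(64, 512), dim=1)
weight = F.normalize(torch.randn(1_000_000, 512), dim=1)
labels = torch.randint(0, 1_000_000, (64,))

sub_weight, sub_labels = pprn_sample(weight, labels, sample_rate=0.1)
logits = 64.0 * embeddings @ sub_weight.t()  # (64, 100k) instead of (64, 1M)
loss = F.cross_entropy(logits, sub_labels)   # softmax over the sampled subset only
```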
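And a conceptual sketch of the model-parallel side. This is a sketch under stated assumptions: `ShardedCenters` is a hypothetical module name, and the collective-communication details are simplified relative to the paper's implementation:

```python
import torch
import torch.distributed as dist

class ShardedCenters(torch.nn.Module):
    """Sketch: split the (num_classes, dim) center matrix across GPUs.

    With C classes and N GPUs, each rank holds only C / N centers, so
    the classification layer's memory shrinks linearly with GPU count.
    """

    def __init__(self, num_classes, dim, rank, world_size):
        super().__init__()
        shard_size = num_classes // world_size
        self.class_start = rank * shard_size  # global id of this shard's first class
        self.weight = torch.nn.Parameter(torch.randn(shard_size, dim) * 0.01)

    def forward(self, embeddings):
        # Gather the batch from every rank so this shard scores all samples.
        # (For end-to-end training, the autograd-aware
        # torch.distributed.nn.functional.all_gather would be needed;
        # plain dist.all_gather does not backpropagate to remote ranks.)
        gathered = [torch.zeros_like(embeddings) for _ in range(dist.get_world_size())]
        dist.all_gather(gathered, embeddings)
        all_emb = torch.cat(gathered)          # (global_batch, dim)

        # Local logits against this shard's centers only. A numerically
        # stable global softmax then needs two collectives: an
        # all-reduce(MAX) for the row-wise max and an all-reduce(SUM)
        # for the denominator. PPRN sampling would be applied to this
        # shard's centers before computing the logits.
        return all_emb @ self.weight.t()       # (global_batch, shard_size)
```

The design choice worth noting: sampling happens independently within each shard, so no GPU ever needs to see the full class-center matrix.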
Results and Implications
Empirically, the model maintains accuracy within a negligible margin of full-softmax training while requiring only a fraction of the computational workload.
- On mainstream verification benchmarks such as LFW, CFP-FP, and AgeDB-30, and on large-scale benchmarks such as IJB-B and MegaFace, Partial FC delivers nearly identical verification performance, indicating robustness across conditions and scales.
- The sampling technique and distributed architecture together make training with up to 100 million identities feasible, a scale that was previously computationally prohibitive.
Broader Impact and Future Directions
This work has immediate practical value: it reduces the resources required for large-scale face recognition and expands the scale of systems deployable in industrial applications. It also offers theoretical insight into the role of negative classes in softmax-based losses, which may influence future model design.
Future work could explore adaptive sampling strategies that dynamically balance computational efficiency against model performance, and could apply Partial FC's insights to other large-scale classification tasks beyond face recognition.
In summary, the Partial FC paper contributes an efficient, memory-conserving training strategy, backed by both empirical results and theoretical analysis, paving the way for scalable AI applications in face recognition and beyond.