CenterFace: Joint Face Detection and Alignment Using Face as Point (1911.03599v1)

Published 9 Nov 2019 in cs.CV

Abstract: Face detection and alignment in unconstrained environment is always deployed on edge devices which have limited memory storage and low computing power. This paper proposes a one-stage method named CenterFace to simultaneously predict facial box and landmark location with real-time speed and high accuracy. The proposed method also belongs to the anchor free category. This is achieved by: (a) learning face existing possibility by the semantic maps, (b) learning bounding box, offsets and five landmarks for each position that potentially contains a face. Specifically, the method can run in real-time on a single CPU core and 200 FPS using NVIDIA 2080TI for VGA-resolution images, and can simultaneously achieve superior accuracy (WIDER FACE Val/Test-Easy: 0.935/0.932, Medium: 0.924/0.921, Hard: 0.875/0.873 and FDDB discontinuous: 0.980, continuous: 0.732). A demo of CenterFace can be available at https://github.com/Star-Clouds/CenterFace.


Summary

The paper introduces an anchor-free approach that reformulates face detection as key point estimation, boosting efficiency and accuracy.
It employs a multi-task learning strategy with a streamlined Feature Pyramid Network and a MobileNetV2 backbone to predict face boxes and five landmarks concurrently.
The method reports about 200 FPS on an NVIDIA 2080TI GPU for VGA-resolution images and about 30 FPS on a CPU, which suits deployment on resource-constrained edge devices.

CenterFace: Efficient Face Detection and Alignment

The paper introduces CenterFace, a one-stage method for joint face detection and alignment. The method is designed for deployment on edge devices, where memory and compute are scarce. It is a lightweight detector that trades little accuracy for speed, and it is evaluated on the WIDER FACE and FDDB benchmarks.

Key Contributions

The CenterFace approach pivots on several innovative contributions:

Anchor-Free Design: By shifting to an anchor-free paradigm, face detection is conceptualized as a standard key point estimation problem. This transformation reduces the dependency on dense anchors, thereby enhancing computational efficiency.
Multi-Task Learning Strategy: The framework simultaneously predicts face boxes and five key facial landmarks, utilizing shared features to streamline both tasks. This concurrent prediction model reduces the complexity typically associated with separate detection and alignment processes.
Feature Pyramid Network (FPN): Employing FPN with a simplified architecture, CenterFace capitalizes on robust feature extraction capabilities to ensure swift and accurate face identification and landmark localization.
Real-Time Performance: The model reaches 200 FPS on a GPU (NVIDIA 2080TI, VGA-resolution images) and 30 FPS on a CPU. Reported accuracy is 0.935/0.932 (Easy), 0.924/0.921 (Medium), and 0.875/0.873 (Hard) on WIDER FACE Val/Test, and 0.980 (discontinuous) and 0.732 (continuous) on FDDB.
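To make the "face as point" idea in the list above concrete, the sketch below shows one common way to turn a predicted face-probability map into candidate face centers: keep every cell that exceeds a score threshold and is the maximum of its 3x3 neighbourhood. This is an illustrative assumption, not the authors' code. The function name `find_peaks`, the threshold of 0.5, and the toy heat map are all hypothetical.

```python
def find_peaks(heatmap, threshold=0.5):
    """Return (row, col, score) for cells that score at least `threshold`
    and are the largest value in their 3x3 neighbourhood.

    Hypothetical helper: the 3x3 window and the threshold are assumptions.
    """
    rows, cols = len(heatmap), len(heatmap[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            score = heatmap[r][c]
            if score < threshold:
                continue
            # Collect the 3x3 window around (r, c), clipped at the borders.
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            if score >= max(neighbours):
                peaks.append((r, c, score))
    return peaks


# Toy 4x4 heat map with two local maxima above the threshold.
heatmap = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.3, 0.0],
    [0.1, 0.3, 0.2, 0.6],
    [0.0, 0.0, 0.1, 0.4],
]
peaks = find_peaks(heatmap)
print(peaks)
```

Because each face is a single peak, no anchor boxes are enumerated; the per-position cost is one comparison window rather than a set of anchor shapes.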

Methodology

CenterFace uses MobileNetV2 as its backbone to keep the computational footprint low. The core idea is to represent each face as a point, namely the center of its bounding box, and to regress the face size and the five landmarks directly from the image features at that point. This approach not only simplifies the model architecture but also enhances its generalization capabilities.
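A minimal sketch of how a detected center point might be decoded into a box and landmarks follows. The summary does not give the exact parameterization, so everything here is an assumption made for illustration: an output stride of 4, box size stored as a logarithm in feature-map units, a sub-cell center offset, and landmark offsets normalized by the box size. The name `decode_face` is hypothetical.

```python
import math

STRIDE = 4  # assumed ratio between input resolution and feature-map resolution


def decode_face(cx, cy, offset, log_size, landmark_offsets, stride=STRIDE):
    """Decode one face from the values predicted at feature-map cell (cx, cy).

    offset           -- (dx, dy) sub-cell correction of the center (assumed)
    log_size         -- (log w, log h) of the box in feature-map units (assumed)
    landmark_offsets -- five (dx, dy) pairs relative to the center,
                        expressed as fractions of box width and height (assumed)
    """
    center_x = (cx + offset[0]) * stride
    center_y = (cy + offset[1]) * stride
    width = math.exp(log_size[0]) * stride
    height = math.exp(log_size[1]) * stride
    box = (
        center_x - width / 2,   # x1
        center_y - height / 2,  # y1
        center_x + width / 2,   # x2
        center_y + height / 2,  # y2
    )
    landmarks = [
        (center_x + dx * width, center_y + dy * height)
        for dx, dy in landmark_offsets
    ]
    return box, landmarks


box, landmarks = decode_face(
    cx=10,
    cy=20,
    offset=(0.5, 0.25),
    log_size=(math.log(8), math.log(10)),
    landmark_offsets=[(-0.25, -0.25), (0.25, -0.25), (0.0, 0.0),
                      (-0.25, 0.25), (0.25, 0.25)],
)
print(box)
print(landmarks)
```

All outputs are read from the same feature-map position, which is what lets detection and alignment share one forward pass.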

Ground-truth face centers are encoded as a Gaussian heat map, and training uses a variant of focal loss to stabilize learning and improve accuracy. The trade-off between runtime efficiency and prediction accuracy is managed through data augmentation and tuned training parameters, with Adam as the optimizer.
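The sketch below illustrates the two ingredients named above: a Gaussian-encoded target heat map and a focal-loss variant. The summary says only "a variant of focal loss", so the penalty-reduced form used here (the one commonly associated with center-point detectors), the exponents `alpha=2` and `beta=4`, the fixed `sigma`, and both function names are assumptions.

```python
import math


def gaussian_heatmap(rows, cols, center, sigma):
    """Target map that is 1.0 at `center` (row, col) and decays as a Gaussian."""
    cy, cx = center
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
         for x in range(cols)]
        for y in range(rows)
    ]


def focal_loss(pred, target, alpha=2.0, beta=4.0, eps=1e-6):
    """Penalty-reduced focal loss (assumed variant), normalized by the
    number of positive cells (cells where the target equals 1.0)."""
    loss, num_pos = 0.0, 0
    for pred_row, target_row in zip(pred, target):
        for p, y in zip(pred_row, target_row):
            p = min(max(p, eps), 1 - eps)  # clamp to keep log() finite
            if y == 1.0:
                loss -= (1 - p) ** alpha * math.log(p)
                num_pos += 1
            else:
                # Cells near a center (y close to 1) are penalized less.
                loss -= (1 - y) ** beta * p ** alpha * math.log(1 - p)
    return loss / max(num_pos, 1)


target = gaussian_heatmap(5, 5, center=(2, 2), sigma=1.0)
good_pred = target                      # prediction that matches the target
flat_pred = [[0.5] * 5 for _ in range(5)]  # uninformative prediction
loss_good = focal_loss(good_pred, target)
loss_flat = focal_loss(flat_pred, target)
print(loss_good, loss_flat)
```

The `(1 - y) ** beta` factor is the design choice worth noting: it softens the penalty for predictions that fall just beside a true center, so the network is not punished heavily for near misses.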

Experimental Evaluation

CenterFace is evaluated on two benchmarks. On the FDDB dataset, which contains diverse, unconstrained face images, it reports scores of 0.980 (discontinuous) and 0.732 (continuous), outperforming many existing methods. On the WIDER FACE dataset, which is known for complex scenes and widely varying face scales, it reports average precision of 0.935/0.932 (Easy), 0.924/0.921 (Medium), and 0.875/0.873 (Hard) on the Val/Test sets. The 'Hard' subset is where it stands out, outperforming methods that use heavier backbones and more complex anchor systems.

Implications and Future Directions

The results position CenterFace as a viable tool for real-time applications where resource limitations are a constraint, such as mobile or embedded systems used in security or user-interface applications. The reduced model size of 7.2MB further reinforces its feasibility for deployment in these environments.

Future developments could focus on making CenterFace more robust in diverse real-world conditions, for example by integrating more advanced data augmentation or by exploring hybrid models that combine anchor-free and anchor-based methods. The approach could also be extended beyond face detection to broader object detection tasks, testing its robustness and adaptability in other domains.

In conclusion, CenterFace presents a significant advancement in real-time face detection and alignment, contributing a streamlined yet powerful approach that is well-suited for edge computing. With promising performance and scalability, it sets a precedent for future research in efficient object detection and feature extraction technologies.


GitHub

GitHub - Star-Clouds/CenterFace: face detection (1,317 stars)