
SSH: Single Stage Headless Face Detector (1708.03979v3)

Published 14 Aug 2017 in cs.CV

Abstract: We introduce the Single Stage Headless (SSH) face detector. Unlike two stage proposal-classification detectors, SSH detects faces in a single stage directly from the early convolutional layers in a classification network. SSH is headless. That is, it is able to achieve state-of-the-art results while removing the "head" of its underlying classification network -- i.e. all fully connected layers in the VGG-16 which contains a large number of parameters. Additionally, instead of relying on an image pyramid to detect faces with various scales, SSH is scale-invariant by design. We simultaneously detect faces with different scales in a single forward pass of the network, but from different layers. These properties make SSH fast and light-weight. Surprisingly, with a headless VGG-16, SSH beats the ResNet-101-based state-of-the-art on the WIDER dataset. Even though, unlike the current state-of-the-art, SSH does not use an image pyramid and is 5X faster. Moreover, if an image pyramid is deployed, our light-weight network achieves state-of-the-art on all subsets of the WIDER dataset, improving the AP by 2.5%. SSH also reaches state-of-the-art results on the FDDB and Pascal-Faces datasets while using a small input size, leading to a runtime of 50 ms/image on a GPU. The code is available at https://github.com/mahyarnajibi/SSH.

Citations (411)

Summary

  • The paper introduces a novel single-stage face detection framework that removes fully connected layers to reduce computational cost.
  • It leverages a headless VGG-16 architecture with multi-scale detection modules to efficiently detect faces of varying sizes.
  • Experiments on the WIDER, FDDB, and Pascal-Faces datasets show state-of-the-art accuracy; on WIDER, SSH without an image pyramid is 5X faster than the ResNet-101-based previous state of the art.

Overview of "SSH: Single Stage Headless Face Detector"

The paper introduces the Single Stage Headless (SSH) face detector, which targets speed and a small model size while still achieving state-of-the-art results. Its main contribution is detecting faces in a single stage, directly from the early convolutional layers of a classification network, using a headless version of VGG-16.
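
To give a rough sense of what removing the "head" saves, the sketch below counts the parameters of the standard ImageNet VGG-16 (13 convolutional layers with 3x3 kernels, followed by three fully connected layers on a 224x224 input). The layer shapes are those of the standard VGG-16, not figures taken from the SSH paper, and the count ignores the detection modules that SSH adds on top of the backbone, so it is only an illustration of the head-versus-backbone split.

```python
# Parameter count of the standard VGG-16 classifier (assumed layer shapes,
# not numbers reported in the SSH paper).

# (in_channels, out_channels) for the 13 conv layers, all 3x3 with bias.
CONV_LAYERS = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]

# (in_features, out_features) for the fully connected "head".
FC_LAYERS = [(512 * 7 * 7, 4096), (4096, 4096), (4096, 1000)]


def conv_params(c_in, c_out, kernel=3):
    """Weights plus biases of one kernel x kernel convolution."""
    return c_in * c_out * kernel * kernel + c_out


def fc_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


backbone_params = sum(conv_params(i, o) for i, o in CONV_LAYERS)
head_params = sum(fc_params(i, o) for i, o in FC_LAYERS)
head_fraction = head_params / (backbone_params + head_params)

print(f"conv backbone: {backbone_params:,}")
print(f"fc head:       {head_params:,}")
print(f"head share:    {head_fraction:.1%}")
```

Under these assumptions the fully connected layers hold the large majority of the classifier's parameters, which is why dropping them makes the detector light-weight.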

Key Features and Methodology

SSH has four main design features:

  1. Headless Architecture: SSH eliminates the fully connected layers, or the "head," of the VGG-16 network. This removal significantly reduces the computational complexity and parameter count, allowing the model to be both lightweight and fast.
  2. Scale-Invariance: Unlike traditional methods that rely on processing an image pyramid, SSH achieves scale-invariance by design. It detects faces of various sizes in a single forward pass by leveraging different convolutional layers within the network, each specialized for different face scales.
  3. Detection Modules: SSH employs three detection modules on feature maps with varying strides—8, 16, and 32—to detect small, medium, and large faces, respectively. This multi-scale design facilitates efficient scale handling and improves detection speed and accuracy.
  4. Enhanced Context Modeling: SSH adds convolutional layers to its detection modules to enlarge the effective receptive field. This lets a single-stage detector take in surrounding context in the way that the larger detection windows of two-stage detectors do.
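
The strides and the receptive-field idea in the list above can be illustrated with a little arithmetic. The sketch below is not the authors' code: the 1200x1600 input size is a hypothetical example, the floor division ignores padding and rounding details of a real network, and the use of stacked 3x3 stride-1 convolutions for context is an assumption made for illustration.

```python
# Illustrative arithmetic only; input size and 3x3 filters are assumptions.

STRIDES = {"small": 8, "medium": 16, "large": 32}  # strides named in the text


def feature_map_size(height, width, stride):
    """Approximate grid size of a feature map at the given stride."""
    return height // stride, width // stride


def stacked_receptive_field(num_layers, kernel=3):
    """Receptive field of num_layers stacked stride-1 convolutions."""
    return 1 + num_layers * (kernel - 1)


# Hypothetical 1200x1600 input image.
grids = {name: feature_map_size(1200, 1600, s) for name, s in STRIDES.items()}
for name, (h, w) in grids.items():
    print(f"{name:>6} faces: stride {STRIDES[name]:>2}, grid {h}x{w}")

# Each extra 3x3 layer widens the window seen by one output location.
for n in (1, 2, 3):
    rf = stacked_receptive_field(n)
    print(f"{n} stacked 3x3 conv(s): {rf}x{rf} receptive field")
```

The stride-8 map has the finest grid, which suits small faces, while the stride-32 map covers the image with far fewer locations, each seeing a larger region.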

Experimental Results

Extensive experiments were conducted on the WIDER, FDDB, and Pascal Faces datasets to validate SSH's performance:

  • WIDER Dataset: With a headless VGG-16 and no image pyramid, SSH outperformed the previous ResNet-101-based state of the art while running 5X faster. With an image pyramid added, it reached state-of-the-art results on all WIDER subsets, improving the AP by 2.5%.
  • FDDB and Pascal Faces: SSH also reached state-of-the-art results on both datasets while using a small input size, giving a runtime of 50 ms/image on a GPU.

Implications and Future Directions

SSH provides a compelling alternative to traditional two-stage detectors, especially in applications where processing speed and resource constraints are critical. The headless architecture coupled with its single-stage design offers valuable insights into reducing computational demands without sacrificing performance.

The practical implications of SSH extend to real-time face detection in various applications, including security and surveillance systems, where rapid and accurate face detection is paramount.

Future work could explore the integration of SSH with other advanced network architectures, experiment with different forms of context modeling, and further optimize the detection modules for diverse tasks beyond face detection.

In summary, SSH combines a simple single-stage design with a light-weight headless backbone and state-of-the-art accuracy, making it a useful reference point for further work on real-time face detection.
