Patch-based Probabilistic Image Quality Assessment for Face Selection and Improved Video-based Face Recognition (1304.0869v2)

Published 3 Apr 2013 in cs.CV and stat.AP

Abstract: In video based face recognition, face images are typically captured over multiple frames in uncontrolled conditions, where head pose, illumination, shadowing, motion blur and focus change over the sequence. Additionally, inaccuracies in face localisation can also introduce scale and alignment variations. Using all face images, including images of poor quality, can actually degrade face recognition performance. While one solution is to use only the "best" subset of images, current face selection techniques are incapable of simultaneously handling all of the abovementioned issues. We propose an efficient patch-based face image quality assessment algorithm which quantifies the similarity of a face image to a probabilistic face model, representing an "ideal" face. Image characteristics that affect recognition are taken into account, including variations in geometric alignment (shift, rotation and scale), sharpness, head pose and cast shadows. Experiments on FERET and PIE datasets show that the proposed algorithm is able to identify images which are simultaneously the most frontal, aligned, sharp and well illuminated. Further experiments on a new video surveillance dataset (termed ChokePoint) show that the proposed method provides better face subsets than existing face selection techniques, leading to significant improvements in recognition accuracy.

Citations (318)


Summary

The paper proposes a patch-based probabilistic approach that assesses facial image quality by comparing input patches to an ideal face model.
It leverages image normalization, patch extraction, and DCT-based feature extraction to compute robust quality scores under varying conditions.
Empirical evaluations on FERET, PIE, and ChokePoint datasets demonstrate improved face selection and enhanced video-based recognition accuracy.

Probabilistic Image Quality Assessment in Video-Based Face Recognition

The paper addresses the problem of selecting high-quality face images in uncontrolled video environments, a prerequisite for reliable video-based face recognition. The authors target the drop in recognition performance caused by low-quality face images captured across multiple frames under changing head pose, illumination, motion blur, and alignment.

Methodology Overview

The authors propose a patch-based probabilistic image quality assessment algorithm that replaces the typical fusion-based quality assessment with a single, integrated measure. The method builds a probabilistic model representing an 'ideal' face and compares each input face image to that model to derive a quality score. The process comprises five main steps:

Image Normalization: A pixel-wise logarithmic transform is applied to reduce the dynamic range of the image.
Patch Extraction: Images are divided into overlapping patches, each normalized in terms of mean and variance to handle contrast variations.
Feature Extraction: A feature vector is obtained from each patch using a 2D Discrete Cosine Transform (DCT), focusing on the top few low-frequency components to capture general facial textures.
Local Probability Calculation: Probabilities for each feature vector are calculated using a trained normal distribution model, capturing the deviation from an 'ideal' face model.
Overall Quality Score Generation: Quality scores for the complete image are generated through summation of logarithmic probabilities of all patches, enabling a holistic assessment of face quality.
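
The five steps above can be sketched in a few lines of NumPy. This is only a minimal illustration under stated assumptions, not the authors' implementation: the patch size, step, the three DCT coefficients kept, the per-location diagonal Gaussian, and the function names (`dct_matrix`, `patch_features`, `fit_ideal_model`, `quality_score`) are all hypothetical choices made for the example.

```python
import numpy as np

# Assumed choice: three low-frequency DCT coefficients, skipping the DC term
# (the DC term is zero once each patch has its mean removed).
LOW_FREQ = [(0, 1), (1, 0), (1, 1)]

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def patch_features(image, patch=8, step=4):
    """Steps 1-3: log transform, overlapping normalized patches, 2D DCT."""
    log_img = np.log1p(np.asarray(image, dtype=np.float64))
    c = dct_matrix(patch)
    height, width = log_img.shape
    feats = []
    for r in range(0, height - patch + 1, step):
        for col in range(0, width - patch + 1, step):
            block = log_img[r:r + patch, col:col + patch]
            # Zero mean, unit variance per patch to handle contrast changes.
            block = (block - block.mean()) / (block.std() + 1e-8)
            coeffs = c @ block @ c.T  # separable 2D DCT
            feats.append([coeffs[u, v] for u, v in LOW_FREQ])
    return np.asarray(feats)  # shape: (num_patches, 3)

def fit_ideal_model(images):
    """Train one diagonal Gaussian per patch location from 'ideal' faces."""
    feats = np.stack([patch_features(im) for im in images])
    return feats.mean(axis=0), feats.var(axis=0) + 1e-6

def quality_score(image, mean, var):
    """Steps 4-5: per-patch Gaussian log-probabilities, summed over patches."""
    x = patch_features(image)
    log_p = -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)
    return float(log_p.sum())
```

In this sketch, `fit_ideal_model` would be called once on a set of well-aligned, frontal, well-lit training faces of a fixed size, and `quality_score` then returns a higher (less negative) value the closer a new image of the same size is to that model.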

Empirical Evaluation

The paper provides a thorough assessment using both still image datasets (FERET and PIE) and a new video surveillance dataset (ChokePoint). Strong empirical results are reported:

On still images: The method effectively selects high-quality images in terms of alignment, pose, illumination, shadows, and sharpness. The algorithm outperforms existing fusion-based and individual classifier-based methods by successfully identifying overall high-quality face images across diverse conditions.
On video sequences: The proposed technique substantially improves face verification accuracy by selecting top-quality subsets, demonstrating the method's robustness and efficacy in real-world surveillance environments.
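
Selecting a top-quality subset from a video sequence amounts to ranking frames by their quality score and keeping the best few. The helper below is a hedged sketch of that step; the name `select_best_frames`, the subset size, and the example scores are hypothetical and not taken from the paper.

```python
def select_best_frames(scores, n):
    """Return the indices of the n highest-scoring frames, best first."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:n]

# Hypothetical per-frame quality scores (sums of log-probabilities).
frame_scores = [-120.5, -80.2, -300.0, -95.1]
best = select_best_frames(frame_scores, 2)  # frames 1 and 3
```

Only the selected frames would then be passed to the face recognition system, which is how a quality measure of this kind can improve accuracy without changing the recognizer itself.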

Implications and Future Directions

The proposed quality assessment strategy is practical for video-based face recognition. It replaces the complex integration of multiple face image characteristics with a single quality metric and, because it does not require intensive computation, is suited to real-time use.

The approach yields better face subsets and therefore higher recognition accuracy, which matters most in settings such as surveillance, where image quality is highly variable. Because the method does not depend on system-specific details, it can be adapted to different recognition systems without extensive retraining.

Although the current results are promising, future research could enrich the model so that it handles subtle variations better, particularly in illumination and expression, without compromising generalization. Evaluations on larger-scale video datasets could also show how best to integrate the method into complete recognition systems and how well it holds up across diverse real-world applications.
