Wing Loss for Robust Facial Landmark Localisation with Convolutional Neural Networks (1711.06753v5)

Published 17 Nov 2017 in cs.CV

Abstract: We present a new loss function, namely Wing loss, for robust facial landmark localisation with Convolutional Neural Networks (CNNs). We first compare and analyse different loss functions including L2, L1 and smooth L1. The analysis of these loss functions suggests that, for the training of a CNN-based localisation model, more attention should be paid to small and medium range errors. To this end, we design a piece-wise loss function. The new loss amplifies the impact of errors from the interval (-w, w) by switching from L1 loss to a modified logarithm function. To address the problem of under-representation of samples with large out-of-plane head rotations in the training set, we propose a simple but effective boosting strategy, referred to as pose-based data balancing. In particular, we deal with the data imbalance problem by duplicating the minority training samples and perturbing them by injecting random image rotation, bounding box translation and other data augmentation approaches. Last, the proposed approach is extended to create a two-stage framework for robust facial landmark localisation. The experimental results obtained on AFLW and 300W demonstrate the merits of the Wing loss function, and prove the superiority of the proposed method over the state-of-the-art approaches.

Summary

The paper introduces Wing loss, a novel loss function designed to improve the accuracy and robustness of CNNs for facial landmark localization by being more sensitive to small and medium errors.
A Pose-Based Data Balancing (PDB) strategy and a two-stage localization framework are proposed alongside Wing loss to address data imbalance and refine landmark detection.
Evaluations on standard datasets show that the proposed Wing loss and techniques reduce normalized mean error significantly compared to previous methods, improving performance across different network architectures.

Overview of "Wing Loss for Robust Facial Landmark Localisation with Convolutional Neural Networks"

The paper "Wing Loss for Robust Facial Landmark Localisation with Convolutional Neural Networks" addresses facial landmark localization, a prerequisite for many computer vision tasks. It introduces a new loss function, Wing loss, designed to improve the robustness and accuracy of Convolutional Neural Network (CNN)-based landmark localization.

Introduction to Facial Landmark Localization Challenges

Facial landmark localization involves identifying the coordinates of specific key points on a face, which is a fundamental step for numerous face-related tasks such as recognition and 3D reconstruction. Traditional localization models, while effective in controlled environments, often struggle with unconstrained conditions due to diverse factors such as pose, expression, and occlusion, thus necessitating more robust solutions.

Key Contributions

Novel Wing Loss Function: The paper first analyses existing loss functions (L1, L2, and smooth L1) and concludes that training should pay more attention to small and medium errors. Wing loss is a piecewise function that addresses this: it behaves like L1 for large errors and switches to a modified logarithmic function for errors inside the interval (-w, w), which amplifies the influence of small and medium errors during training. The L1-like behaviour for large errors keeps the influence of outliers bounded.
Pose-Based Data Balancing (PDB): The paper identifies an imbalance in training data, particularly under-represented samples with significant head rotations. To address this, a data augmentation strategy known as pose-based data balancing is introduced. By replicating and perturbing minority samples, the method effectively balances the dataset, thereby enhancing model performance on diverse pose variations.
Two-Stage Localization Framework: A two-stage approach is implemented to refine landmark localization, where an initial CNN performs rapid rough localization, followed by a second network for fine-grained localization. This mitigates inaccuracies in initial face detection and effectively handles in-plane rotations.
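
The piecewise definition of Wing loss can be illustrated with a short NumPy sketch. It follows the form given in the published paper, wing(x) = w * ln(1 + |x| / eps) for |x| < w and |x| - C otherwise, where C = w - w * ln(1 + w / eps) makes the two pieces meet at |x| = w. The function name `wing_loss`, the mean reduction, and the default values of `w` and `eps` are assumptions chosen for illustration, not the authors' released code.

```python
import numpy as np

def wing_loss(pred, target, w=10.0, eps=2.0):
    """Mean Wing loss over all landmark coordinates.

    w   : half-width of the interval (-w, w) in which the log piece is used
    eps : controls the curvature of the log piece
    (w=10, eps=2 are illustrative defaults, not prescribed by this summary.)
    """
    x = np.abs(np.asarray(pred, dtype=float) - np.asarray(target, dtype=float))
    # Offset that makes the logarithmic and linear pieces join at |x| = w.
    c = w - w * np.log(1.0 + w / eps)
    # Log piece for small/medium errors, shifted L1 for large errors.
    loss = np.where(x < w, w * np.log(1.0 + x / eps), x - c)
    return loss.mean()

# Usage: two landmarks (x, y) with one small and one large coordinate error.
pred = np.array([[30.0, 40.0], [55.0, 80.0]])
target = np.array([[31.0, 40.0], [55.0, 50.0]])
print(wing_loss(pred, target))
```

Near zero the log piece has slope w / eps, so small errors receive a larger gradient than they would under plain L1, while errors beyond w are penalised linearly.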

Experimental Validation and Numerical Results

The proposed Wing loss function and associated strategies were evaluated on the AFLW and 300W datasets, commonly used benchmarks for facial landmark localization. The results clearly demonstrate the superiority of the Wing loss over conventional loss functions, with significant reductions in normalized mean error. Notably, the Wing loss led to an average error decrease of over 13% compared to leading approaches.
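
For readers unfamiliar with the metric, normalized mean error (NME) is commonly computed as the mean Euclidean distance between predicted and ground-truth landmarks, divided by a per-face normalising distance (for example face size or an inter-ocular distance, depending on the benchmark protocol). The sketch below is a generic version under that assumption; the function name and the array shapes are hypothetical and not taken from the paper.

```python
import numpy as np

def normalized_mean_error(pred, gt, norm_dist):
    """NME over a batch of faces.

    pred, gt  : arrays of shape (N, L, 2), N faces with L landmarks each
    norm_dist : array of shape (N,), per-face normalising distance
                (which distance to use depends on the benchmark; assumed given)
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    norm_dist = np.asarray(norm_dist, dtype=float)
    # Euclidean distance for every landmark of every face: shape (N, L).
    per_point = np.linalg.norm(pred - gt, axis=-1)
    # Average over landmarks, then normalise each face by its own distance.
    per_face = per_point.mean(axis=1) / norm_dist
    return per_face.mean()

# Usage: one face, two landmarks, normalising distance of 10 pixels.
gt = np.zeros((1, 2, 2))
pred = np.array([[[3.0, 4.0], [0.0, 0.0]]])
print(normalized_mean_error(pred, gt, norm_dist=[10.0]))
```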

Additionally, the experiments showcased the effectiveness of the PDB strategy and the two-stage framework, with further improvements achieved by integrating these methodologies. Furthermore, the experiments highlighted the Wing loss's potential across different network architectures, including simple CNN models and complex ResNet frameworks.
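
The duplication step behind pose-based data balancing can be sketched as simple oversampling over pose bins. The sketch assumes each training sample already has a scalar pose score (for example an estimated yaw); how that score is obtained, the number of bins, and the function name `pose_balanced_indices` are assumptions made for illustration. The perturbation of the duplicates (random rotation, bounding box translation, and similar augmentations) would be applied later, when the samples are loaded, and is not shown.

```python
import numpy as np

def pose_balanced_indices(pose_scores, num_bins=9, seed=0):
    """Return sample indices in which every non-empty pose bin is
    oversampled (with replacement) to the size of the largest bin."""
    rng = np.random.default_rng(seed)
    pose_scores = np.asarray(pose_scores, dtype=float)
    edges = np.linspace(pose_scores.min(), pose_scores.max(), num_bins + 1)
    # Use interior edges only, so the maximum score lands in the last bin.
    bin_ids = np.digitize(pose_scores, edges[1:-1])
    counts = np.bincount(bin_ids, minlength=num_bins)
    target = counts.max()
    chosen = []
    for b in range(num_bins):
        members = np.flatnonzero(bin_ids == b)
        if members.size == 0:
            continue  # nothing to duplicate in an empty bin
        # Keep every original sample and add random duplicates up to target.
        extra = rng.choice(members, size=target - members.size, replace=True)
        chosen.append(np.concatenate([members, extra]))
    return np.concatenate(chosen)

# Usage: mostly frontal samples (score 0) plus a few rotated ones.
scores = [0, 0, 0, 0, 1, 2, 2]
idx = pose_balanced_indices(scores, num_bins=3)
print(np.bincount(np.digitize(np.asarray(scores, dtype=float)[idx], [2 / 3, 4 / 3])))
```

Each index returned more than once marks a duplicate that should receive a fresh random perturbation, so the network never sees exact copies of the minority samples.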

Implications and Future Directions

The research offers substantial improvements in the domain of facial landmark localization, setting a precedent for future work. The Wing loss function, due to its robust handling of outliers and small errors, holds potential applications in other regression tasks within computer vision. As model architectures continue to evolve, integrating specialized loss functions like Wing loss will be pivotal in enhancing deep learning capabilities.

Future work might extend Wing loss to other regression problems in computer vision, refine the data augmentation and balancing techniques, and examine how the loss behaves when combined with a wider range of network architectures.
