Adaptive Wing Loss for Robust Face Alignment via Heatmap Regression (1904.07399v3)

Published 16 Apr 2019 in cs.CV

Abstract: Heatmap regression with a deep network has become one of the mainstream approaches to localize facial landmarks. However, the loss function for heatmap regression is rarely studied. In this paper, we analyze the ideal loss function properties for heatmap regression in face alignment problems. Then we propose a novel loss function, named Adaptive Wing loss, that is able to adapt its shape to different types of ground truth heatmap pixels. This adaptability penalizes loss more on foreground pixels while less on background pixels. To address the imbalance between foreground and background pixels, we also propose Weighted Loss Map, which assigns high weights on foreground and difficult background pixels to help training process focus more on pixels that are crucial to landmark localization. To further improve face alignment accuracy, we introduce boundary prediction and CoordConv with boundary coordinates. Extensive experiments on different benchmarks, including COFW, 300W and WFLW, show our approach outperforms the state-of-the-art by a significant margin on various evaluation metrics. Besides, the Adaptive Wing loss also helps other heatmap regression tasks. Code will be made publicly available at https://github.com/protossw512/AdaptiveWingLoss.

Summary

The paper introduces the Adaptive Wing Loss, a novel loss function that dynamically adjusts error sensitivity to improve facial landmark localization.
It employs a Weighted Loss Map to emphasize crucial pixels during training, and adds boundary prediction and CoordConv with boundary coordinates to further improve accuracy.
The approach achieves superior performance with reduced NME and failure rates on COFW, 300W, and WFLW datasets under challenging conditions.

Adaptive Wing Loss for Robust Face Alignment via Heatmap Regression

The paper "Adaptive Wing Loss for Robust Face Alignment via Heatmap Regression" contributes to facial landmark detection with deep learning. It improves the loss function used for heatmap regression, focusing on facial landmark localization, a key component of face-related applications such as recognition, frontalization, and reconstruction.

Overview

The paper identifies a key problem with the Mean Square Error (MSE) loss used in heatmap regression: it is insensitive to small errors. As a result, predicted heatmaps are often blurry and have low intensity on foreground pixels. To address this, the authors propose the Adaptive Wing (AWing) loss, which changes its shape based on the ground truth pixel value and thereby improves the model's ability to localize landmarks accurately.
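To make the shape-adapting behavior concrete, here is a minimal NumPy sketch of an AWing-style loss. The piecewise form (a logarithmic branch for errors below a threshold theta, a linear branch above it, with the exponent alpha - y depending on the ground truth value y) and the default hyperparameters are not stated in this summary; they are assumptions based on the form commonly attributed to the paper and should be checked against the original. The function name and signature are illustrative.

```python
import numpy as np

def adaptive_wing_loss(pred, target, alpha=2.1, omega=14.0, epsilon=1.0, theta=0.5):
    """Per-pixel AWing-style loss (sketch; defaults are assumed, not from this summary)."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)

    diff = np.abs(target - pred)
    # The exponent depends on the ground truth value: close to 1 on
    # foreground pixels (y near 1), close to 2 on background pixels (y near 0).
    power = alpha - target
    ratio = theta / epsilon

    # A and C make the two branches meet smoothly at diff == theta.
    A = omega * (1.0 / (1.0 + ratio ** power)) * power * ratio ** (power - 1.0) / epsilon
    C = theta * A - omega * np.log1p(ratio ** power)

    nonlinear = omega * np.log1p((diff / epsilon) ** power)  # small errors
    linear = A * diff - C                                    # large errors
    return np.where(diff < theta, nonlinear, linear)
```

With these assumed defaults, the same 0.1 error costs more on a foreground pixel (`adaptive_wing_loss(0.9, 1.0)`) than on a background pixel (`adaptive_wing_loss(0.1, 0.0)`), which is the behavior the overview describes. The function returns a per-pixel map; a training loop would typically average it.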

Contributions

Adaptive Wing Loss: The AWing loss adapts its shape to the ground truth pixel value. It is more sensitive to small errors on foreground pixels and keeps an MSE-like behavior on background pixels. Its influence shrinks as the error approaches zero, which helps training converge.
Weighted Loss Map: In response to the imbalance between foreground and background pixels, this approach assigns higher weights to crucial pixels, such as those on landmarks or close to them, emphasizing their significance during training.
Integration of Boundary Information: By embedding boundary prediction within the model, along with the use of CoordConv layers that integrate coordinate information, the architectural design enhances localization accuracy through improved context understanding.
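The Weighted Loss Map idea from the list above can be sketched as follows. This is an illustration under assumptions, not the authors' code: the 3x3 grey dilation, the 0.2 binarization threshold, and the weight of 10 are values recalled from the paper rather than stated in this summary, and the helper names are hypothetical. The dilation is written in plain NumPy to keep the example self-contained.

```python
import numpy as np

def grey_dilate_3x3(heatmap):
    """3x3 grey dilation: each pixel becomes the max of its 3x3 neighborhood."""
    h, w = heatmap.shape
    padded = np.pad(heatmap, 1, mode="constant", constant_values=-np.inf)
    shifted = [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
    return np.max(np.stack(shifted), axis=0)

def weighted_loss_map(target, weight=10.0, threshold=0.2):
    """Per-pixel weights: (weight + 1) on and near foreground, 1 elsewhere.

    The threshold and weight defaults are assumptions for illustration.
    """
    target = np.asarray(target, dtype=np.float64)
    # Dilating first extends the mask to background pixels that border the
    # foreground, which are the hard ones to regress.
    mask = (grey_dilate_3x3(target) >= threshold).astype(np.float64)
    return weight * mask + 1.0
```

In use, the map is multiplied elementwise with a per-pixel loss before averaging, so foreground pixels and their immediate neighbors dominate the gradient while the remaining background still contributes with weight 1.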

With these contributions, the proposed model achieved superior results on several benchmark datasets, including COFW, 300W, and WFLW. It was also more robust under challenging conditions such as large pose variations and occlusions.

Results

The method improved Normalized Mean Error (NME), Failure Rate (FR), and Area Under the Curve (AUC) across the test datasets:

COFW Dataset: On this occlusion-focused benchmark, the model substantially reduced the failure rate compared with prior methods.
300W Dataset: On both the general evaluation and the challenging subsets, the proposed approach consistently outperformed existing methods.
WFLW Dataset: The model performed well on this more complex dataset, which is harder because of pose, occlusion, and outlier conditions.
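For readers unfamiliar with the metrics above, the following sketch shows how NME and failure rate are typically computed. The normalization distance (for example an inter-ocular distance) and the 0.10 failure threshold vary by benchmark and are assumptions here, not values taken from this summary. The function names are illustrative.

```python
import numpy as np

def normalized_mean_error(pred, gt, norm_distance):
    """Mean Euclidean landmark error for one face, divided by a reference distance.

    pred, gt: arrays of shape (num_landmarks, 2).
    norm_distance: benchmark-specific normalizer (assumed to be supplied by the caller).
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    per_landmark = np.linalg.norm(pred - gt, axis=-1)
    return float(per_landmark.mean() / norm_distance)

def failure_rate(nme_values, threshold=0.10):
    """Fraction of faces whose NME exceeds the threshold (0.10 is an assumed choice)."""
    nme_values = np.asarray(nme_values, dtype=np.float64)
    return float(np.mean(nme_values > threshold))
```

Lower is better for both quantities. AUC, by contrast, is the area under the cumulative error distribution curve up to a chosen NME cutoff, so higher is better.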

Implications and Future Directions

The AWing loss is not limited to face alignment; it can potentially benefit other heatmap regression tasks, such as human pose estimation. Future work may tailor the AWing loss to other visual tasks and further tune its parameters. Additionally, integrating more complex coordinate transformations and boundary predictions could improve convolutional architectures in settings where spatial context matters.

By adjusting the loss function to handle the different roles that pixels play in a heatmap, this research contributes a more nuanced approach to deep learning-based image processing. The adaptability and robustness shown in the results suggest that such methods could apply to a broad range of AI and computer vision problems.