Suppressing Uncertainties for Large-Scale Facial Expression Recognition
- The paper introduces a Self-Cure Network that mitigates annotation uncertainties using weighted loss, ranking regularization, and a relabeling mechanism.
- It employs a self-attention module to assign importance weights, reducing overfitting on ambiguous facial images.
- The approach outperforms state-of-the-art methods on RAF-DB, AffectNet, and FERPlus, demonstrating robust accuracy on noisy datasets.
The paper presents a novel Self-Cure Network (SCN) designed to improve the efficacy of Facial Expression Recognition (FER) by addressing uncertainties inherent in large-scale datasets. Annotating diverse and complex facial images often results in inconsistencies and errors due to ambiguous expressions, low-quality images, and subjective biases among annotators. These challenges can significantly hinder the training and performance of deep learning models in FER.
Methodology
The proposed SCN is structured around three core modules:
- Self-Attention Importance Weighting: This module applies a self-attention mechanism that assigns an importance weight to each training sample. To mitigate overfitting to uncertain samples, a fully connected layer maps each image's deep feature to a scalar importance weight, which is then used for loss weighting. The resulting weighted loss, termed Logit-Weighted Cross-Entropy Loss (WCE-Loss), scales each sample's logits by its weight so that more reliable samples contribute more to model updates.
- Ranking Regularization: To make the model favor certain samples over uncertain ones, the SCN incorporates a Rank Regularization (RR) module. The importance weights are ranked and split into a high-importance group and a low-importance group, and a rank regularization loss enforces a margin between the mean weights of the two groups. This constraint pushes the SCN to assign clearly higher weights to confident samples and to suppress the influence of ambiguous data.
- Relabeling Mechanism: Samples that fall into the low-importance group are considered for relabeling. If the maximum predicted probability exceeds the probability of the given label by a defined margin, the sample is reassigned the predicted class as a pseudo-label. This mechanism improves training label quality by correcting potentially mislabeled data.
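The logit-weighting idea behind WCE-Loss can be illustrated with a minimal, framework-agnostic NumPy sketch (the function name and array shapes are illustrative, not the paper's code): each sample's logits are multiplied by its importance weight before the usual softmax cross-entropy is applied.

```python
import numpy as np

def logit_weighted_ce(logits, alpha, labels):
    """Sketch of a Logit-Weighted Cross-Entropy loss.

    logits: (N, C) raw class scores, alpha: (N,) importance weights,
    labels: (N,) integer class labels.
    """
    # Scale each sample's logits by its importance weight alpha_i.
    scaled = alpha[:, None] * logits
    # Standard numerically stable log-softmax.
    scaled = scaled - scaled.max(axis=1, keepdims=True)
    log_probs = scaled - np.log(np.exp(scaled).sum(axis=1, keepdims=True))
    # Negative log-likelihood of the labeled class, averaged over samples.
    return -log_probs[np.arange(len(labels)), labels].mean()
```

With all weights equal to 1 this reduces to ordinary cross-entropy; driving a sample's weight toward 0 flattens its scaled logits toward a uniform distribution, so an uncertain sample exerts less pull on the gradients.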
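The rank regularization step can be sketched the same way. This is a minimal NumPy version under stated assumptions: the split ratio and margin are hyperparameters (the defaults below are illustrative), and the loss penalizes the model only when the high- and low-importance group means are closer than the margin.

```python
import numpy as np

def rank_regularization_loss(alpha, high_ratio=0.7, margin=0.15):
    """Sketch of a rank regularization (RR) loss over importance weights.

    alpha: (N,) importance weights for one mini-batch.
    """
    # Rank the weights from highest to lowest.
    ranked = np.sort(alpha)[::-1]
    # Split into a high-importance group and a low-importance group.
    k = int(round(high_ratio * len(alpha)))
    mean_high = ranked[:k].mean()
    mean_low = ranked[k:].mean()
    # Hinge: penalize only when the group means are not separated by the margin.
    return max(0.0, margin - (mean_high - mean_low))
```

When the weights collapse toward a single value the loss equals the full margin, pushing the attention module to spread its weights; once the groups are separated by at least the margin, the loss vanishes.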
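Finally, the relabeling rule for low-importance samples is a simple margin test on the softmax output. The sketch below assumes a probability vector per sample; the margin value is a hyperparameter and the function name is hypothetical.

```python
import numpy as np

def maybe_relabel(probs, label, margin=0.2):
    """Sketch of the relabeling rule for a low-importance sample.

    probs: (C,) predicted class probabilities, label: given integer label.
    Returns the (possibly corrected) label.
    """
    pred = int(np.argmax(probs))
    # Reassign only if the top prediction beats the given label by the margin.
    if probs[pred] - probs[label] > margin:
        return pred
    return label
```

Applying the rule only to the low-importance group keeps the mechanism conservative: confidently labeled samples are never touched, and ambiguous predictions (within the margin) leave the original annotation in place.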
Results
The SCN outperforms state-of-the-art methods on multiple benchmark datasets: on RAF-DB, AffectNet, and FERPlus it achieves accuracies of 88.14%, 60.23%, and 89.35%, respectively. These results underscore the network's robustness to uncertainty, validated further on both synthetic and real-world noisy datasets, including a new WebEmotion dataset. The consistent gains under label noise indicate SCN's resilience in practical FER scenarios where data imperfections are prevalent.
Implications and Future Work
The development of SCN represents a meaningful step towards more reliable FER systems. Practically, it holds promise for applications needing robust emotion detection, such as human-computer interaction and behavioral analysis. Theoretically, SCN contributes to understanding how uncertainty can be systematically managed in data-driven AI models.
Future research directions may explore enhancing SCN by integrating more sophisticated attention mechanisms or incorporating temporal information for video-based FER. Additionally, applying SCN to other domains with similar label-uncertainty issues could broaden its applicability across deep learning tasks.
In conclusion, the Self-Cure Network offers a structured and effective approach to tackling annotation uncertainties in FER, paving the way for more reliable emotion recognition systems.