- The paper introduces an additive margin loss (AM-Softmax) that subtracts a fixed margin from the target-class cosine score, enhancing inter-class separation and reducing intra-class variance.
- It employs feature and weight normalization with a scaling parameter, simplifying the optimization process for deep face models.
- Experimental results show a 93.51% verification rate on LFW BLUFR and 72.47% Rank-1 accuracy on MegaFace, outperforming previous methods.
Additive Margin Softmax for Face Verification: An Overview
Paper: Additive Margin Softmax for Face Verification
Authors: Feng Wang, Weiyang Liu, Haijun Liu, Jian Cheng.
Introduction
The paper addresses face verification through the development of a new loss function known as Additive Margin Softmax (AM-Softmax). The work improves the classification loss used to train deep face verification models so that learned features exhibit larger inter-class separation and smaller intra-class variation.
Motivation
Prior methods such as Large-margin Softmax (L-Softmax) and Angular Softmax (A-Softmax) incorporate the angular margin multiplicatively, using ψ(θ) = cos(mθ) with an integer margin factor m. Such approaches enhance the discriminative power of deep features, but the multiplicative form is harder to interpret and to train. The authors posit that an additive margin provides a more intuitive and interpretable framework, alongside being straightforward to implement.
Proposed Method
The core innovation of this paper is the AM-Softmax loss function, which introduces an additive margin by modifying the traditional softmax formulation. The margin function is defined explicitly as: ψ(θ) = cos θ − m
where θ is the angle between the feature vector and the corresponding class weight vector, and m is a margin hyperparameter. Substituting ψ(θ_{y_i}) for cos θ_{y_i} at the target class enforces a fixed margin in cosine space, contributing to better feature discrimination.
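To make the effect concrete: with a margin of, say, m = 0.35 (a value in the range the paper explores), a target-class cosine score of 0.80 enters the softmax as 0.80 − 0.35 = 0.45, so the sample keeps being pulled toward its class until the target cosine exceeds every non-target cosine by more than m.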
Implementation Details
The practical implementation involves normalizing both feature vectors and class weight vectors to lie on a hypersphere, followed by scaling these values using a hyperparameter s. The resulting loss function becomes:
L_AMS = −(1/n) ∑_{i=1}^{n} log ( e^{s(cos θ_{y_i} − m)} / ( e^{s(cos θ_{y_i} − m)} + ∑_{j≠y_i} e^{s cos θ_j} ) )
This normalization and scaling constrain the features to a hypersphere of radius s, making angular differences the sole component of class separation.
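To make the formulation concrete, the following is a minimal PyTorch sketch of the loss. It is an illustration rather than the authors' released code; the class name AMSoftmaxLoss is mine, and the defaults s = 30 and m = 0.35 are values in the range the paper experiments with.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AMSoftmaxLoss(nn.Module):
    """Additive Margin Softmax loss (illustrative sketch, not reference code)."""

    def __init__(self, feat_dim, num_classes, s=30.0, m=0.35):
        super().__init__()
        # One weight vector per class; normalized in forward().
        self.weight = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.s = s  # scale applied after normalization
        self.m = m  # additive margin subtracted from the target cosine

    def forward(self, features, labels):
        # Normalize features and weights so the logits are cosine similarities.
        x = F.normalize(features, dim=1)
        w = F.normalize(self.weight, dim=1)
        cos_theta = x @ w.t()  # shape: (batch, num_classes)
        # Subtract the margin m from the target-class cosine only.
        one_hot = F.one_hot(labels, num_classes=w.size(0)).to(cos_theta.dtype)
        logits = self.s * (cos_theta - self.m * one_hot)
        # Cross-entropy on the scaled, margin-adjusted cosines is exactly L_AMS.
        return F.cross_entropy(logits, labels)
```

In practice such a module replaces the final fully connected layer and softmax cross-entropy of the backbone; at test time the normalized features alone are compared by cosine similarity for verification.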
Experimental Results
The proposed AM-Softmax loss function was evaluated on benchmark datasets LFW and MegaFace, demonstrating superior performance compared to existing state-of-the-art methods. Key results include:
- LFW BLUFR Protocol: Achieved a verification rate (VR) of 97.69% at FAR=0.1% and a VR of 93.51% at FAR=0.01%.
- MegaFace: Recorded a Rank-1 accuracy of 72.47% and a VR of 84.44% at FAR=1e-6.
Discussion
Geometric Interpretation
With normalized weights, the decision boundary of traditional softmax is the hyperplane of directions equidistant from the two class weight vectors. In contrast, AM-Softmax shifts this single boundary into a marginal region whose width is controlled by the margin m, providing clear and interpretable decision boundaries.
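Concretely, for two classes with normalized features and weights, traditional softmax assigns a sample to class 1 whenever cos θ₁ > cos θ₂. Under AM-Softmax, class 1 requires cos θ₁ − m ≥ cos θ₂ and class 2 requires cos θ₂ − m ≥ cos θ₁, leaving a band of width m in cosine space between the two decision regions.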
Feature Normalization
Feature normalization is motivated by the observed correlation between a feature's L2 norm and the quality of the input image: low-quality images tend to yield features with small norms. After normalization, the gradient with respect to a feature is inversely proportional to its norm, so training effectively pays more attention to low-quality images, which helps achieve more robust performance across varying image qualities.
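A small self-contained check illustrates this gradient behavior; it is my own sketch, not code from the paper. Two features pointing in the same direction but with different norms receive gradients whose magnitudes scale roughly as 1/‖x‖:

```python
import torch
import torch.nn.functional as F

for norm in (1.0, 10.0):
    # Same direction, different magnitude: x is a uniform vector with L2 norm `norm`.
    x = torch.full((8,), norm / 8 ** 0.5, requires_grad=True)
    y = F.normalize(x, dim=0)
    # Any loss that depends only on the direction of x.
    target = torch.zeros(8)
    target[0] = 1.0
    loss = -(y * target).sum()
    loss.backward()
    print(f"||x|| = {norm:5.1f}  ->  ||grad|| = {x.grad.norm():.4f}")
```

The printed gradient norm for ‖x‖ = 10 is roughly a tenth of that for ‖x‖ = 1, matching the 1/‖x‖ scaling and hence the larger updates given to low-norm features.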
Comparative Analysis
The simplicity of AM-Softmax, which converges stably without the annealing strategy that A-Softmax requires and therefore needs less hyperparameter tuning, together with its superior performance metrics, underscores its practical applicability. Furthermore, visualization experiments show that AM-Softmax achieves compact, well-separated feature clusters with minimal intra-class variance.
Implications and Future Directions
The AM-Softmax loss function contributes both to the theoretical understanding of margin-based loss functions and the practical advancement of face recognition technologies. The clear geometric properties and implementation simplicity make it appealing for deployment in real-world systems. Future research could explore adaptive margin schemes, potentially incorporating class-specific or sample-specific margins to enhance performance further.
Conclusion
This paper presents a compelling alternative to existing margin-based loss functions with the introduction of AM-Softmax. By combining additive margins with feature normalization, the method not only achieves superior performance but also offers ease of implementation and intuitive understanding, contributing significantly to the advancement of deep face verification technologies.