Circle Loss: A Unified Perspective of Pair Similarity Optimization (2002.10857v2)

Published 25 Feb 2020 in cs.CV

Abstract: This paper provides a pair similarity optimization viewpoint on deep feature learning, aiming to maximize the within-class similarity $s_p$ and minimize the between-class similarity $s_n$. We find a majority of loss functions, including the triplet loss and the softmax plus cross-entropy loss, embed $s_n$ and $s_p$ into similarity pairs and seek to reduce $(s_n-s_p)$. Such an optimization manner is inflexible, because the penalty strength on every single similarity score is restricted to be equal. Our intuition is that if a similarity score deviates far from the optimum, it should be emphasized. To this end, we simply re-weight each similarity to highlight the less-optimized similarity scores. It results in a Circle loss, which is named due to its circular decision boundary. The Circle loss has a unified formula for two elemental deep feature learning approaches, i.e. learning with class-level labels and pair-wise labels. Analytically, we show that the Circle loss offers a more flexible optimization approach towards a more definite convergence target, compared with the loss functions optimizing $(s_n-s_p)$. Experimentally, we demonstrate the superiority of the Circle loss on a variety of deep feature learning tasks. On face recognition, person re-identification, as well as several fine-grained image retrieval datasets, the achieved performance is on par with the state of the art.

Citations (805)

Summary

  • The paper introduces Circle Loss that re-weights pair similarities to enhance optimization flexibility and establish a clear convergence target.
  • It employs a circular decision boundary and adaptive weighting factors to surpass traditional loss functions like triplet and softmax cross-entropy.
  • Experimental results show improved performance in face recognition, person re-identification, and fine-grained image retrieval compared to state-of-the-art methods.

Circle Loss: A Unified Perspective of Pair Similarity Optimization

Introduction

The paper "Circle Loss: A Unified Perspective of Pair Similarity Optimization" offers a novel approach to optimize pair similarities in deep feature learning. Authors Yifan Sun, Changmao Cheng, Yuhan Zhang, Chi Zhang, Liang Zheng, Zhongdao Wang, and Yichen Wei tackle the inherent inflexibility and ambiguous convergence status associated with traditional loss functions such as triplet loss and softmax cross-entropy loss. By introducing Circle loss, the authors aim to enhance optimization flexibility and achieve a more definite convergence target. This summary will explore the core contributions, technical innovations, and implications of Circle loss as presented in the paper.

Core Contributions

The authors primarily contribute the following:

  1. Unified Perspective on Learning Paradigms: They provide a cohesive view of two learning paradigms—class-level label learning and pair-wise label learning—under the Circle loss framework.
  2. Flexible Optimization: Circle loss re-weights pair similarities so that similarity scores deviating far from the optimum receive stronger penalties, allowing each score to be optimized at its own pace.
  3. Definite Convergence Target: The proposed loss function establishes a circular decision boundary, resulting in a more specific and stable convergence status.

Technical Innovations

Circle loss differentiates itself from traditional loss functions through several technical aspects that yield significant performance gains.

  1. Weighted Similarity Optimization: Traditional loss functions like triplet loss aim to reduce the difference between within-class similarity $s_p$ and between-class similarity $s_n$ by optimizing $(s_n - s_p)$. This approach enforces an equal penalty strength on every similarity score, lacking flexibility. Circle loss overcomes this by dynamically adjusting the penalty according to each score's distance from its optimum, optimizing $\alpha_n s_n - \alpha_p s_p$ instead, where $\alpha_n$ and $\alpha_p$ are adaptive, non-negative weighting factors (see the sketch after this list).
  2. Circular Decision Boundary: By introducing a circular decision boundary in the $(s_n, s_p)$ space, Circle loss promotes a more precise convergence target. With the fixed optima $O_p = 1 + m$ and $O_n = -m$ and margins $\Delta_p = 1 - m$ and $\Delta_n = m$, the boundary condition $\alpha_n(s_n - \Delta_n) = \alpha_p(s_p - \Delta_p)$ reduces to:

    $(s_n - 0)^2 + (s_p - 1)^2 = 2m^2,$

    where $m$ is the relaxation margin.

  3. Hyper-parameter Robustness: Circle loss has only two hyper-parameters, the scale factor $\gamma$ and the relaxation margin $m$, and demonstrates robustness across a wide range of settings. This simplicity follows from fixing the optimal points of the within-class and between-class similarities at $O_p = 1 + m$ and $O_n = -m$, so that both optima and both margins derive from the single margin $m$.
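
For reference, the unified Circle loss given in the paper, over $K$ within-class scores $s_p^i$ and $L$ between-class scores $s_n^j$, is:

    $\mathcal{L}_{circle} = \log\Big[1 + \sum_{j=1}^{L}\exp\big(\gamma\,\alpha_n^j (s_n^j - \Delta_n)\big)\sum_{i=1}^{K}\exp\big(-\gamma\,\alpha_p^i (s_p^i - \Delta_p)\big)\Big],$

    with $\alpha_p^i = [O_p - s_p^i]_+$ and $\alpha_n^j = [s_n^j - O_n]_+$, where $[\cdot]_+$ denotes clipping at zero.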
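
To make the re-weighting concrete, here is a minimal PyTorch sketch of this loss; the function and variable names are our own, and the defaults follow the face-recognition setting reported in the paper ($\gamma = 256$, $m = 0.25$).

```python
import torch
import torch.nn.functional as F

def circle_loss(sp: torch.Tensor, sn: torch.Tensor,
                m: float = 0.25, gamma: float = 256.0) -> torch.Tensor:
    """Unified Circle loss over 1-D tensors of within-class (sp) and
    between-class (sn) cosine-similarity scores."""
    # Optima and margins are all derived from the single margin m.
    Op, On = 1.0 + m, -m      # optimal points for sp and sn
    dp, dn = 1.0 - m, m       # decision margins for sp and sn

    # Adaptive non-negative weights: the farther a score is from its
    # optimum, the larger its weight and hence its penalty. Detached so
    # they act as constants (a common implementation choice) and
    # gradients flow only through sp and sn themselves.
    ap = torch.clamp_min(Op - sp.detach(), 0.0)
    an = torch.clamp_min(sn.detach() - On, 0.0)

    logit_p = -gamma * ap * (sp - dp)
    logit_n = gamma * an * (sn - dn)

    # log(1 + sum_j exp(logit_n_j) * sum_i exp(logit_p_i)), computed
    # stably as softplus(logsumexp(logit_n) + logsumexp(logit_p)).
    return F.softplus(torch.logsumexp(logit_n, dim=0)
                      + torch.logsumexp(logit_p, dim=0))
```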
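
An illustrative pair-wise usage, where the batch construction is a stand-in rather than the paper's sampling scheme: the cosine-similarity matrix of a batch of L2-normalized embeddings is split into within-class and between-class entries and fed to `circle_loss`.

```python
# Hypothetical batch: 32 random embeddings with 8 dummy class labels.
raw = torch.randn(32, 128, requires_grad=True)
emb = F.normalize(raw, dim=1)                 # unit-norm features
labels = torch.randint(0, 8, (32,))

sim = emb @ emb.t()                           # pairwise cosine similarities
same = labels.unsqueeze(0) == labels.unsqueeze(1)
off_diag = ~torch.eye(32, dtype=torch.bool)   # exclude self-similarity

loss = circle_loss(sim[same & off_diag], sim[~same])
loss.backward()                               # gradients flow into raw
```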

Experimental Validation

The efficacy of Circle loss is validated across various domains:

  1. Face Recognition: Using the MS-Celeb-1M dataset for training and multiple datasets for testing (including MegaFace Challenge 1 and IJB-C), Circle loss outperforms state-of-the-art methods such as AM-Softmax and ArcFace. For instance, with ResNet100 it achieves 98.50% Rank-1 identification accuracy and 98.73% verification accuracy on MegaFace Challenge 1.
  2. Person Re-identification: Tested on the Market-1501 and MSMT17 datasets, Circle loss outperforms traditional Softmax, AM-Softmax, and state-of-the-art models such as MGN in both Rank-1 accuracy and mAP.
  3. Fine-grained Image Retrieval: On datasets such as CUB-200-2011, Cars196, and Stanford Online Products, Circle loss offers results comparable with advanced techniques designed specifically for pair-wise learning tasks.

Implications

The introduction of Circle loss has practical and theoretical implications. From a practical standpoint, it improves the discriminative power of deep features across various tasks, enhancing state-of-the-art systems in face recognition, person re-identification, and fine-grained image retrieval. Theoretically, Circle loss underscores the importance of flexible and balanced optimization in deep feature learning, promoting a shift towards more adaptive and fine-grained control of similarity metrics.

Future Directions

Circle loss opens several avenues for future exploration:

  1. Integration with Other Models: Investigating the integration of Circle loss with other tasks and models to evaluate its broad applicability.
  2. Further Theoretical Analysis: Extending the theoretical analysis of Circle loss to better understand its benefits and potential limitations in different contexts.
  3. Automated Hyper-parameter Tuning: Developing approaches for automated tuning of $\gamma$ and $m$ to further simplify the use of Circle loss in various applications.

By fostering a more nuanced understanding and optimization of similarity metrics in deep feature learning, Circle loss stands to contribute significantly to advancements in AI and computer vision.
