Gradient Harmonized Single-stage Detector (1811.05181v1)

Published 13 Nov 2018 in cs.CV

Abstract: Despite the great success of two-stage detectors, the single-stage detector is still a more elegant and efficient approach, yet it suffers from two well-known disharmonies during training, i.e. the huge difference in quantity between positive and negative examples as well as between easy and hard examples. In this work, we first point out that the essential effect of the two disharmonies can be summarized in terms of the gradient. Further, we propose a novel gradient harmonizing mechanism (GHM) as a hedge against the disharmonies. The philosophy behind GHM can be easily embedded into both classification loss functions like cross-entropy (CE) and regression loss functions like smooth-$L_1$ ($SL_1$) loss. To this end, two novel loss functions called GHM-C and GHM-R are designed to balance the gradient flow for anchor classification and bounding box refinement, respectively. An ablation study on MS COCO demonstrates that, without laborious hyper-parameter tuning, both GHM-C and GHM-R bring substantial improvements to a single-stage detector. Without bells and whistles, our model achieves 41.6 mAP on the COCO test-dev set, surpassing the state-of-the-art method, Focal Loss (FL) + $SL_1$, by 0.8.

Citations (502)

Summary

  • The paper introduces novel loss functions, GHM-C and GHM-R, to rebalance gradient contributions and address sample imbalance in training.
  • It demonstrates improved training stability and efficiency, reaching 41.6 mAP on the MS COCO test-dev set, a 0.8 mAP gain over the prior state of the art.
  • The proposed method shows broad applicability, extending benefits to both single-stage and two-stage detectors in object detection tasks.

Gradient Harmonized Single-stage Detector

The paper "Gradient Harmonized Single-stage Detector" authored by Buyu Li, Yu Liu, and Xiaogang Wang from the Chinese University of Hong Kong explores the inherent challenges faced by single-stage object detection frameworks compared to their two-stage counterparts. This paper introduces a novel approach using a Gradient Harmonizing Mechanism (GHM) to enhance the training efficiency and detection performance of single-stage detectors.

Context and Problem Statement

Single-stage detectors are favored for their efficiency and simplicity but are plagued by the imbalance between positive and negative samples and the disparity between easy and hard examples during training. This imbalance leads to inefficient training and suboptimal performance. Traditional remedies like Online Hard Example Mining (OHEM) discard a large fraction of samples outright, while Focal Loss reshapes the loss with fixed hyper-parameters that require careful tuning and do not adapt to shifts in the data distribution.

Proposed Solution: Gradient Harmonizing Mechanism

The authors address these imbalances through a novel lens: the distribution of gradient norms. They posit that both the class imbalance and the easy-hard imbalance can be reduced to an imbalance in the gradient norm distribution. The proposed GHM dynamically adjusts the gradient contributions of different samples during training, leading to two new loss functions: GHM-C for classification and GHM-R for regression.
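
In the paper's notation, for a sample with predicted probability $p$ and ground-truth label $p^* \in \{0, 1\}$, the gradient norm of the sigmoid cross-entropy loss serves as a measure of difficulty, and each example is re-weighted by the inverse of its gradient density:

$$g = |p - p^*|, \qquad GD(g) = \frac{1}{l_\epsilon(g)} \sum_{k=1}^{N} \delta_\epsilon(g_k, g), \qquad \beta_i = \frac{N}{GD(g_i)},$$

$$L_{GHM\text{-}C} = \frac{1}{N} \sum_{i=1}^{N} \beta_i \, L_{CE}(p_i, p_i^*) = \sum_{i=1}^{N} \frac{L_{CE}(p_i, p_i^*)}{GD(g_i)},$$

where $\delta_\epsilon(g_k, g)$ indicates whether $g_k$ falls in a region of length $\epsilon$ centered at $g$, and $l_\epsilon(g)$ is the (boundary-clipped) length of that region.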

GHM aims to harmonize the contribution of gradients, reducing the overwhelming influence of numerous easy examples and stabilizing the effect of outliers. By re-weighting examples based on their gradient density, GHM achieves more balanced training.
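
As a rough illustration, the density weighting can be sketched in a few lines of PyTorch. This is a minimal sketch, not the authors' released code: the bin count, the function name, and the normalization by non-empty bins (which follows common open-source implementations) are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def ghm_c_loss(logits, targets, bins=30):
    """Sketch of GHM-C: re-weight binary cross-entropy by inverse
    gradient density, estimated with a histogram over `bins` unit
    regions of the gradient norm. `targets` is a float tensor of 0/1."""
    # Gradient norm of sigmoid cross-entropy w.r.t. the logit: g = |p - p*|
    g = (torch.sigmoid(logits) - targets).abs().detach()

    n = g.numel()
    weights = torch.zeros_like(g)
    edges = torch.linspace(0.0, 1.0, bins + 1)
    edges[-1] += 1e-6                      # keep g == 1 inside the last bin

    nonempty = 0
    for k in range(bins):
        in_bin = (g >= edges[k]) & (g < edges[k + 1])
        count = in_bin.sum().item()
        if count > 0:
            # beta_i ∝ N / GD(g_i): samples in crowded bins (floods of
            # easy negatives, or clusters of outliers) are down-weighted.
            weights[in_bin] = n / count
            nonempty += 1
    if nonempty > 0:
        weights = weights / nonempty       # keep the overall loss scale stable

    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    return (weights * ce).sum() / n
```

With a batch of anchor logits and 0/1 float targets, `ghm_c_loss(logits, targets)` drops in where a plain cross-entropy or focal loss would otherwise be used.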

Key Contributions

  1. GHM-C and GHM-R Loss Functions: The introduction of these loss functions enables efficient control of gradient flow during training without extensive hyper-parameter tuning. They provide robustness to changes in data distribution, making the training process more efficient and stable.
  2. Implementation Efficiency: A unit-region approximation makes computing the gradient density cheap, and an exponential moving average (EMA) over the per-region counts stabilizes training, so the method keeps training time competitive without compromising performance (see the sketch after this list).
  3. Empirical Evaluation: Experiments conducted on the MS COCO dataset demonstrate significant improvements over baseline methods, with a 0.8 mAP increase over the previous state-of-the-art Focal Loss method.
  4. Broad Applicability: Although designed for single-stage detectors, GHM-R has shown benefits for two-stage detectors as well, indicating the generalizability of the proposed mechanism beyond its original scope.
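
To apply the same treatment to the regression branch, the paper replaces smooth-$L_1$ with an "authentic" smooth-$L_1$ (ASL1) whose gradient norm is bounded in $[0, 1)$, and stabilizes the per-region counts across iterations with an EMA. A minimal sketch follows; $\mu = 0.02$ matches the paper's reported setting, while the class name and momentum default are illustrative assumptions.

```python
import torch

def asl1(d, mu=0.02):
    """Authentic smooth-L1 from the paper: ASL1(d) = sqrt(d^2 + mu^2) - mu.
    Its gradient d / sqrt(d^2 + mu^2) has norm in [0, 1), so the same
    histogram-based density weighting as GHM-C applies (GHM-R)."""
    return torch.sqrt(d * d + mu * mu) - mu

class EmaBinCounts:
    """EMA of per-region sample counts across iterations,
    S_k(t) = alpha * S_k(t-1) + (1 - alpha) * R_k(t), substituted for
    the raw counts R_k when estimating gradient density."""
    def __init__(self, bins, alpha=0.75):
        self.s = torch.zeros(bins)
        self.alpha = alpha

    def update(self, counts):
        self.s = self.alpha * self.s + (1.0 - self.alpha) * counts
        return self.s
```

During training, the histogram counts from each iteration are passed through `update()`, and the smoothed counts stand in for $R_k$ when computing the density weights; this is the stabilization referred to in item 2 above.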

Results and Implications

The results on the COCO test-dev set indicate that the implementation of GHM-C and GHM-R in the RetinaNet framework achieves a mean average precision (mAP) of 41.6, surpassing state-of-the-art methods. This suggests that optimizing gradient flow can effectively manage training imbalances typically observed in object detection tasks.

The implications of this research are twofold. Practically, it offers a more robust approach to training single-stage detectors, potentially improving their adoption in scenarios where computational efficiency is paramount. Theoretically, it sheds new light on how gradient distributions can be manipulated to address training challenges in machine learning.

Future Directions

Future research avenues may include further refining the understanding of the optimal gradient distribution during training. Additionally, exploring the integration of GHM with alternative object detection architectures and tasks could provide deeper insights into its potential application scope.

While this paper offers a compelling advance in balancing gradient contributions, it remains to be seen how different GHM configurations affect other model architectures and learning tasks, which could pave the way for broader adoption across AI applications.
