Bilateral-Branch Network with Cumulative Learning for Long-Tailed Visual Recognition
The paper, "BBN: Bilateral-Branch Network with Cumulative Learning for Long-Tailed Visual Recognition," addresses the pervasive challenge in visual recognition tasks characterized by long-tailed distributions. In such datasets, a few classes dominate in terms of data volume, while most classes are significantly underrepresented. The researchers propose a novel approach that seeks to balance the conventional class re-balancing strategies—namely re-weighting and re-sampling—with a new perspective that aims to maintain both effective representation learning and classifier learning.
Core Contributions and Methodology
The paper presents the Bilateral-Branch Network (BBN), augmented with a cumulative learning strategy aimed at improving long-tailed visual recognition:
- The BBN model integrates two branches that share all weights except the last residual block: a conventional learning branch trained with uniform sampling to preserve representation learning, and a re-balancing branch fed by a reversed sampler that over-samples tail classes. The two branches' outputs are aggregated into a single prediction, as sketched after this list.
- The cumulative learning strategy dynamically shifts emphasis from representation learning to classifier learning over the course of training. It modulates the trade-off parameter α with a parabolic decay, α = 1 - (T / T_max)^2, where T is the current epoch and T_max the total number of epochs, so the network first learns robust universal features and then gradually concentrates on classifier performance, particularly for tail classes.
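The following is a minimal sketch of these ingredients as the paper describes them: the parabolic schedule for α, the reversed-sampler class probabilities (proportional to N_max / N_i), and the α-weighted mixing of the two branches' logits and losses. Function and variable names are illustrative, not taken from the authors' released code.

```python
import numpy as np

def alpha_schedule(epoch: int, total_epochs: int) -> float:
    """Parabolic decay alpha = 1 - (T / T_max)^2: close to 1 early
    (favoring the conventional branch) and near 0 late (favoring
    the re-balancing branch)."""
    return 1.0 - (epoch / total_epochs) ** 2

def reversed_sampling_probs(n_per_class):
    """Reversed-sampler class probabilities: class i is drawn with
    probability proportional to w_i = N_max / N_i, so tail classes
    are over-sampled."""
    n = np.asarray(n_per_class, dtype=np.float64)
    w = n.max() / n
    return w / w.sum()

def bbn_loss(alpha, logits_conv, logits_rebal, y_conv, y_rebal, cross_entropy):
    """Mix the two branches' logits with alpha, then weight the
    cross-entropy against each branch's label by alpha and 1 - alpha."""
    mixed = alpha * logits_conv + (1.0 - alpha) * logits_rebal
    return (alpha * cross_entropy(mixed, y_conv)
            + (1.0 - alpha) * cross_entropy(mixed, y_rebal))

# Example: a head-heavy class distribution and the schedule's endpoints.
print(reversed_sampling_probs([5000, 500, 50]))          # tail class drawn most often
print(alpha_schedule(0, 200), alpha_schedule(200, 200))  # 1.0 -> 0.0
```

Because α starts near 1, early training is dominated by the uniformly sampled branch (representation learning); as α decays, both the mixed logits and the loss increasingly reflect the reversed-sampler branch.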
Experimental Validation
The authors validate their approach through extensive experiments on long-tailed versions of CIFAR-10 and CIFAR-100 as well as the naturally long-tailed iNaturalist 2017 and iNaturalist 2018 datasets. Key results highlight:
- Long-tailed CIFAR: BBN consistently outperforms state-of-the-art methods and baselines (CE-DRW, CE-DRS, Mixup, LDAM-DRW) across imbalance ratios. On long-tailed CIFAR-10 with an imbalance ratio of 100, BBN achieves a top-1 error rate of 20.18%, markedly lower than the best-performing baseline.
- iNaturalist Experiments: On the large-scale, long-tailed datasets iNaturalist 2017 and 2018, the BBN model also achieves superior performance, with substantial reductions in top-1 error rates compared to LDAM-DRW and other baselines.
Analysis and Insights
The researchers examine the mechanics of re-balancing strategies, demonstrating that while these strategies enhance classifier learning, they impair the quality of the learned representations. By decoupling training into separate representation-learning and classifier-learning stages in controlled experiments (see the sketch below), they show that re-weighting and re-sampling yield inferior feature representations despite improved classifier performance.
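A minimal sketch of that decoupling protocol, assuming a torchvision-style model whose classifier is a final `fc` layer (the helper name and setup here are hypothetical): stage one trains the full network with one strategy; stage two freezes the backbone, re-initializes the classifier, and retrains only that layer with another strategy, so the resulting grid of accuracies separates each strategy's effect on representations from its effect on the classifier.

```python
import torch.nn as nn

def retrain_classifier(model: nn.Module, num_classes: int) -> nn.Module:
    """Stage two of the decoupling experiment: keep the learned
    representations fixed and train only a fresh linear classifier."""
    for p in model.parameters():
        p.requires_grad = False  # freeze the stage-one backbone
    # Replacing `fc` creates new, trainable parameters; only these are
    # updated in stage two, so accuracy differences across sampling /
    # weighting strategies isolate their effect on the classifier.
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model
```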
Moreover, the paper compares different adaptive strategies for the trade-off parameter α. The parabolic decay used in BBN is empirically shown to be the most effective, keeping α high for enough epochs that representation learning is well established before emphasis transitions to classifier learning; the sketch below contrasts it with alternative schedules.
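As an illustration only (the schedule names are descriptive, and this set approximates rather than reproduces the paper's ablation), here is how candidate schedules allocate training between the two phases, with t = T / T_max:

```python
import math

# Candidate schedules for alpha as a function of training progress
# t = T / T_max in [0, 1]; "parabolic_decay" is BBN's choice.
schedules = {
    "equal_weight":    lambda t: 0.5,
    "linear_decay":    lambda t: 1.0 - t,
    "cosine_decay":    lambda t: 0.5 * (1.0 + math.cos(math.pi * t)),
    "parabolic_decay": lambda t: 1.0 - t ** 2,
}

# Parabolic decay keeps alpha highest over the longest stretch of
# training, i.e., it devotes the most epochs to representation learning
# before shifting emphasis to the re-balancing branch.
for name, f in schedules.items():
    print(f"{name:16s}", [round(f(i / 4), 3) for i in range(5)])
```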
Implications and Future Work
The findings and methodology of this paper have significant implications for developing robust visual recognition models in imbalanced data scenarios. The BBN model's balanced approach to enhancing representation and classifier learning offers a comprehensive solution that can be applied across various domains, from ecology to medical image analysis, where long-tailed distributions are prevalent.
Future research could adapt the BBN framework to other tasks, such as object detection and segmentation, under long-tailed distributions. Exploring alternative adaptive strategies for α, as well as hybrid models that integrate other state-of-the-art techniques, could further improve performance and applicability.
In summary, this paper presents a meticulous and well-validated approach to addressing long-tailed visual recognition. By leveraging a bilateral-branch network and a cumulative learning strategy, the authors strike a critical balance between maintaining robust feature representations and enhancing classifier learning, setting a new benchmark for future research in this domain.