- The paper introduces a novel loss function that reweights class losses based on the effective number of samples to mitigate data imbalance.
- It applies a theoretical framework that models diminishing returns of additional samples, yielding significant improvements on CIFAR, iNaturalist, and ImageNet datasets.
- The study offers practical insights for visual recognition tasks and proposes future work on refining data overlap estimation and cross-domain applications.
Class-Balanced Loss Based on Effective Number of Samples
The paper "Class-Balanced Loss Based on Effective Number of Samples" addresses the challenge of training deep neural networks on long-tailed datasets where a small number of classes dominate the dataset, leading to poor performance for under-represented classes. The authors introduce a theoretical framework to redefine the notion of the number of samples by incorporating the concept of effective samples, which accounts for data overlap present in the dataset.
Theoretical Framework for Effective Number of Samples
The core idea behind this work is the diminishing marginal utility of additional samples due to overlap in the data. The authors present a framework in which each sample covers a small neighboring region of the feature space rather than a single point. The effective number of samples, defined as the volume covered by these neighboring regions, is given by:

$$E_n = \frac{1 - \beta^n}{1 - \beta},$$
where $n$ is the number of samples and $\beta \in [0, 1)$ is a hyperparameter. As $n$ grows, the marginal benefit of each additional sample shrinks and $E_n$ saturates at its upper bound $1/(1 - \beta)$, capturing the redundancy in the data.
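A minimal numeric sketch of this formula, in plain NumPy with an illustrative choice of $\beta$, shows the saturation effect:

```python
import numpy as np

def effective_num(n, beta):
    """Effective number of samples: E_n = (1 - beta^n) / (1 - beta)."""
    return (1.0 - np.power(beta, n)) / (1.0 - beta)

# With beta = 0.99, the value of extra samples saturates quickly:
for n in [1, 10, 100, 1000, 10000]:
    print(f"n = {n:>5}  E_n = {effective_num(n, beta=0.99):.2f}")
# E_n approaches the upper bound 1 / (1 - beta) = 100 as n grows.
```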
Designing the Class-Balanced Loss
Building upon this framework, the paper proposes a novel class-balanced loss that weights each sample's loss by the inverse of the effective number of samples of its class. This approach is designed to mitigate the impact of imbalanced datasets without the drawbacks of traditional re-sampling (duplicated or discarded data) and inverse-frequency re-weighting. Formally, for a sample with class label $y$, the class-balanced (CB) loss is defined as:

$$\mathrm{CB}(\mathbf{p}, y) = \frac{1 - \beta}{1 - \beta^{n_y}}\, L(\mathbf{p}, y),$$

where $n_y$ is the number of training samples in class $y$, and $L$ is the original loss function applied to the predicted probabilities $\mathbf{p}$.
The class-balanced term $\frac{1 - \beta}{1 - \beta^{n_y}}$ re-weights the loss per class and smoothly interpolates between no re-weighting ($\beta = 0$) and re-weighting by inverse class frequency ($\beta \to 1$). The term can be combined with various loss functions, including softmax cross-entropy loss, sigmoid cross-entropy loss, and focal loss.
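As a concrete illustration, here is a minimal PyTorch sketch of the class-balanced softmax cross-entropy. The function name and defaults are hypothetical; the normalization of the weights to sum to the number of classes follows the paper's description, and `samples_per_class` is assumed to hold the training-set class counts:

```python
import torch
import torch.nn.functional as F

def class_balanced_ce(logits, targets, samples_per_class, beta=0.9999):
    """Softmax cross-entropy re-weighted by the inverse effective number of samples."""
    n = torch.as_tensor(samples_per_class, dtype=torch.float32, device=logits.device)
    weights = (1.0 - beta) / (1.0 - beta ** n)  # (1 - beta) / (1 - beta^{n_y}) per class
    weights = weights / weights.sum() * len(samples_per_class)  # normalize to sum to C
    return F.cross_entropy(logits, targets, weight=weights)

# Usage with hypothetical long-tailed class counts:
logits = torch.randn(8, 3)
targets = torch.randint(0, 3, (8,))
loss = class_balanced_ce(logits, targets, samples_per_class=[5000, 500, 50])
```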
Empirical Evaluation
The authors validate the efficacy of their method through extensive experiments on both artificially created long-tailed CIFAR datasets and real-world datasets including iNaturalist and ImageNet. The results highlight significant performance improvements when using the class-balanced loss compared to traditional losses.
Long-Tailed CIFAR
For artificially created long-tailed CIFAR-10 and CIFAR-100 datasets with varying imbalance factors, the class-balanced loss consistently outperformed baseline methods. Notably, the optimal choice of β varied between datasets, with larger values of β yielding better performance for CIFAR-10, while smaller values were preferable for the more fine-grained CIFAR-100 dataset.
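One way to build intuition for this sensitivity is to print the normalized class-balanced weights for a hypothetical long-tailed count vector at several values of $\beta$:

```python
import numpy as np

counts = np.array([5000, 500, 50])  # hypothetical long-tailed class counts
for beta in [0.9, 0.99, 0.999, 0.9999]:
    w = (1.0 - beta) / (1.0 - np.power(beta, counts))
    w = w / w.sum() * len(counts)   # normalize to sum to the number of classes
    print(beta, np.round(w, 3))
# beta -> 0 recovers uniform weights (no re-balancing), while
# beta -> 1 approaches inverse class-frequency weighting.
```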
Large-Scale Real-World Datasets
On the iNaturalist and ImageNet datasets, the class-balanced focal loss demonstrated substantial gains over the softmax cross-entropy loss. For instance, on iNaturalist 2018, ResNet-101 with class-balanced focal loss recorded a top-1 error of 36.12%, significantly lower than 42.57% achieved by the softmax variant.
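Since the reported gains come from the class-balanced focal loss, a minimal PyTorch sketch of that variant may be helpful. It follows the paper's recipe of multiplying the sigmoid focal term $(1 - p_t)^\gamma$ by the effective-number weight of each sample's ground-truth class; the function name and default hyperparameters are illustrative, not the authors' code:

```python
import torch
import torch.nn.functional as F

def class_balanced_focal(logits, targets, samples_per_class, beta=0.9999, gamma=2.0):
    """Sigmoid focal loss weighted by the inverse effective number of samples (sketch)."""
    n = torch.as_tensor(samples_per_class, dtype=torch.float32, device=logits.device)
    weights = (1.0 - beta) / (1.0 - beta ** n)      # class-balanced term per class
    weights = weights / weights.sum() * len(samples_per_class)

    one_hot = F.one_hot(targets, num_classes=logits.size(1)).float()
    # Per-class sigmoid cross-entropy with focal modulation.
    bce = F.binary_cross_entropy_with_logits(logits, one_hot, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * one_hot + (1.0 - p) * (1.0 - one_hot)  # probability of the correct side
    focal = (1.0 - p_t) ** gamma * bce
    # Weight each sample by the CB term of its ground-truth class.
    return (weights[targets].unsqueeze(1) * focal).sum(dim=1).mean()
```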
Practical Implications and Future Directions
This research introduces a theoretically grounded, practical approach to handle the imbalance in long-tailed datasets. By quantifying the effective number of samples, the proposed method provides a more robust and less heuristic-dependent solution compared to traditional re-weighting strategies. The adaptability of the class-balanced loss makes it applicable to a broad range of visual recognition tasks.
Future work may explore further refinements in estimating data overlap through model-specific assumptions or learning-based approaches. Additionally, investigating the extension of this framework to other domains beyond visual recognition could uncover broader applications and benefits.
In summary, the introduction of class-balanced loss based on effective number of samples presents a substantial step forward in addressing class imbalance in large-scale datasets, offering a theoretically sound and empirically validated approach that enhances model performance on under-represented classes.