A systematic study of the class imbalance problem in convolutional neural networks (1710.05381v2)

Published 15 Oct 2017 in cs.CV, cs.AI, cs.LG, cs.NE, and stat.ML

Abstract: In this study, we systematically investigate the impact of class imbalance on classification performance of convolutional neural networks (CNNs) and compare frequently used methods to address the issue. Class imbalance is a common problem that has been comprehensively studied in classical machine learning, yet very limited systematic research is available in the context of deep learning. In our study, we use three benchmark datasets of increasing complexity, MNIST, CIFAR-10 and ImageNet, to investigate the effects of imbalance on classification and perform an extensive comparison of several methods to address the issue: oversampling, undersampling, two-phase training, and thresholding that compensates for prior class probabilities. Our main evaluation metric is area under the receiver operating characteristic curve (ROC AUC) adjusted to multi-class tasks since overall accuracy metric is associated with notable difficulties in the context of imbalanced data. Based on results from our experiments we conclude that (i) the effect of class imbalance on classification performance is detrimental; (ii) the method of addressing class imbalance that emerged as dominant in almost all analyzed scenarios was oversampling; (iii) oversampling should be applied to the level that completely eliminates the imbalance, whereas the optimal undersampling ratio depends on the extent of imbalance; (iv) as opposed to some classical machine learning models, oversampling does not cause overfitting of CNNs; (v) thresholding should be applied to compensate for prior class probabilities when overall number of properly classified cases is of interest.

Class Imbalance in CNNs: A Systematic Investigation and Comparison of Mitigation Methods

This paper presents a comprehensive study of the impact of class imbalance on the classification performance of convolutional neural networks (CNNs) and compares several widely used methods to address this issue. Class imbalance, a prevalent and well-studied problem in traditional machine learning, has seen limited systematic research within the domain of deep learning, particularly concerning CNNs. The study aims to bridge that gap using three benchmark datasets of increasing complexity: MNIST, CIFAR-10, and ImageNet.

The paper employs an extensive experimental setup to scrutinize the effects of class imbalance on classification and to evaluate various mitigation strategies. Four primary methods are analyzed:

  • Oversampling: Replicating instances of minority classes until the class distribution is balanced (see the sketch after this list).
  • Undersampling: Removing instances of majority classes.
  • Two-phase training: Pre-training on a dataset balanced by sampling, then fine-tuning on the original imbalanced data.
  • Thresholding: Adjusting the network's outputs at inference time to compensate for prior class probabilities.

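As a concrete illustration of the first strategy, below is a minimal NumPy sketch of random oversampling. The function name `oversample_to_balance` and the array-based interface are illustrative assumptions, not the paper's code; what it does encode is the paper's finding that sampling should continue until the imbalance is completely eliminated (`target = counts.max()`).

```python
import numpy as np

def oversample_to_balance(X, y, rng=None):
    """Randomly replicate minority-class examples (with replacement)
    until every class matches the size of the largest class.
    Illustrative sketch; not the paper's implementation."""
    rng = np.random.default_rng() if rng is None else rng
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()  # full imbalance removal, per the paper's finding
    indices = []
    for c, n in zip(classes, counts):
        idx = np.flatnonzero(y == c)
        # draw with replacement to reach the majority-class count
        extra = rng.choice(idx, size=target - n, replace=True)
        indices.append(np.concatenate([idx, extra]))
    order = rng.permutation(np.concatenate(indices))
    return X[order], y[order]

# Toy usage: a 9:1 binary imbalance becomes perfectly balanced.
X = np.random.randn(1000, 32, 32, 3)   # e.g. image tensors
y = np.repeat([0, 1], [900, 100])
X_bal, y_bal = oversample_to_balance(X, y)
print(np.bincount(y_bal))              # -> [900 900]
```
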
The principal evaluation metric employed in the paper is the area under the receiver operating characteristic curve (ROC AUC), adjusted to multi-class tasks. ROC AUC is chosen over overall accuracy because accuracy is misleading on imbalanced data: a classifier that always predicts the majority class can achieve high accuracy while being useless on the minority classes.

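The paper's exact multi-class adjustment of ROC AUC is not spelled out in this summary; a standard choice, shown in the sketch below, is the macro-average of one-vs-rest AUCs, which scikit-learn computes directly.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Toy example: 3 classes, per-class probability scores (rows sum to 1).
y_true = np.array([0, 0, 1, 2, 2, 2])
y_score = np.array([
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.2, 0.6, 0.2],
    [0.1, 0.2, 0.7],
    [0.2, 0.2, 0.6],
    [0.3, 0.3, 0.4],
])

# Macro-averaged one-vs-rest AUC: each class is scored against the rest,
# and the per-class AUCs are averaged with equal weight, so minority
# classes count as much as majority ones.
auc = roc_auc_score(y_true, y_score, multi_class="ovr", average="macro")
print(f"multi-class ROC AUC: {auc:.3f}")
```
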
Key findings from the paper include:

  1. Detrimental Effect of Imbalance: Class imbalance significantly affects the classification performance of CNNs, leading to biased learning where the model disproportionately favors the majority class.
  2. Dominance of Oversampling: Among the methods tested, oversampling emerged as the most effective approach across almost all scenarios. This method works by replicating instances of the minority class until the class distribution is balanced.
  3. Optimal Application of Sampling Methods: Oversampling should be applied to the level that completely eliminates the imbalance. Conversely, the optimal undersampling ratio depends on the extent of the imbalance, suggesting a need for careful tuning when undersampling is used.
  4. No Overfitting from Oversampling in CNNs: Unlike some classical machine learning algorithms, CNNs do not exhibit overfitting when subjected to oversampling. This characteristic makes oversampling a particularly robust method for deep learning models faced with imbalanced data.
  5. Utility of Thresholding: Adjusting classification outputs to compensate for prior class probabilities proves useful when the goal is to maximize the total number of correctly classified instances (a minimal sketch follows this list).
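
Below is a minimal sketch of thresholding, under the assumption that compensating for priors means dividing softmax outputs by class frequencies estimated from the training set; the helper name `threshold_by_priors` and the exact correction form are illustrative, not taken from the paper.

```python
import numpy as np

def threshold_by_priors(probs, train_labels, n_classes):
    """Divide each class's predicted probability by that class's prior
    (its relative frequency in the training set), then predict the class
    with the largest corrected score. Assumes every class appears at
    least once in train_labels, so no prior is zero."""
    priors = np.bincount(train_labels, minlength=n_classes) / len(train_labels)
    corrected = probs / priors          # broadcasts over the class axis
    return corrected.argmax(axis=1)

# Toy usage: a network biased toward majority class 0.
probs = np.array([[0.55, 0.45],
                  [0.97, 0.03]])
train_labels = np.array([0] * 90 + [1] * 10)        # 9:1 training imbalance
print(threshold_by_priors(probs, train_labels, 2))  # -> [1 0]
```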

Implications and Future Directions

The practical implications of this research are substantial, particularly for fields relying heavily on image classification where class imbalance is a routine obstacle. For instance, medical image analysis often encounters this issue due to the rarity of pathological conditions compared to normal cases.

Theoretically, this paper contributes to a deeper understanding of how class imbalance affects CNNs and reinforces the necessity of appropriate mitigation strategies. The findings advocate for a default use of oversampling to handle imbalanced data in CNNs, promoting more balanced and reliable performance.

Future investigations could explore the integration of these methods with more advanced data augmentation techniques or the development of new algorithms designed to inherently counteract class imbalance. Additionally, expanding the study to other types of neural network architectures and various domain-specific datasets would provide further insights into the generalizability of these findings.

Conclusion

This systematic investigation offers valuable evidence on the detrimental effects of class imbalance on CNN performance and identifies effective strategies to mitigate these effects. The dominance of oversampling as a mitigation technique, coupled with the nuanced use of undersampling and thresholding, provides a clear pathway for practitioners aiming to improve model performance in the presence of imbalanced datasets. These insights pave the way for further research and development in the efficient handling of class imbalance in deep learning.

Authors (3)
  1. Mateusz Buda (5 papers)
  2. Atsuto Maki (22 papers)
  3. Maciej A. Mazurowski (51 papers)
Citations (2,172)