- The paper introduces cost-sensitive loss modifications that enhance CNN feature learning on imbalanced datasets.
- It incorporates tailored versions of MSE, SVM hinge, and CE losses to prioritize under-represented classes.
- Experimental results on six datasets demonstrate superior accuracy, F-measure, and G-mean performance over traditional methods.
Cost-Sensitive Learning of Deep Feature Representations from Imbalanced Data
Introduction
The paper "Cost-Sensitive Learning of Deep Feature Representations from Imbalanced Data" addresses a significant challenge in machine learning, particularly in the domain of image classification: class imbalance. This issue arises when certain classes are under-represented (minority classes), and others are over-represented (majority classes), leading to biased learning outcomes that favor the majority classes. The authors propose a cost-sensitive learning approach integrated into Convolutional Neural Networks (CNNs) to robustly learn feature representations for both majority and minority classes without altering the original data distribution.
Methodology
The core of the proposed solution is to modify the learning process of CNNs by incorporating class-dependent costs into the loss function. The main contributions are cost-sensitive versions of three widely used loss functions: Mean Squared Error (MSE), Support Vector Machine (SVM) hinge loss, and Cross-Entropy (CE) loss. These modifications let the network learn features for under-represented classes more effectively, as shown schematically below.
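In schematic form (the notation here is illustrative, not lifted verbatim from the paper), the key idea is that the class-dependent cost matrix ξ enters the training objective through the network outputs rather than through resampled data:

```latex
% Schematic cost-sensitive objective over M training samples:
% \theta = CNN parameters, \xi = class-dependent cost matrix,
% d^{(i)} = desired output, y^{(i)} = cost-modulated network output.
E(\theta, \xi) = \frac{1}{M} \sum_{i=1}^{M} \ell\!\left(d^{(i)},\, y^{(i)}(\theta, \xi)\right)
```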
Cost Matrix
A fundamental aspect of the approach is its adaptive cost matrix. Unlike traditional, hand-specified cost matrices, it constrains all costs to be positive and within the range (0, 1], which keeps training stable. The matrix is set automatically from data statistics, such as the class distribution and class separability, so the class-sensitive penalties adapt over the course of training; a simplified sketch follows.
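The sketch below is a simplification rather than the paper's exact rule: it derives per-class costs from class frequencies alone (the paper additionally folds in class-separability statistics), but it preserves the key constraint that every cost lies in (0, 1]:

```python
import numpy as np

def class_costs_from_frequencies(labels, num_classes):
    """Illustrative cost assignment: rarer classes receive costs closer to 1.

    Simplified stand-in for the paper's adaptive cost matrix; it uses
    class frequencies only, but keeps every cost inside (0, 1].
    """
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    counts = np.maximum(counts, 1.0)   # guard against empty classes
    raw = counts.max() / counts        # inverse-frequency weighting
    return raw / raw.max()             # rescale so all costs fall in (0, 1]

# Example: a 3-class problem with a 100:10:1 imbalance.
labels = np.array([0] * 100 + [1] * 10 + [2] * 1)
print(class_costs_from_frequencies(labels, 3))  # -> [0.01, 0.1, 1.0]
```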
Loss Functions
The cost-sensitive modifications for MSE, SVM hinge loss, and CE loss are described as follows (a sketch of the cross-entropy variant appears after this list):
- Cost-Sensitive MSE: Incorporates class-specific penalties directly into the logistic function.
- Cost-Sensitive SVM Hinge Loss: Adjusts scores based on class-dependent costs.
- Cost-Sensitive CE Loss: Integrates class-dependent costs into the softmax function, maintaining calibration for classification tasks.
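As one concrete reading of the cross-entropy variant, assuming the class-dependent costs scale the exponentiated scores inside the softmax (and simplifying the paper's full class-to-class cost matrix to a single cost per class), a minimal sketch:

```python
import numpy as np

def cost_sensitive_softmax_ce(logits, target, costs):
    """Cost-sensitive cross-entropy for one sample (illustrative sketch).

    logits : (K,) raw network outputs o
    target : int, index of the true class
    costs  : (K,) class-dependent costs xi, each in (0, 1]
    """
    z = logits - logits.max()        # stabilize exp() numerically
    weighted = costs * np.exp(z)     # costs enter inside the softmax
    probs = weighted / weighted.sum()
    return -np.log(probs[target])

# Minority classes carry larger costs, so the modified softmax shifts
# probability mass toward them, countering the majority-class bias.
logits = np.array([2.0, 0.5, 0.1])
costs = np.array([0.01, 0.1, 1.0])   # e.g., from the frequency sketch above
print(cost_sensitive_softmax_ce(logits, target=2, costs=costs))
```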
Experimental Results
The effectiveness of the proposed method is validated on six diverse image classification datasets:
- Edinburgh Dermofit Image Library (DIL) - For melanoma detection, the method delivered significant performance improvements over traditional methods and baseline CNNs.
- Moorea Labelled Corals (MLC) - Enhanced performance in both within-year and cross-year experiments, highlighting the robustness of the cost-sensitive approach in handling real-world imbalanced data.
- Caltech-101 and MIT-67 - Extended experiments on original imbalanced data distributions and deliberately imbalanced splits showed the approach's efficacy in different scenarios.
- MNIST and CIFAR-100 - On these originally balanced datasets with artificially induced imbalance, the proposed method outperformed baseline CNNs while remaining competitive on the standard splits.
Performance Metrics
Performance was quantified using overall classification accuracy together with F-measure and G-mean, metrics that remain informative when accuracy alone is misleading under imbalance (both are computed in the sketch below). The proposed method consistently outperformed traditional and state-of-the-art class-imbalance handling techniques, including over-sampling, under-sampling, hybrid sampling, and cost-sensitive versions of SVM and Random Forest classifiers; the margin was largest in cases of extreme imbalance, underscoring the adaptive nature of cost-sensitive learning in deep networks.
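Both imbalance-aware metrics are standard; for the multi-class case the G-mean is commonly taken as the geometric mean of per-class recalls, which is the convention used in this sketch (the small smoothing guards are mine):

```python
import numpy as np

def f_measure_and_g_mean(conf):
    """Macro F-measure and G-mean from a (K, K) confusion matrix.

    conf[i, j] = number of class-i samples predicted as class j.
    """
    tp = np.diag(conf).astype(float)
    recall = tp / np.maximum(conf.sum(axis=1), 1)      # per-class sensitivity
    precision = tp / np.maximum(conf.sum(axis=0), 1)
    f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
    g_mean = recall.prod() ** (1.0 / len(recall))      # geometric mean of recalls
    return f1.mean(), g_mean

conf = np.array([[95,  5, 0],
                 [10, 38, 2],
                 [ 1,  1, 3]])
print(f_measure_and_g_mean(conf))
```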
Practical and Theoretical Implications
The practical implications of this research are broad: it provides a robust framework for improving classification accuracy in real-world applications where class imbalance is prevalent, such as medical diagnosis and rare-event detection. The theoretical contributions include a formal analysis showing that the modified loss functions retain properties essential for classification, namely classification calibration and guess aversion (paraphrased below).
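Paraphrasing those two properties informally (notation mine, not the paper's exact statement): classification calibration means that driving the loss down drives the predicted top class toward the true class, and guess aversion means a correct prediction is always cheaper than an uninformative one:

```latex
% Guess aversion (informal paraphrase): for any target d,
% any output y that classifies d correctly, and any "guess"
% output g whose components are all equal,
L(d, y) < L(d, g)
```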
Conclusion
The paper provides a comprehensive methodology for addressing class imbalance in deep learning models through cost-sensitive learning mechanisms. The significant gains achieved in classification performance across various datasets underscore the potential of adaptive cost-sensitive learning in enhancing the robustness and fairness of image classification models. Future research could explore integrating this approach with other augmentation techniques and extending its application to other domains, such as natural language processing and time-series analysis, where class imbalance is a persistent challenge.