
Remix: Rebalanced Mixup (2007.03943v3)

Published 8 Jul 2020 in cs.CV and cs.LG

Abstract: Deep image classifiers often perform poorly when training data are heavily class-imbalanced. In this work, we propose a new regularization technique, Remix, that relaxes Mixup's formulation and enables the mixing factors of features and labels to be disentangled. Specifically, when mixing two samples, while features are mixed in the same fashion as Mixup, Remix assigns the label in favor of the minority class by providing a disproportionately higher weight to the minority class. By doing so, the classifier learns to push the decision boundaries towards the majority classes and balance the generalization error between majority and minority classes. We have studied state-of-the-art regularization techniques such as Mixup, Manifold Mixup, and CutMix under the class-imbalanced regime, and shown that the proposed Remix significantly outperforms these methods, as well as several re-weighting and re-sampling techniques, on imbalanced datasets constructed from CIFAR-10, CIFAR-100, and CINIC-10. We have also evaluated Remix on a real-world large-scale imbalanced dataset, iNaturalist 2018. The experimental results confirm that Remix provides consistent and significant improvements over previous methods.

Authors (5)
  1. Hsin-Ping Chou (2 papers)
  2. Shih-Chieh Chang (10 papers)
  3. Jia-Yu Pan (9 papers)
  4. Wei Wei (425 papers)
  5. Da-Cheng Juan (38 papers)
Citations (214)

Summary

An Examination of "Remix: Rebalanced Mixup"

The paper "Remix: Rebalanced Mixup" introduces a novel approach to address the challenge of class imbalance in training data when using deep image classifiers. Class imbalance can skew the model's perception towards majority classes, leading to poor performance on minority class samples. The authors propose Remix, a regularization technique that modifies the traditional Mixup approach to accommodate the imbalance between classes by separately adjusting the mixing factors for features and labels. This separation allows the assignment of labels to favor the minority class, thereby potentially aligning the decision boundary towards the majority class and minimizing generalization error across diverse classes.

The authors conduct extensive studies comparing existing state-of-the-art methods, such as Mixup, Manifold Mixup, and CutMix, with their proposed approach under imbalanced conditions on CIFAR-10, CIFAR-100, CINIC-10, and iNaturalist 2018. The results demonstrate that Remix consistently surpasses these established techniques, as well as several re-weighting and re-sampling methods, reinforcing its effectiveness in handling imbalanced distributions.

Methodological Contributions

  1. Relaxation of the Mixing Factor: Remix deviates from traditional Mixup by applying different mixing factors to features and labels, relaxing the strict linear-interpolation constraint of classical Mixup. This allows the label assignment to favor the minority class during training.
  2. Performance Superior to the State of the Art: Experimental results on synthesized imbalanced datasets and the real-world iNaturalist 2018 dataset show that Remix delivers substantial accuracy improvements over both baselines and advanced imbalance-handling techniques.
  3. Integration Capability: Remix integrates seamlessly with existing re-weighting and re-sampling methods, enhancing their performance under class imbalance (see the sketch after this list).
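
As an illustration of that integration, here is a hedged sketch of training on a Remix-mixed batch with a per-class re-weighted cross-entropy loss. The particular combination shown (Remix soft labels plus effective-number class weights in the spirit of Cui et al., 2019) is one plausible instantiation, not the paper's exact training loop, and the helper names are hypothetical.

```python
import numpy as np

def reweighted_remix_loss(logits, y_mixed, class_weights):
    """Cross-entropy on Remix soft labels, re-weighted per class.

    logits        : (batch, num_classes) raw model outputs.
    y_mixed       : (batch, num_classes) soft labels from remix_pair above.
    class_weights : (num_classes,) weights, e.g. inverse class frequency.
    """
    shifted = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    per_sample = -(y_mixed * class_weights * log_probs).sum(axis=1)
    return per_sample.mean()

def effective_number_weights(class_counts, beta=0.9999):
    """Class weights w_c proportional to (1 - beta) / (1 - beta^n_c)."""
    counts = np.asarray(class_counts, dtype=float)
    weights = (1.0 - beta) / (1.0 - beta ** counts)
    return weights / weights.sum() * len(counts)           # normalize to mean 1
```

Because Remix only changes how the target labels are constructed, any loss that accepts soft targets, re-weighted or not, can consume its output unchanged, which is what makes this kind of composition straightforward.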

Implications and Future Directions

The development of Remix represents a significant advancement in training deep learning models under conditions of data imbalance. The method's ability to improve classification performance implies potential applications in various real-world scenarios where data imbalance is prevalent, such as medical diagnostics, species identification, or any domain with rare-event classification. The computational efficiency of Remix further renders it a practical choice for large-scale datasets.

Theoretically, the disentanglement of mixing factors provides a new dimension through which to analyze class representation in neural networks. This insight may influence subsequent methodological innovations in both the field of regularization techniques and more broadly in dataset augmentation strategies.

For future exploration, more rigorous theoretical analysis could illuminate the mechanisms underlying Remix's performance and point toward further optimization of the technique. Examining how Remix aligns with or diverges from other regularization methods may yield deeper insights into designing robust classifiers. Moreover, integrating Remix with neural architectures beyond image classifiers, such as those used in sequence modeling, might extend its contribution to other domains like natural language processing.

In conclusion, "Remix: Rebalanced Mixup" offers a compelling solution to a perennial problem in machine learning, with experimental results that validate its effectiveness and promise for future applications.
