Catch-Up Mix: Catch-Up Class for Struggling Filters in CNN (2401.13193v1)
Abstract: Deep learning has made significant advances in computer vision, particularly in image classification. Despite their high accuracy on training data, deep learning models often suffer from complexity and overfitting. One notable concern is that such models often rely heavily on a limited subset of filters for making predictions. This dependency can compromise generalization and increase vulnerability to minor input variations. While regularization techniques such as weight decay, dropout, and data augmentation are commonly used to address this issue, they do not directly target the reliance on specific filters. Our observations reveal that this heavy-reliance problem becomes severe when slow-learning filters are deprived of learning opportunities by fast-learning filters. Drawing inspiration from image-augmentation research that combats over-reliance on specific image regions by removing and replacing parts of images, we propose to mitigate over-reliance on strong filters by substituting highly activated features. To this end, we present Catch-up Mix, a novel method that provides learning opportunities to a wide range of filters during training, focusing on filters that may lag behind. By mixing activation maps with relatively lower norms, Catch-up Mix promotes more diverse representations and reduces reliance on a small subset of filters. Experimental results demonstrate the superiority of our method on various vision classification datasets and show enhanced robustness.
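To make the core idea concrete, below is a minimal PyTorch sketch of one way the feature substitution described in the abstract could be implemented at an intermediate layer. The function name `catchup_mix`, the top-k channel-norm selection rule, and the proportional label mixing are illustrative assumptions inferred from the abstract, not the authors' reference implementation.

```python
import torch

def catchup_mix(feat_a, feat_b, target_a, target_b, mix_ratio=0.5):
    """Replace the most highly activated channels of `feat_a` with the
    corresponding channels of `feat_b`, so that the lower-norm (lagging)
    filters of sample A keep receiving learning signal.

    feat_a, feat_b: activation maps of shape (B, C, H, W) from the same layer.
    target_a, target_b: one-hot (or soft) label tensors of shape (B, num_classes).
    mix_ratio: fraction of channels substituted from `feat_b` (assumed knob).
    """
    b, c, h, w = feat_a.shape
    n_swap = int(round(mix_ratio * c))

    # Channel-wise L2 norms of sample A's activations: shape (B, C).
    norms = feat_a.flatten(2).norm(dim=2)

    # Indices of the n_swap highest-norm ("strong") channels per sample.
    strong_idx = norms.topk(n_swap, dim=1).indices

    # Boolean mask marking the channels to substitute with feat_b.
    swap_mask = torch.zeros(b, c, dtype=torch.bool, device=feat_a.device)
    swap_mask.scatter_(1, strong_idx, True)
    swap_mask = swap_mask[:, :, None, None]  # broadcast over H, W

    mixed_feat = torch.where(swap_mask, feat_b, feat_a)

    # Mix the labels in proportion to the substituted channels (assumption,
    # analogous to the label handling in Mixup-style methods).
    lam = n_swap / c
    mixed_target = (1.0 - lam) * target_a + lam * target_b
    return mixed_feat, mixed_target
```

In use, such a function would presumably be called on the hidden features of a shuffled pair of samples at a layer chosen during the forward pass, in the spirit of Manifold Mixup-style training; the exact layer selection and ratio schedule are design choices not specified in the abstract.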