TransformMix: Learning Transformation and Mixing Strategies from Data (2403.12429v1)
Abstract: Data augmentation improves the generalization power of deep learning models by synthesizing more training samples. Sample-mixing is a popular data augmentation approach that creates additional data by combining existing samples. Recent sample-mixing methods, like Mixup and Cutmix, adopt simple mixing operations to blend multiple inputs. Although such a heuristic approach shows certain performance gains in some computer vision tasks, it mixes the images blindly and does not adapt to different datasets automatically. A mixing strategy that is effective for a particular dataset often does not generalize well to other datasets. If not properly configured, the methods may create misleading mixed images, which jeopardize the effectiveness of sample-mixing augmentations. In this work, we propose an automated approach, TransformMix, to learn better transformation and mixing augmentation strategies from data. In particular, TransformMix applies learned transformations and mixing masks to create compelling mixed images that contain correct and important information for the target tasks. We demonstrate the effectiveness of TransformMix on multiple datasets in transfer learning, classification, object detection, and knowledge distillation settings. Experimental results show that our method achieves better performance and efficiency when compared with strong sample-mixing baselines.
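To make the baseline mixing operations the abstract refers to concrete, here is a minimal NumPy sketch of the two classic heuristics: Mixup (a global convex blend of two images and their labels) and CutMix (pasting a random rectangle from one image into another, with labels mixed by area). This is an illustrative re-implementation of those published methods, not code from the TransformMix paper; the function names and the `alpha` parameterization follow the common convention.

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=1.0, rng=None):
    """Mixup: blend two images and their one-hot labels with a Beta(alpha, alpha) weight."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    x = lam * x1 + (1.0 - lam) * x2          # pixel-wise convex combination
    y = lam * y1 + (1.0 - lam) * y2          # soft label with the same weight
    return x, y, lam

def cutmix(x1, y1, x2, y2, alpha=1.0, rng=None):
    """CutMix: paste a random rectangle from x2 into x1; mix labels by pasted area."""
    rng = rng or np.random.default_rng()
    h, w = x1.shape[:2]
    lam = rng.beta(alpha, alpha)
    # Rectangle whose area is roughly (1 - lam) of the image.
    cut_h, cut_w = int(h * np.sqrt(1.0 - lam)), int(w * np.sqrt(1.0 - lam))
    cy, cx = int(rng.integers(h)), int(rng.integers(w))
    t, b = np.clip(cy - cut_h // 2, 0, h), np.clip(cy + cut_h // 2, 0, h)
    l, r = np.clip(cx - cut_w // 2, 0, w), np.clip(cx + cut_w // 2, 0, w)
    x = x1.copy()
    x[t:b, l:r] = x2[t:b, l:r]
    # Adjust the label weight to the rectangle's actual (clipped) area.
    lam_adj = 1.0 - (b - t) * (r - l) / (h * w)
    y = lam_adj * y1 + (1.0 - lam_adj) * y2
    return x, y
```

Both heuristics apply the same fixed rule to every image pair, which is exactly the blindness the abstract criticizes: TransformMix instead learns the transformation and the mixing mask from data.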
- YOLOv4: Optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934, 2020.
- MODALS: Modality-agnostic automated data augmentation in the latent space. In 9th International Conference on Learning Representations, ICLR 2021, 2021.
- AdaAug: Learning class- and instance-adaptive data augmentation policies. In 10th International Conference on Learning Representations, ICLR 2022, 2022.
- A survey of automated data augmentation for image classification: Learning to compose, mix, and generate. IEEE Transactions on Neural Networks and Learning Systems, 2023. doi: 10.1109/TNNLS.2023.3282258.
- A downsampled variant of imagenet as an alternative to the CIFAR datasets. arXiv preprint arXiv:1707.08819, 2017.
- AutoAugment: Learning augmentation strategies from data. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, pp. 113–123. IEEE, 2019.
- RandAugment: Practical automated data augmentation with a reduced search space. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR Workshops 2020, pp. 3008–3017. IEEE, 2020.
- SuperMix: Supervising the mixing data augmentation. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2021, pp. 13794–13803. IEEE, 2021.
- ImageNet: A large-scale hierarchical image database. In 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), pp. 248–255. IEEE, 2009.
- Incorporating intra-class variance to fine-grained visual recognition. In 2017 IEEE International Conference on Multimedia and Expo, ICME 2017, pp. 1452–1457. IEEE, 2017.
- The PASCAL Visual Object Classes (VOC) challenge. International Journal of Computer Vision, 88(2):303–338, June 2010.
- Mixup as locally linear out-of-manifold regularization. In The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, The Thirty-First Innovative Applications of Artificial Intelligence Conference, IAAI 2019, The Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, pp. 3714–3722. AAAI Press, 2019.
- Identity mappings in deep residual networks. In Computer Vision - ECCV 2016 - 14th European Conference, volume 9908, pp. 630–645. Springer, 2016a.
- Deep residual learning for image recognition. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, pp. 770–778. IEEE, 2016b.
- AugMix: A simple data processing method to improve robustness and uncertainty. In 8th International Conference on Learning Representations, ICLR 2020, 2020.
- Population Based Augmentation: efficient learning of augmentation policy schedules. In Proceedings of the 36th International Conference on Machine Learning, ICML 2019, volume 97, pp. 2731–2741. PMLR, 2019.
- Spatial transformer networks. In Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, pp. 2017–2025, 2015.
- Puzzle Mix: Exploiting saliency and local statistics for optimal mixup. In Proceedings of the 37th International Conference on Machine Learning, ICML 2020, volume 119 of Proceedings of Machine Learning Research, pp. 5275–5285. PMLR, 2020.
- Co-Mixup: Saliency guided joint mixup with supermodular diversity. In 9th International Conference on Learning Representations, ICLR 2021, 2021.
- Collecting a large-scale dataset of fine-grained cars. In Second Workshop on Fine-Grained Visual Categorization, 2013.
- A. Krizhevsky and G. Hinton. Learning multiple layers of features from tiny images. Technical report, University of Toronto, 2009.
- ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012, pp. 1106–1114, 2012.
- Fast AutoAugment. In Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, pp. 6662–6672, 2019.
- AutoMix: Unveiling the power of mixup. arXiv preprint arXiv:2103.13027, 2021.
- Fine-grained visual classification of aircraft. arXiv preprint arXiv:1306.5151, 2013.
- Automated flower classification over a large number of classes. In Sixth Indian Conference on Computer Vision, Graphics & Image Processing, ICVGIP 2008, pp. 722–729. IEEE, 2008.
- ResizeMix: Mixing data with preserved object information and true labels. arXiv preprint arXiv:2012.11101, 2020.
- Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, pp. 91–99, 2015.
- A survey on image data augmentation for deep learning. Journal of Big Data, 6:60, 2019.
- SaliencyMix: A saliency guided data augmentation strategy for better regularization. In 9th International Conference on Learning Representations, ICLR 2021, 2021.
- Manifold Mixup: Better representations by interpolating hidden states. In Proceedings of the 36th International Conference on Machine Learning, ICML 2019, volume 97 of Proceedings of Machine Learning Research, pp. 6438–6447. PMLR, 2019.
- CutMix: Regularization strategy to train strong classifiers with localizable features. In 2019 IEEE International Conference on Computer Vision, ICCV 2019, pp. 6022–6031. IEEE, 2019.
- Wide residual networks. In Proceedings of the British Machine Vision Conference 2016, BMVC 2016. BMVA Press, 2016.
- Mixup: Beyond empirical risk minimization. In 6th International Conference on Learning Representations, ICLR 2018, 2018.
- Learning deep features for discriminative localization. In 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, pp. 2921–2929. IEEE, 2016.
- Tsz-Him Cheung
- Dit-Yan Yeung