
GuidedMixup: An Efficient Mixup Strategy Guided by Saliency Maps (2306.16612v1)

Published 29 Jun 2023 in cs.CV, cs.AI, and cs.LG

Abstract: Data augmentation is now an essential part of the image training process, as it effectively prevents overfitting and makes the model more robust against noisy datasets. Recent mixing augmentation strategies have advanced to generate mixup masks that enrich saliency information, which serves as a supervisory signal. However, these methods incur a significant computational burden to optimize the mixup mask. Motivated by this, we propose a novel saliency-aware mixup method, GuidedMixup, which aims to retain the salient regions in mixup images with low computational overhead. We develop an efficient pairing algorithm that seeks to minimize the conflict between the salient regions of paired images and thereby achieve rich saliency in mixup images. Moreover, GuidedMixup controls the mixup ratio for each pixel to better preserve the salient regions by interpolating the two paired images smoothly. Experiments on several datasets demonstrate that GuidedMixup provides a good trade-off between augmentation overhead and generalization performance on classification datasets. In addition, our method performs well in experiments with corrupted or reduced datasets.
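
The abstract sketches two components: a pairing step that matches images whose salient regions conflict least, and a pixel-wise mixing ratio derived from the two saliency maps. Below is a minimal PyTorch sketch of that idea, not the authors' implementation: the gradient-based saliency, the greedy pairing heuristic, and the function names (`saliency_maps`, `greedy_pairing`, `guided_mixup`) are all assumptions for illustration, and the paper's exact saliency source and pairing objective may differ.

```python
import torch
import torch.nn.functional as F

def saliency_maps(model, images, labels):
    """Gradient-magnitude saliency (one common choice, e.g. Simonyan et al.);
    the paper's exact saliency source may differ."""
    images = images.clone().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    grads = torch.autograd.grad(loss, images)[0]
    sal = grads.abs().sum(dim=1)                    # (B, H, W)
    # Normalize each map to sum to 1 so maps are comparable across images.
    return sal / sal.flatten(1).sum(dim=1).view(-1, 1, 1).clamp_min(1e-8)

def greedy_pairing(sal):
    """Greedily pair images whose salient regions overlap least; a simple
    stand-in for the paper's pairing algorithm. Overlap is measured as the
    inner product of the normalized saliency maps."""
    flat = sal.flatten(1)
    overlap = flat @ flat.t()
    unused = set(range(sal.size(0)))
    partner = [-1] * sal.size(0)
    while unused:
        i = unused.pop()
        # Self-pair if i is the last image left (odd batch size).
        j = min(unused, key=lambda k: overlap[i, k].item(), default=i)
        unused.discard(j)
        partner[i], partner[j] = j, i
    return partner

def guided_mixup(images, labels_onehot, sal, partner):
    """Pixel-wise mixing: each pixel is drawn mostly from whichever of the
    two paired images is more salient there; labels are mixed with the mean
    per-pixel ratio."""
    idx = torch.as_tensor(partner, device=images.device)
    s1, s2 = sal, sal[idx]
    ratio = (s1 / (s1 + s2).clamp_min(1e-8)).unsqueeze(1)  # (B, 1, H, W)
    mixed = ratio * images + (1.0 - ratio) * images[idx]
    lam = ratio.mean(dim=(1, 2, 3)).view(-1, 1)            # (B, 1)
    return mixed, lam * labels_onehot + (1.0 - lam) * labels_onehot[idx]
```

A training step under these assumptions would compute `sal = saliency_maps(model, x, y)`, pair the batch with `greedy_pairing(sal)`, and fit the model on the mixed images and soft labels returned by `guided_mixup`.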

Authors (2)
  1. Minsoo Kang (12 papers)
  2. Suhyun Kim (16 papers)
Citations (15)
