IPMix: Label-Preserving Data Augmentation Method for Training Robust Classifiers (2310.04780v7)

Published 7 Oct 2023 in cs.CV

Abstract: Data augmentation has proven effective for training high-accuracy convolutional neural network classifiers by preventing overfitting. However, building deep neural networks for real-world scenarios requires not only high accuracy on clean data but also robustness when data distributions shift. While prior work has suggested a trade-off between accuracy and robustness, we propose IPMix, a simple data augmentation approach that improves robustness without hurting clean accuracy. IPMix integrates three levels of data augmentation (image-level, patch-level, and pixel-level) into a coherent, label-preserving technique to increase the diversity of training data with limited computational overhead. To further improve robustness, IPMix introduces structural complexity at different levels to generate more diverse images and adopts random mixing for multi-scale information fusion. Experiments demonstrate that IPMix outperforms state-of-the-art methods in corruption robustness on CIFAR-C and ImageNet-C. In addition, we show that IPMix significantly improves other safety measures, including robustness to adversarial perturbations, calibration, prediction consistency, and anomaly detection, achieving state-of-the-art or comparable results on several benchmarks, including ImageNet-R, ImageNet-A, and ImageNet-O.
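
The abstract describes mixing at three granularities (image, patch, pixel) while keeping the training label fixed. Below is a minimal, hypothetical NumPy sketch of one such mixing round, written from the abstract alone rather than from the authors' released code; the function name `ipmix_like`, the `mixer` argument (a structurally complex synthetic image, e.g. a fractal pattern), and the `severity` parameter are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def ipmix_like(image, mixer, severity=3, rng=None):
    """Label-preserving mix of `image` with a structurally complex `mixer`
    image of the same (H, W, C) float shape in [0, 1].

    Hypothetical sketch: the real IPMix pipeline chains several mixing
    rounds; here we apply a single round at one randomly chosen level.
    """
    rng = rng or np.random.default_rng()
    level = rng.choice(["image", "patch", "pixel"])
    lam = rng.beta(severity, severity)  # mixing weight in (0, 1)

    if level == "image":
        # Image-level: convex blend of the two whole images.
        mixed = lam * image + (1.0 - lam) * mixer
    elif level == "patch":
        # Patch-level: paste a random rectangle from the mixer image.
        h, w = image.shape[:2]
        ph = rng.integers(h // 8, h // 2)
        pw = rng.integers(w // 8, w // 2)
        y, x = rng.integers(0, h - ph), rng.integers(0, w - pw)
        mixed = image.copy()
        mixed[y:y + ph, x:x + pw] = mixer[y:y + ph, x:x + pw]
    else:
        # Pixel-level: replace a random subset of pixels with mixer pixels.
        mask = rng.random(image.shape[:2])[..., None] < (1.0 - lam)
        mixed = np.where(mask, mixer, image)

    # Label preservation: the training label of `image` is kept unchanged,
    # unlike Mixup/CutMix, which interpolate labels between two examples.
    return np.clip(mixed, 0.0, 1.0)
```

Because the mixing source is a synthetic pattern rather than another labeled training image, the original label stays valid; this is the sense in which the method is label-preserving, in contrast to Mixup/CutMix-style label interpolation.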

Authors (7)
  1. Zhenglin Huang (6 papers)
  2. Xiaoan Bao (1 paper)
  3. Na Zhang (55 papers)
  4. Qingqi Zhang (2 papers)
  5. Xiaomei Tu (1 paper)
  6. Biao Wu (101 papers)
  7. Xi Yang (160 papers)
Citations (6)
