FSL-Rectifier: Rectify Outliers in Few-Shot Learning via Test-Time Augmentation (2402.18292v6)
Abstract: Few-shot learning (FSL) commonly requires a model to identify images (queries) that belong to classes unseen during training, using a few labelled samples of the new classes (the support set) as reference. Many existing algorithms augment the training data to improve the generalization of FSL models, but outlier query or support images at inference time can still pose serious generalization challenges. In this work, to reduce the bias caused by outlier samples, we generate additional test-class samples by combining original samples with suitable train-class samples via a generative image combiner. We then obtain averaged features via an augmentor; the averaging yields more typical representations. We demonstrate the effectiveness of our method both experimentally and theoretically, obtaining a relative test-accuracy improvement of around 10% (e.g., from 46.86% to 53.28%) for trained FSL models. Importantly, given a pretrained image combiner, our method is training-free for off-the-shelf FSL models, whose performance can be improved without extra datasets or further training of the models themselves. Code is available at https://github.com/WendyBaiYunwei/FSL-Rectifier-Pub.
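The two steps described in the abstract (combine a test sample with train-class samples, then average the resulting features) can be illustrated with a minimal, pure-Python sketch. Everything named here is an assumption made for illustration and is not taken from the paper or its repository: `toy_combine` and `toy_embed` are hypothetical stand-ins for the generative image combiner and the FSL feature extractor, each sample is modelled as a `(class_content, nuisance)` pair of vectors, and classification uses a nearest-prototype rule.

```python
from statistics import fmean
from typing import Callable, Dict, List, Sequence, Tuple

Vector = List[float]
Sample = Tuple[Vector, Vector]  # toy model: (class_content, nuisance such as pose or style)


def mean_vector(vectors: Sequence[Vector]) -> Vector:
    """Element-wise mean of equally sized feature vectors."""
    return [fmean(column) for column in zip(*vectors)]


def rectified_feature(
    sample: Sample,
    partners: Sequence[Sample],
    combine: Callable[[Sample, Sample], Sample],
    embed: Callable[[Sample], Vector],
) -> Vector:
    """Average the feature of `sample` with the features of its combined variants."""
    variants = [sample] + [combine(sample, partner) for partner in partners]
    return mean_vector([embed(variant) for variant in variants])


def squared_distance(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(feature: Vector, prototypes: Dict[str, Vector]) -> str:
    """Return the label of the nearest class prototype."""
    return min(prototypes, key=lambda label: squared_distance(feature, prototypes[label]))


# Hypothetical stand-in for the generative combiner (assumption): keep the
# test sample's class content and borrow the nuisance of the train sample.
def toy_combine(sample: Sample, partner: Sample) -> Sample:
    return (sample[0], partner[1])


# Hypothetical stand-in for the feature extractor (assumption): the feature
# is the class content contaminated by the nuisance.
def toy_embed(sample: Sample) -> Vector:
    return [c + n for c, n in zip(sample[0], sample[1])]


prototypes = {"cat": [0.0, 0.0], "dog": [4.0, 0.0]}
outlier_query: Sample = ([0.0, 0.0], [2.4, 0.0])  # "cat" content, extreme nuisance
train_partners: List[Sample] = [
    ([9.0, 9.0], [0.3, 0.0]),    # train-class samples with typical nuisance
    ([7.0, -5.0], [-0.3, 0.0]),
]

plain = toy_embed(outlier_query)
rectified = rectified_feature(outlier_query, train_partners, toy_combine, toy_embed)

print("plain     ->", classify(plain, prototypes))
print("rectified ->", classify(rectified, prototypes))
```

In this toy setting the plain feature of the outlier query lies closer to the wrong prototype, while the averaged feature moves back toward the correct one. A real pipeline would replace the two stand-ins with a pretrained image combiner and a trained FSL backbone; how partner samples are selected is not modelled here.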
Authors: Yunwei Bai, Ying Kiat Tan, Tsuhan Chen, Shiming Chen, Yao Shu