Unleash the Power of Local Representations for Few-Shot Classification (2407.01967v1)
Abstract: Generalizing to novel classes unseen during training is a key challenge of few-shot classification. Recent metric-based methods try to address this with local representations. However, they are unable to take full advantage of them due to (i) improper supervision for pretraining the feature extractor, and (ii) a lack of adaptability in the metric for handling various possible compositions of local feature sets. In this work, we unleash the power of local representations to improve novel-class generalization. For the feature extractor, we design a novel pretraining paradigm that learns from randomly cropped patches using soft labels. It exploits the class-level diversity of patches while diminishing the impact of their semantic misalignment with hard labels. To align the network output with soft labels, we also propose a UniCon KL-Divergence that emphasizes the equal contribution of each base class in describing "non-base" patches. For the metric, we formulate the comparison of local feature sets as an entropy-regularized optimal transport problem, introducing the ability to handle sets consisting of homogeneous elements. Furthermore, we design a Modulate Module to endow the metric with the necessary adaptability. Our method achieves new state-of-the-art performance on three popular benchmarks. Moreover, it exceeds state-of-the-art transductive and cross-modal methods in the fine-grained scenario.
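The entropy-regularized optimal transport the abstract refers to is conventionally solved with Sinkhorn iterations (Cuturi, 2013). The sketch below is a generic NumPy implementation of that solver between two uniformly weighted local-feature sets, not the paper's exact formulation; the cost matrix, regularization strength `eps`, and iteration count are illustrative assumptions.

```python
import numpy as np

def sinkhorn(cost, eps=0.1, n_iters=500):
    """Entropy-regularized OT between two uniformly weighted feature sets.

    cost    -- (n, m) pairwise cost matrix between local features
              (illustrative; the paper's actual cost is not specified here).
    Returns -- the transport plan and the regularized OT distance <plan, cost>.
    """
    n, m = cost.shape
    a = np.full(n, 1.0 / n)      # uniform mass on the first feature set
    b = np.full(m, 1.0 / m)      # uniform mass on the second feature set
    K = np.exp(-cost / eps)      # Gibbs kernel from the cost matrix
    u = np.ones(n)
    for _ in range(n_iters):     # alternate matching of row/column marginals
        v = b / (K.T @ u)
        u = a / (K @ v)
    plan = u[:, None] * K * v[None, :]
    return plan, float((plan * cost).sum())
```

By construction the returned plan's row sums match the uniform weights `a` exactly, and its column sums converge to `b`; smaller `eps` yields sharper, more one-to-one matchings at the price of slower convergence.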