Hyperspherical Classification with Dynamic Label-to-Prototype Assignment (2403.16937v1)
Abstract: Aiming to enhance the utilization of metric space by the parametric softmax classifier, recent studies suggest replacing it with a non-parametric alternative. Although a non-parametric classifier may provide better metric space utilization, it introduces the challenge of capturing inter-class relationships. A shared characteristic among prior non-parametric classifiers is the static assignment of labels to prototypes during training, i.e., each prototype consistently represents the same class throughout training. Orthogonal to previous works, we present a simple yet effective method to optimize the category assigned to each prototype (label-to-prototype assignment) during training. To this end, we formalize the problem as a two-step optimization objective over the network parameters and the label-to-prototype assignment mapping. We solve this optimization using a sequential combination of gradient descent and bipartite matching. We demonstrate the benefits of the proposed approach by conducting experiments on balanced and long-tail classification problems using different backbone network architectures. In particular, our method outperforms its competitors by 1.22% accuracy on CIFAR-100 and by 2.15% on ImageNet-200, using a metric space dimension half the size of its competitors'. Code: https://github.com/msed-Ebrahimi/DL2PA_CVPR24
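The matching half of the two-step procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which cost is matched, so the sketch assumes the assignment maximizes the total cosine similarity between per-class mean features and a set of fixed prototypes. The function name `assign_labels_to_prototypes`, the toy `class_means`, and the three prototypes on the unit circle are all hypothetical. The gradient-descent step that would update the network between matchings is omitted, and the brute-force search over permutations stands in for a Hungarian solver such as `scipy.optimize.linear_sum_assignment`, which would be the practical choice for more than a handful of classes.

```python
from itertools import permutations
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as sequences of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def assign_labels_to_prototypes(class_means, prototypes):
    """Return a tuple `a` where a[c] is the prototype index assigned to class c.

    Assumption: the best one-to-one assignment is the one that maximizes the
    summed cosine similarity. Brute force over all permutations is only
    feasible for tiny problems; it is used here to keep the sketch
    dependency-free.
    """
    n = len(class_means)
    sim = [[cosine(m, p) for p in prototypes] for m in class_means]
    return max(
        permutations(range(n)),
        key=lambda perm: sum(sim[c][perm[c]] for c in range(n)),
    )


# Three fixed prototypes, equally spaced on the unit circle (hypothetical).
prototypes = [
    (math.cos(2 * math.pi * k / 3), math.sin(2 * math.pi * k / 3))
    for k in range(3)
]

# Hypothetical per-class mean features produced by the current network.
class_means = [(-0.4, 0.9), (-0.5, -0.8), (1.0, 0.1)]

assignment = assign_labels_to_prototypes(class_means, prototypes)
print(assignment)  # (1, 2, 0): class 0 -> prototype 1, class 1 -> 2, class 2 -> 0
```

In a training loop following the abstract's description, a matching of this kind would alternate with gradient updates of the network, so that the class represented by each prototype can change as the features evolve.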
- Mohammad Saeed Ebrahimi Saadabadi
- Ali Dabouei
- Sahar Rahimi Malakshan
- Nasser M. Nasrabadi