Hyperspherical Classification with Dynamic Label-to-Prototype Assignment (2403.16937v1)

Published 25 Mar 2024 in cs.CV

Abstract: Aiming to enhance the metric-space utilization of the parametric softmax classifier, recent studies suggest replacing it with a non-parametric alternative. Although a non-parametric classifier may provide better metric-space utilization, it introduces the challenge of capturing inter-class relationships. A shared characteristic among prior non-parametric classifiers is the static assignment of labels to prototypes during training, i.e., each prototype consistently represents the same class throughout the training course. Orthogonal to previous works, we present a simple yet effective method to optimize the category assigned to each prototype (the label-to-prototype assignment) during training. To this aim, we formalize the problem as a two-step optimization objective over the network parameters and the label-to-prototype assignment mapping, and solve it with a sequential combination of gradient descent and bipartite matching. We demonstrate the benefits of the proposed approach through experiments on balanced and long-tail classification problems using different backbone network architectures. In particular, our method outperforms its competitors by 1.22% accuracy on CIFAR-100 and by 2.15% on ImageNet-200, while using a metric space of half the dimension. Code: https://github.com/msed-Ebrahimi/DL2PA_CVPR24
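The two-step scheme the abstract describes (gradient descent on network weights, alternated with re-solving the label-to-prototype matching) can be illustrated with a small sketch. The sketch below is not the authors' implementation; the matching cost (cosine similarity between per-class mean features and fixed unit-norm prototypes), the `reassign_labels` helper, and the toy sizes are illustrative assumptions. It uses SciPy's Hungarian solver (`linear_sum_assignment`) for the bipartite step; see the linked repository for the actual method.

```python
# Minimal sketch of dynamic label-to-prototype assignment, NOT the paper's
# exact method: the cost function and update schedule are assumptions made
# for illustration.
import numpy as np
from scipy.optimize import linear_sum_assignment


def reassign_labels(class_means: np.ndarray, prototypes: np.ndarray) -> np.ndarray:
    """Bipartite matching between C classes and C prototypes.

    class_means: (C, d) L2-normalized mean feature vector per class.
    prototypes:  (C, d) fixed points on the unit hypersphere.
    Returns pi, where pi[c] is the prototype index assigned to class c.
    """
    sim = class_means @ prototypes.T                    # (C, C) cosine similarities
    _, pi = linear_sum_assignment(sim, maximize=True)   # Hungarian method
    return pi


# Toy demo: 4 classes in a 2-D metric space.
rng = np.random.default_rng(0)
prototypes = rng.normal(size=(4, 2))
prototypes /= np.linalg.norm(prototypes, axis=1, keepdims=True)
class_means = rng.normal(size=(4, 2))
class_means /= np.linalg.norm(class_means, axis=1, keepdims=True)

print(reassign_labels(class_means, prototypes))         # a permutation of 0..3
```

In training, one would alternate between (1) some number of gradient-descent steps that pull features toward their currently assigned prototypes and (2) re-solving this matching, which is the sequential combination of gradient descent and bipartite matching the abstract describes.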
