
Are Neurons Actually Collapsed? On the Fine-Grained Structure in Neural Representations (2306.17105v1)

Published 29 Jun 2023 in cs.LG

Abstract: Recent work has observed an intriguing "Neural Collapse" phenomenon in well-trained neural networks, where the last-layer representations of training samples with the same label collapse into each other. This appears to suggest that the last-layer representations are completely determined by the labels, and do not depend on the intrinsic structure of the input distribution. We provide evidence that this is not a complete description, and that the apparent collapse hides important fine-grained structure in the representations. Specifically, even when representations apparently collapse, the small amount of remaining variation can still faithfully and accurately capture the intrinsic structure of the input distribution. As an example, if we train on CIFAR-10 using only 5 coarse-grained labels (by combining two classes into one super-class) until convergence, we can reconstruct the original 10-class labels from the learned representations via unsupervised clustering. The reconstructed labels achieve $93\%$ accuracy on the CIFAR-10 test set, nearly matching the normal CIFAR-10 accuracy for the same architecture. We also provide an initial theoretical result showing the fine-grained representation structure in a simplified synthetic setting. Our results show concretely how the structure of the input data can play a significant role in determining the fine-grained structure of neural representations, going beyond what Neural Collapse predicts.
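The coarse-label experiment described in the abstract can be sketched in a few dozen lines. The snippet below is a hedged illustration, not the authors' code: the ResNet-18 backbone, the pairing of fine classes into super-classes, the training schedule, and the per-super-class k-means step are assumptions made only for concreteness.

```python
# Sketch: train on CIFAR-10 with 5 coarse super-class labels, then recover the
# 10 fine classes by clustering penultimate-layer representations.
# Model, schedule, class pairing, and clustering details are illustrative assumptions.
import numpy as np
import torch
import torch.nn as nn
import torchvision
from torchvision import transforms
from sklearn.cluster import KMeans

device = "cuda" if torch.cuda.is_available() else "cpu"
tfm = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
])
train_set = torchvision.datasets.CIFAR10("data", train=True, download=True, transform=tfm)

# Coarse labels: merge fine classes {0,1}, {2,3}, ... into 5 super-classes (pairing is arbitrary).
model = torchvision.models.resnet18(num_classes=5).to(device)
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
loss_fn = nn.CrossEntropyLoss()
loader = torch.utils.data.DataLoader(train_set, batch_size=256, shuffle=True, num_workers=2)

for epoch in range(200):  # train well past zero training error (the "terminal phase")
    for x, y in loader:
        x, y_coarse = x.to(device), (y // 2).to(device)
        opt.zero_grad()
        loss_fn(model(x), y_coarse).backward()
        opt.step()

# Penultimate-layer features: everything except the final linear classifier.
backbone = nn.Sequential(*list(model.children())[:-1]).eval()

@torch.no_grad()
def get_features(dataset):
    feats, fine = [], []
    for x, y in torch.utils.data.DataLoader(dataset, batch_size=512):
        feats.append(backbone(x.to(device)).flatten(1).cpu().numpy())
        fine.append(y.numpy())
    return np.concatenate(feats), np.concatenate(fine)

feats, y_fine = get_features(train_set)

# Within each super-class, split the (apparently collapsed) features into 2 clusters,
# then map each cluster to its majority fine label purely for evaluation.
recovered = np.zeros_like(y_fine)
for c in range(5):
    idx = np.where(y_fine // 2 == c)[0]
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(feats[idx])
    for k in range(2):
        members = idx[labels == k]
        recovered[members] = np.bincount(y_fine[members]).argmax()

print("fine-label recovery accuracy:", (recovered == y_fine).mean())
```

Note that the paper reports roughly 93% accuracy on the CIFAR-10 test set for the reconstructed labels; the sketch above only measures recovery on the training features as a simpler diagnostic.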
