Emergence of Latent Binary Encoding in Deep Neural Network Classifiers (2310.08224v4)

Published 12 Oct 2023 in cs.LG

Abstract: We investigate the emergence of binary encoding within the latent space of deep-neural-network classifiers. This binary encoding is induced by introducing a linear penultimate layer that is trained with a loss function specifically designed to compress the latent representations. As a result of a trade-off between compression and information retention, the network learns representations in which each latent dimension assumes only one of two possible values. The binary encoding arises from the collapse of all representations of the same class onto a single point, which corresponds to a vertex of a hypercube. By analyzing several datasets of increasing complexity, we provide empirical evidence that the emergence of binary encoding dramatically enhances robustness while also significantly improving the reliability and generalization of the network.
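The abstract does not specify the exact compression loss the authors use, so the following is a minimal sketch of the described setup rather than the paper's implementation: a classifier with a linear penultimate layer, trained with cross-entropy (information retention) plus a penalty that compresses the latent code. The L1 magnitude penalty, the layer sizes, and the `compression_weight` hyperparameter below are all illustrative assumptions.

```python
# Minimal sketch (assumed, not the authors' exact method): a classifier
# whose linear penultimate layer is regularized toward compressed latent
# codes, the setup under which the paper reports binary encoding emerging.
import torch
import torch.nn as nn


class BinaryLatentClassifier(nn.Module):
    def __init__(self, in_dim: int, latent_dim: int, n_classes: int):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
        )
        # Linear penultimate layer: no nonlinearity, so the compression
        # pressure acts directly on the latent coordinates.
        self.penultimate = nn.Linear(256, latent_dim)
        self.head = nn.Linear(latent_dim, n_classes)

    def forward(self, x):
        z = self.penultimate(self.backbone(x))  # latent representation
        return self.head(z), z


def loss_fn(logits, z, targets, compression_weight: float = 0.1):
    # Trade-off from the abstract: cross-entropy retains class
    # information while the penalty compresses the latent code. The L1
    # term is a hypothetical stand-in for the paper's compression loss.
    ce = nn.functional.cross_entropy(logits, targets)
    compression = z.abs().mean()
    return ce + compression_weight * compression


# Hypothetical usage on random data:
model = BinaryLatentClassifier(in_dim=784, latent_dim=32, n_classes=10)
x, y = torch.randn(64, 784), torch.randint(0, 10, (64,))
logits, z = model(x)
loss = loss_fn(logits, z, y)
loss.backward()
```

Under the paper's account, training with such a trade-off drives same-class representations to collapse onto a single point, and each latent dimension to settle on one of two values, i.e. the class codes land on vertices of a hypercube.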

