
On the Robustness of Neural Collapse and the Neural Collapse of Robustness (2311.07444v2)

Published 13 Nov 2023 in cs.LG

Abstract: Neural Collapse refers to the curious phenomenon in the end of training of a neural network, where feature vectors and classification weights converge to a very simple geometrical arrangement (a simplex). While it has been observed empirically in various cases and has been theoretically motivated, its connection with crucial properties of neural networks, like their generalization and robustness, remains unclear. In this work, we study the stability properties of these simplices. We find that the simplex structure disappears under small adversarial attacks, and that perturbed examples "leap" between simplex vertices. We further analyze the geometry of networks that are optimized to be robust against adversarial perturbations of the input, and find that Neural Collapse is a pervasive phenomenon in these cases as well, with clean and perturbed representations forming aligned simplices, and giving rise to a robust simple nearest-neighbor classifier. By studying the propagation of the amount of collapse inside the network, we identify novel properties of both robust and non-robust machine learning models, and show that earlier, unlike later layers maintain reliable simplices on perturbed data. Our code is available at https://github.com/JingtongSu/robust_neural_collapse .

Citations (3)

Summary

  • The paper demonstrates that adversarial attacks disrupt the simplex structure inherent in neural collapse in standard networks, highlighting its fragility.
  • It shows that adversarially trained networks maintain aligned simplex geometries on both clean and perturbed data, enhancing robustness.
  • Furthermore, the study reveals that early network layers retain robust geometric properties, suggesting new avenues for designing resilient architectures.

Exploring the Stability of Neural Collapse Under Adversarial Attacks

Introduction to Neural Collapse

Neural Collapse (NC) is an empirical phenomenon observed in the late stages of training a deep neural network, characterized by a simplification in the geometry of the learned feature space. Specifically, the feature representations of data points from the same class and the corresponding class weights in the last layer of a network align to form a Simplex Equiangular Tight Frame (ETF), leading to a compact and highly structured geometric arrangement. This phenomenon intriguingly suggests a potential intrinsic property of neural networks, driving researchers to further investigate its implications for network performance, including generalization and robustness.
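
As a concrete illustration of the geometry involved, the snippet below is a minimal NumPy sketch (not taken from the paper's repository) that constructs the canonical K-class Simplex ETF and verifies its defining properties: equal-norm class vectors whose pairwise cosine similarity is -1/(K-1).

```python
# Minimal sketch: construct a canonical K-class Simplex ETF and verify that all
# vectors have equal norm and pairwise cosine similarity of -1/(K-1).
import numpy as np

def simplex_etf(K: int) -> np.ndarray:
    """Return K rows in R^K forming a (canonical, unrotated) Simplex ETF."""
    return np.sqrt(K / (K - 1)) * (np.eye(K) - np.ones((K, K)) / K)

K = 10
M = simplex_etf(K)
norms = np.linalg.norm(M, axis=1)
cosines = (M / norms[:, None]) @ (M / norms[:, None]).T
off_diag = cosines[~np.eye(K, dtype=bool)]
print("norms (all equal):", norms.round(4))
print("max deviation from -1/(K-1):", np.abs(off_diag + 1 / (K - 1)).max())
```

Under neural collapse, the class-mean features and the last-layer classifier weights both converge, up to rotation and scaling, to such a configuration.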

Robustness of Neural Collapse

This paper sets out to understand the robustness of the neural collapse phenomenon, particularly its resilience to adversarial attacks: perturbations crafted to deceive the network into making incorrect predictions. Through a series of experiments, the paper shows how adversarial perturbations disrupt the simplex structure characteristic of neural collapse, with perturbed representations "leaping" between simplex vertices. This disruption highlights the fragility of the simplex arrangement under adversarial conditions for standardly trained networks.
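
To make the "leaping" concrete, the sketch below shows a standard L-infinity PGD attack in PyTorch and how one might compare nearest class means before and after the perturbation. This is a hedged illustration, not the authors' code: `model` (image-to-logits network), `phi` (penultimate-layer feature extractor), and `mu` (per-class feature means) are assumed, hypothetical names.

```python
# Hedged sketch (not the authors' code): an L-infinity PGD attack. Comparing the
# nearest class mean of clean vs. perturbed features reveals the "leaping"
# between simplex vertices described above. `model` maps images to logits.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()   # ascend the loss
        x_adv = x + torch.clamp(x_adv - x, -eps, eps)  # project back into the eps-ball
        x_adv = torch.clamp(x_adv, 0.0, 1.0)           # keep a valid pixel range
    return x_adv.detach()

# Usage idea (hypothetical feature extractor `phi` and class means `mu` of shape (K, d)):
# h_clean, h_adv = phi(x), phi(pgd_attack(model, x, y))
# nearest_clean = ((h_clean[:, None] - mu[None]) ** 2).sum(-1).argmin(1)
# nearest_adv   = ((h_adv[:, None]  - mu[None]) ** 2).sum(-1).argmin(1)
```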

Neural Collapse in Adversarially Trained Networks

In contrast, the investigation reveals that adversarially trained networks, designed to be robust to such perturbations, also exhibit the neural collapse phenomenon. These networks form aligned simplices for both clean and perturbed data, giving rise to a simple, robust nearest-neighbor classifier. The findings suggest that adversarial training does not eliminate the phenomenon but rather adapts it, resulting in a more robust manifestation of neural collapse. This both extends the relevance of neural collapse to robust architectures and points to its adaptability under different training paradigms.
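
The "simple nearest-neighbor classifier" referred to here amounts to classifying each input by proximity to per-class feature means. The following sketch shows one way such a nearest-class-mean rule could be built; `features_fn` (penultimate-layer extractor), `loader`, and `num_classes` are assumed inputs and are not from the paper's repository.

```python
# Minimal sketch of a nearest-class-mean (NCM) classifier on penultimate-layer features.
# `features_fn`, `loader`, and `num_classes` are assumed, hypothetical inputs.
import torch

@torch.no_grad()
def compute_class_means(features_fn, loader, num_classes, device="cpu"):
    sums, counts = None, torch.zeros(num_classes, device=device)
    for x, y in loader:
        h = features_fn(x.to(device))                 # (B, d) features
        y = y.to(device)
        if sums is None:
            sums = torch.zeros(num_classes, h.shape[1], device=device)
        sums.index_add_(0, y, h)
        counts += torch.bincount(y, minlength=num_classes).float()
    return sums / counts.unsqueeze(1)                 # (K, d) per-class means

@torch.no_grad()
def ncm_predict(features_fn, x, means):
    h = features_fn(x)                                # (B, d)
    dists = ((h[:, None, :] - means[None, :, :]) ** 2).sum(-1)  # (B, K) squared distances
    return dists.argmin(dim=1)
```

When the clean and perturbed simplices are aligned, as the paper reports for adversarially trained networks, this rule remains accurate even on attacked inputs.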

Early Layer Stability and Robustness

A novel angle explored in this work is the role of earlier network layers in maintaining the neural collapse structure under adversarial conditions. The paper observes that, unlike the later layers, earlier layers retain a reliable simplex structure even for perturbed data. This insight opens new avenues for understanding how neural collapse propagates through the network and what that propagation implies for robustness. Earlier layers may harbor innately robust properties, a hypothesis that could shape future strategies for designing adversarially robust networks.
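
Tracking how collapse propagates across depth requires a per-layer measurement. One standard choice is the NC1 within-class variability metric, tr(Σ_W Σ_B^+); the sketch below is a generic implementation (not the paper's code) that computes it from features gathered at any layer, e.g. via forward hooks, on either clean or perturbed inputs.

```python
# Hedged sketch: the NC1 within-class variability metric tr(Sigma_W @ pinv(Sigma_B)),
# computable on features collected at any layer (clean or adversarially perturbed).
import torch

def nc1_metric(H: torch.Tensor, y: torch.Tensor) -> float:
    """H: (N, d) features from one layer; y: (N,) integer class labels."""
    d = H.shape[1]
    global_mean = H.mean(dim=0)
    Sigma_W = torch.zeros(d, d)
    Sigma_B = torch.zeros(d, d)
    classes = y.unique()
    for c in classes:
        Hc = H[y == c]
        mu_c = Hc.mean(dim=0)
        centered = Hc - mu_c
        Sigma_W += centered.T @ centered / H.shape[0]   # within-class scatter
        diff = (mu_c - global_mean).unsqueeze(1)
        Sigma_B += (diff @ diff.T) / len(classes)       # between-class scatter
    return torch.trace(Sigma_W @ torch.linalg.pinv(Sigma_B)).item()

# Plotting nc1_metric layer by layer, on clean vs. perturbed features, makes the
# early-vs-late layer contrast described above directly visible.
```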

Future Directions and Theoretical Implications

The findings from this extensive empirical investigation pivot towards a more nuanced understanding of neural collapse, primarily under adversarial settings. The observed phenomena underline the need for theoretical frameworks capable of explaining the persistence and adaptability of neural collapse across varying training regimes and its implications for network robustness. Future work could aim to bridge the gap between these empirical observations and theoretical models to fully elucidate the role of neural collapse in the broader context of deep learning.

Concluding Remarks

The paper elegantly demonstrates the nuanced behavior of neural collapse under adversarial attacks and robust training protocols. By illuminating the fragility of simplex structures in standardly trained networks and their resilience in adversarially trained counterparts, it prompts a reevaluation of neural collapse's role in network architecture design and training strategy formulation. Moreover, the unexpected robustness properties of early layers offer promising directions for further research into inherently robust network structures. In sum, this paper contributes significantly to our understanding of the geometry of deep learning, with potential long-term implications for creating more robust and generalizable neural network models.
