Extend the BEoS generalization framework beyond two-layer ReLU networks
Extend the theoretical framework that analyzes gradient-descent solutions below the Edge of Stability via data-dependent weighted path/variation norms, currently developed for two-layer fully-connected ReLU networks, to deeper networks or to architectures with specific inductive biases such as convolutional neural networks.
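For concreteness, the sketch below shows the standard (unweighted) path norm that the paper's data-dependent weighted norm generalizes: for a two-layer ReLU network f(x) = Σ_j a_j σ(w_jᵀx + b_j), the path norm is Σ_j |a_j|·‖w_j‖₂. The `neuron_weights` argument is a hypothetical stand-in for the paper's data-geometry-dependent weighting, which is not reproduced here; function and parameter names are illustrative, not from the source.

```python
import numpy as np

def two_layer_relu(x, W, b, a):
    """Two-layer fully-connected ReLU network: f(x) = sum_j a_j * relu(w_j . x + b_j)."""
    return np.maximum(W @ x + b, 0.0) @ a

def weighted_path_norm(W, a, neuron_weights=None):
    """Path norm sum_j g_j * |a_j| * ||w_j||_2 for a two-layer ReLU network.

    With neuron_weights=None this reduces to the standard path norm;
    `neuron_weights` is a hypothetical per-neuron factor g_j standing in
    for the paper's data-dependent weighting, which is not reproduced here.
    """
    g = np.ones(a.shape) if neuron_weights is None else neuron_weights
    return float(np.sum(g * np.abs(a) * np.linalg.norm(W, axis=1)))

# Example: a random width-8 network on 3-dimensional inputs.
rng = np.random.default_rng(0)
W, b, a = rng.normal(size=(8, 3)), rng.normal(size=8), rng.normal(size=8)
x = rng.normal(size=3)
print(two_layer_relu(x, W, b, a), weighted_path_norm(W, a))
```

The open problem is to find the analogue of such a norm (and its data-dependent weighting) for deeper or convolutional architectures, where paths traverse multiple layers and weight sharing changes the path structure.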
References
Our theoretical results are derived for two-layer fully-connected ReLU networks, a cornerstone for theoretical analysis. Extending this framework to deeper networks or architectures with specific inductive biases (e.g., CNNs) is a significant undertaking that we leave for future work.
— Generalization Below the Edge of Stability: The Role of Data Geometry
(arXiv:2510.18120 – Liang et al., 20 Oct 2025) in Introduction – Scope of Analysis