Sparsity-aware generalization theory for deep neural networks (2307.00426v2)
Abstract: Deep artificial neural networks achieve surprising generalization abilities that remain poorly understood. In this paper, we present a new approach to analyzing generalization for deep feed-forward ReLU networks that exploits the degree of sparsity achieved in the hidden-layer activations. By developing a framework that accounts for this reduced effective model size for each input sample, we are able to show fundamental trade-offs between sparsity and generalization. Importantly, our results make no strong assumptions about the degree of sparsity achieved by the model, and they improve over recent norm-based approaches. We illustrate our results numerically, demonstrating non-vacuous bounds when coupled with data-dependent priors in specific settings, even in over-parametrized models.
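The quantity the abstract builds on, per-sample sparsity of hidden-layer ReLU activations, can be measured directly. Below is a minimal NumPy sketch (not the paper's method or experimental setup; the layer widths and random weights are purely illustrative) that computes, for each input sample, the fraction of hidden units that are active, i.e. the sample's reduced effective model size.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

rng = np.random.default_rng(0)

# Hypothetical feed-forward ReLU network; widths and weights are illustrative only.
widths = [64, 256, 256, 10]
weights = [rng.standard_normal((m, n)) / np.sqrt(m)
           for m, n in zip(widths[:-1], widths[1:])]

def active_fraction(x):
    """Return, for each input sample, the fraction of hidden ReLU units
    with nonzero activation (a proxy for per-sample effective model size)."""
    active = np.zeros(len(x))
    total = 0
    h = x
    for w in weights[:-1]:              # hidden layers only; last layer is linear
        h = relu(h @ w)
        active += (h > 0).sum(axis=1)   # count active units in this layer
        total += w.shape[1]
    return active / total

x = rng.standard_normal((8, widths[0]))  # a small batch of inputs
print(active_fraction(x))                # one sparsity level per sample
```

Lower values of this fraction correspond to sparser activation patterns; the paper's bounds tighten as this per-sample effective size shrinks, without assuming any particular sparsity level a priori.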