Sparsity-aware generalization theory for deep neural networks (2307.00426v2)

Published 1 Jul 2023 in cs.LG and cs.AI

Abstract: Deep artificial neural networks achieve surprising generalization abilities that remain poorly understood. In this paper, we present a new approach to analyzing generalization for deep feed-forward ReLU networks that takes advantage of the degree of sparsity achieved in the hidden-layer activations. By developing a framework that accounts for the reduced effective model size for each input sample, we show fundamental trade-offs between sparsity and generalization. Importantly, our results make no strong assumptions about the degree of sparsity achieved by the model, and they improve over recent norm-based approaches. We illustrate our results numerically, demonstrating non-vacuous bounds when coupled with data-dependent priors in specific settings, even in over-parametrized models.
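The central quantity in the abstract is per-sample activation sparsity: for a feed-forward ReLU network, each input turns on only a subset of hidden units, so the effective model acting on that input is smaller than the full network. The sketch below illustrates this quantity only; the toy architecture, random weights, and layer sizes are illustrative assumptions, not the paper's construction or its generalization bound.

```python
# Illustrative sketch (not the paper's method): measuring per-sample
# hidden-layer activation sparsity in a feed-forward ReLU network.
# All weights and dimensions are arbitrary placeholders.
import numpy as np

rng = np.random.default_rng(0)

# Toy feed-forward ReLU network: input, two hidden layers, output.
layer_dims = [20, 64, 64, 10]
weights = [rng.standard_normal((m, n)) / np.sqrt(m)
           for m, n in zip(layer_dims[:-1], layer_dims[1:])]

def forward_with_sparsity(x):
    """Return network output and, per hidden layer, the per-sample
    fraction of inactive (zero) ReLU units."""
    sparsity = []
    h = x
    for i, W in enumerate(weights):
        pre = h @ W
        if i < len(weights) - 1:                 # hidden layers use ReLU
            h = np.maximum(pre, 0.0)
            sparsity.append(np.mean(h == 0.0, axis=-1))
        else:                                    # linear output layer
            h = pre
    return h, sparsity

# Each input activates only a subset of units; the fraction of zeros is a
# proxy for the "reduced effective model size" the abstract refers to.
X = rng.standard_normal((5, layer_dims[0]))
_, per_layer_sparsity = forward_with_sparsity(X)
for layer_idx, s in enumerate(per_layer_sparsity, start=1):
    print(f"hidden layer {layer_idx}: mean fraction of zero activations = {s.mean():.2f}")
```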
