Sparsity-aware generalization theory for deep neural networks (2307.00426v2)
Abstract: Deep artificial neural networks achieve surprising generalization abilities that remain poorly understood. In this paper, we present a new approach to analyzing generalization for deep feed-forward ReLU networks that exploits the degree of sparsity achieved in the hidden-layer activations. By developing a framework that accounts for this reduced effective model size for each input sample, we are able to show fundamental trade-offs between sparsity and generalization. Importantly, our results make no strong assumptions about the degree of sparsity achieved by the model, and they improve over recent norm-based approaches. We illustrate our results numerically, demonstrating non-vacuous bounds when coupled with data-dependent priors in specific settings, even in over-parametrized models.
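The quantity the abstract builds on, per-sample sparsity of hidden-layer ReLU activations, can be measured directly. Below is a minimal NumPy sketch (not the paper's method or experimental setup; the layer widths and random weights are purely illustrative) that computes, for each input sample, the fraction of hidden units that are active, i.e. the sample's reduced effective model size.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

rng = np.random.default_rng(0)

# Hypothetical feed-forward ReLU network; widths and weights are illustrative only.
widths = [64, 256, 256, 10]
weights = [rng.standard_normal((m, n)) / np.sqrt(m)
           for m, n in zip(widths[:-1], widths[1:])]

def active_fraction(x):
    """Return, for each input sample, the fraction of hidden ReLU units
    with nonzero activation (a proxy for per-sample effective model size)."""
    active = np.zeros(len(x))
    total = 0
    h = x
    for w in weights[:-1]:              # hidden layers only; last layer is linear
        h = relu(h @ w)
        active += (h > 0).sum(axis=1)   # count active units in this layer
        total += w.shape[1]
    return active / total

x = rng.standard_normal((8, widths[0]))  # a small batch of inputs
print(active_fraction(x))                # one sparsity level per sample
```

Lower values of this fraction correspond to sparser activation patterns; the paper's bounds tighten as this per-sample effective size shrinks, without assuming any particular sparsity level a priori.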