Sparsity-depth Tradeoff in Infinitely Wide Deep Neural Networks (2305.10550v1)
Published 17 May 2023 in cs.LG, cond-mat.dis-nn, and q-bio.NC
Abstract: We investigate how sparse neural activity affects the generalization performance of a deep Bayesian neural network in the large-width limit. To this end, we derive a neural network Gaussian process (NNGP) kernel with rectified linear unit (ReLU) activation and a predetermined fraction of active neurons. Using the NNGP kernel, we observe that sparser networks outperform non-sparse networks at shallow depths on a variety of datasets. We validate this observation by extending the existing theory on the generalization error of kernel ridge regression.
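
As a rough illustration of what a ReLU kernel with a fixed fraction of active neurons could look like, the sketch below estimates a single-layer kernel entry by Monte Carlo. It rests on an assumption not stated in the abstract: sparsity is imposed by shifting the ReLU with a threshold `tau`, chosen so that a Gaussian pre-activation exceeds it with probability `f`. The function name `sparse_relu_kernel_mc` and its parameters are hypothetical, and this is not the closed-form kernel derived in the paper.

```python
import numpy as np
from statistics import NormalDist


def sparse_relu_kernel_mc(k11, k12, k22, f, n_samples=200_000, seed=0):
    """Monte Carlo estimate of E[phi(u) * phi(v)] for one layer.

    (u, v) are jointly Gaussian pre-activations with covariance
    [[k11, k12], [k12, k22]]. ASSUMPTION: phi(h) = max(h - tau, 0), where
    tau is set per input so that P(h > tau) = f, i.e. a fraction f of
    neurons is active in the infinite-width limit.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[k11, k12], [k12, k22]])
    uv = rng.multivariate_normal(np.zeros(2), cov, size=n_samples)

    # Standard-normal quantile that leaves probability mass f above it.
    z = NormalDist().inv_cdf(1.0 - f)
    tau_u = np.sqrt(k11) * z
    tau_v = np.sqrt(k22) * z

    # Thresholded ReLU responses for the two inputs.
    phi_u = np.maximum(uv[:, 0] - tau_u, 0.0)
    phi_v = np.maximum(uv[:, 1] - tau_v, 0.0)

    # Empirical fraction of active units for the first input (should be near f).
    active_fraction = np.mean(uv[:, 0] > tau_u)
    return np.mean(phi_u * phi_v), active_fraction


if __name__ == "__main__":
    for f in (0.5, 0.1):
        k, frac = sparse_relu_kernel_mc(1.0, 0.5, 1.0, f=f)
        print(f"f={f}: kernel estimate={k:.4f}, active fraction={frac:.3f}")
```

With `f = 0.5` the threshold is zero and the estimate should agree with the standard ReLU (arc-cosine) kernel; smaller `f` raises the threshold and shrinks the kernel value. A deep kernel would iterate this map layer by layer, which is omitted here.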