Predictive power of a Bayesian effective action for fully-connected one hidden layer neural networks in the proportional limit (2401.11004v1)

Published 19 Jan 2024 in cond-mat.dis-nn and cond-mat.stat-mech

Abstract: We perform accurate numerical experiments with fully-connected (FC) one-hidden-layer neural networks trained with discretized Langevin dynamics on the MNIST and CIFAR10 datasets. Our goal is to empirically determine the regimes of validity of a recently derived Bayesian effective action for shallow architectures in the proportional limit. We explore the predictive power of the theory as a function of the parameters (the temperature $T$, the magnitude of the Gaussian priors $\lambda_1$, $\lambda_0$, the size of the hidden layer $N_1$ and the size of the training set $P$) by comparing the experimental and predicted generalization error. The very good agreement between the effective theory and the experiments indicates that global rescaling of the infinite-width kernel is a main physical mechanism for kernel renormalization in FC Bayesian standard-scaled shallow networks.
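
To make the training procedure named in the abstract concrete, the following is a minimal sketch of one discretized (Euler-Maruyama) Langevin step for a one-hidden-layer network. It is not the authors' code. Several choices are assumptions for illustration only: the `tanh` activation, the squared-error loss, the $1/\sqrt{N_0}$ and $1/\sqrt{N_1}$ normalizations, the assignment of $\lambda_0$ to the first-layer weights and $\lambda_1$ to the readout weights, the step size `eta`, and the function names `forward` and `langevin_step`. Under these assumptions the chain targets a posterior proportional to $\exp(-L/T - \lambda_0 \|W\|^2/2 - \lambda_1 \|v\|^2/2)$.

```python
import numpy as np

def forward(W, v, X):
    """One-hidden-layer network with standard 1/sqrt(width) scaling (assumed form)."""
    N1, N0 = W.shape
    h = np.tanh(X @ W.T / np.sqrt(N0))   # hidden activations, shape (P, N1)
    return h, h @ v / np.sqrt(N1)        # network outputs, shape (P,)

def langevin_step(W, v, X, y, T, lam0, lam1, eta, rng):
    """One Euler-Maruyama step of Langevin dynamics at temperature T.

    Assumed loss: L = 0.5 * sum_mu (f_mu - y_mu)^2.
    Assumed priors: precision lam0 on W, lam1 on v (hypothetical assignment).
    """
    N1, N0 = W.shape
    h, f = forward(W, v, X)
    err = f - y                                        # dL/df
    grad_v = h.T @ err / np.sqrt(N1)                   # dL/dv
    # Back-propagate through tanh: d tanh(z)/dz = 1 - tanh(z)^2.
    delta = (err[:, None] * v[None, :] / np.sqrt(N1)) * (1.0 - h**2)
    grad_W = delta.T @ X / np.sqrt(N0)                 # dL/dW
    noise = np.sqrt(2.0 * eta * T)                     # thermal noise amplitude
    W_new = W - eta * (grad_W + T * lam0 * W) + noise * rng.standard_normal(W.shape)
    v_new = v - eta * (grad_v + T * lam1 * v) + noise * rng.standard_normal(v.shape)
    return W_new, v_new
```

In a setup like this, one would typically iterate `langevin_step` until the chain equilibrates and then average the test error over subsequent samples; that average is the kind of experimental generalization error that can be compared with a theoretical prediction. Setting `T=0` removes both the noise and the prior term, so the step reduces to plain gradient descent on the loss.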

Citations (5)


