Law of large numbers and central limit theorem for wide two-layer neural networks: the mini-batch and noisy case (2207.12734v2)

Published 26 Jul 2022 in math.PR

Abstract: In this work, we consider a wide two-layer neural network and study the behavior of its empirical weights under the dynamics of stochastic gradient descent on the quadratic loss with mini-batches and noise. Our goal is to prove a trajectorial law of large numbers (LLN) as well as a central limit theorem (CLT) for their evolution. When the noise scales as $1/N^{\beta}$ with $1/2 < \beta \le \infty$, we rigorously derive and generalize the LLN obtained, for example, in [CRBVE20, MMM19, SS20b]. When $3/4 < \beta \le \infty$, we also generalize the CLT (see also [SS20a]) and further exhibit the effect of mini-batching on the asymptotic variance that drives the fluctuations. The case $\beta = 3/4$ is trickier, and we give an example showing that the variance diverges with time, thus establishing the instability of the predictions of the neural network in this case. This is illustrated by simple numerical examples.
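To make the setting concrete, here is a minimal sketch of noisy mini-batch SGD on a wide two-layer network. It is an illustration only, not the paper's exact scheme: the mean-field averaging `1/N`, the `tanh` activation, the step size `lr / N`, the Gaussian noise, the toy data model, and the function name `noisy_minibatch_sgd` are all assumptions. Only the noise scale $1/N^{\beta}$, the quadratic loss, and the use of mini-batches come from the abstract.

```python
import numpy as np

def noisy_minibatch_sgd(N=200, beta=1.0, steps=500, batch_size=8,
                        lr=0.5, d=2, seed=0):
    """Hypothetical sketch: N neurons trained by noisy mini-batch SGD."""
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=d)            # assumed toy teacher direction
    W = rng.normal(size=(N, d))            # one weight vector per neuron
    for _ in range(steps):
        # Draw a fresh mini-batch from the assumed toy data model.
        x = rng.normal(size=(batch_size, d))
        y = np.tanh(x @ w_true)
        act = np.tanh(W @ x.T)             # shape (N, batch_size)
        pred = act.mean(axis=0)            # network output: average over neurons
        resid = pred - y                   # residual of the quadratic loss
        # Per-neuron gradient, averaged over the mini-batch.
        grad = ((1.0 - act**2) * resid) @ x / batch_size   # shape (N, d)
        # Injected noise scaled as 1/N^beta (beta = inf gives no noise).
        noise = rng.normal(size=(N, d)) / N**beta
        W = W - (lr / N) * grad + noise
    return W

W = noisy_minibatch_sgd()
print(W.shape)
```

The returned array holds the final weights, whose empirical distribution is the kind of object the LLN and CLT describe. Varying `beta` and `batch_size` in this toy setup is one way to explore qualitatively how noise scaling and mini-batching affect the spread of the weights.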
