A mean-field limit for certain deep neural networks (1906.00193v1)

Published 1 Jun 2019 in math.ST, cond-mat.dis-nn, math.PR, and stat.TH

Abstract: Understanding deep neural networks (DNNs) is a key challenge in the theory of machine learning, with potential applications to the many fields where DNNs have been successfully used. This article presents a scaling limit for a DNN being trained by stochastic gradient descent. Our networks have a fixed (but arbitrary) number $L\geq 2$ of inner layers; $N\gg 1$ neurons per layer; full connections between layers; and fixed weights (or "random features" that are not trained) near the input and output. Our results describe the evolution of the DNN during training in the limit when $N\to +\infty$, which we relate to a mean field model of McKean-Vlasov type. Specifically, we show that network weights are approximated by certain "ideal particles" whose distribution and dependencies are described by the mean-field model. A key part of the proof is to show existence and uniqueness for our McKean-Vlasov problem, which does not seem to be amenable to existing theory. Our paper extends previous work on the $L=1$ case by Mei, Montanari and Nguyen; Rotskoff and Vanden-Eijnden; and Sirignano and Spiliopoulos. We also complement recent independent work on $L>1$ by Sirignano and Spiliopoulos (who consider a less natural scaling limit) and Nguyen (who nonrigorously derives similar results).
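As a rough illustration of the architecture described in the abstract, the following minimal sketch builds a fully connected network with $L$ inner layers of $N$ neurons each, where the weights next to the input and the output are fixed random features and only the weights between inner layers would be trained. Everything not stated in the abstract is an assumption made for illustration: the function names `init_network` and `forward`, the tanh activation, the Gaussian initialization, the scalar output, and the $1/N$ averaging normalization (a common mean-field convention, not necessarily the paper's exact scaling).

```python
import numpy as np


def init_network(d_in, n_neurons, n_inner_layers, rng):
    """Sample weights for a toy network with L >= 2 inner layers.

    Assumed layout (hypothetical, for illustration only):
      w_in  : (N, d_in) fixed random features next to the input
      inner : L - 1 matrices of shape (N, N), the trainable weights
      w_out : (N,) fixed random weights next to the output
    """
    if n_inner_layers < 2:
        raise ValueError("this sketch assumes L >= 2 inner layers")
    w_in = rng.standard_normal((n_neurons, d_in))
    inner = [
        rng.standard_normal((n_neurons, n_neurons))
        for _ in range(n_inner_layers - 1)
    ]
    w_out = rng.standard_normal(n_neurons)
    return {"w_in": w_in, "inner": inner, "w_out": w_out}


def forward(params, x):
    """Forward pass with an assumed 1/N averaging at each layer."""
    n_neurons = params["w_out"].shape[0]
    # First inner layer: fixed random features of the input.
    h = np.tanh(params["w_in"] @ x)
    # Remaining inner layers: each neuron averages over the previous layer.
    for w in params["inner"]:
        h = np.tanh(w @ h / n_neurons)
    # Output: fixed weights, averaged over the last inner layer.
    return float(params["w_out"] @ h / n_neurons)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.ones(3)
    for n in (10, 100, 1000):
        params = init_network(d_in=3, n_neurons=n, n_inner_layers=3, rng=rng)
        print(n, forward(params, x))
```

Under the assumed $1/N$ normalization, each pre-activation and the output are empirical averages over $N$ neurons, which is the kind of quantity that can be compared with an expectation over "ideal particles" as $N\to +\infty$. The sketch covers only the forward pass; it does not implement stochastic gradient descent or the mean-field model itself.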

Citations (68)
