Depth and Feature Learning are Provably Beneficial for Neural Network Discriminators (2112.13867v1)
Abstract: We construct pairs of distributions $\mu_d, \nu_d$ on $\mathbb{R}^d$ such that the quantity $|\mathbb{E}_{x \sim \mu_d} [F(x)] - \mathbb{E}_{x \sim \nu_d} [F(x)]|$ decreases as $\Omega(1/d^2)$ for some three-layer ReLU network $F$ with polynomial width and weights, while declining exponentially in $d$ if $F$ is any two-layer network with polynomial weights. This shows that deep GAN discriminators are able to distinguish distributions that shallow discriminators cannot. Analogously, we build pairs of distributions $\mu_d, \nu_d$ on $\mathbb{R}^d$ such that $|\mathbb{E}_{x \sim \mu_d} [F(x)] - \mathbb{E}_{x \sim \nu_d} [F(x)]|$ decreases as $\Omega(1/(d \log d))$ for two-layer ReLU networks with polynomial weights, while declining exponentially for bounded-norm functions in the associated RKHS. This confirms that feature learning is beneficial for discriminators. Our bounds are based on Fourier transforms.
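The separating quantity is an integral-probability-metric-style discriminator gap. Below is a minimal sketch of how such a gap can be estimated by Monte Carlo for two- and three-layer ReLU discriminators; the Gaussian distributions and random weights here are illustrative placeholders, not the paper's constructions.

```python
# Minimal sketch: Monte Carlo estimate of the discriminator gap
# |E_{x~mu}[F(x)] - E_{x~nu}[F(x)]| for ReLU networks F.
# The distributions and weights are hypothetical stand-ins, not the
# mu_d, nu_d constructed in the paper.
import numpy as np

rng = np.random.default_rng(0)
d, width, n = 16, 64, 100_000

# Placeholder pair of distributions on R^d: an isotropic Gaussian (mu)
# versus a Gaussian with slightly inflated variance (nu).
x_mu = rng.standard_normal((n, d))
x_nu = 1.1 * rng.standard_normal((n, d))

def two_layer(x, W1, b1, w2):
    """F(x) = w2 . relu(W1 x + b1): a two-layer ReLU network."""
    return np.maximum(x @ W1.T + b1, 0.0) @ w2

def three_layer(x, W1, b1, W2, b2, w3):
    """F(x) = w3 . relu(W2 relu(W1 x + b1) + b2): a three-layer ReLU network."""
    h = np.maximum(x @ W1.T + b1, 0.0)
    return np.maximum(h @ W2.T + b2, 0.0) @ w3

# Random weights of polynomial size in d (illustrative only).
W1 = rng.standard_normal((width, d)) / np.sqrt(d)
b1 = rng.standard_normal(width)
w2 = rng.standard_normal(width) / np.sqrt(width)
W2 = rng.standard_normal((width, width)) / np.sqrt(width)
b2 = rng.standard_normal(width)
w3 = rng.standard_normal(width) / np.sqrt(width)

# Empirical expectation gaps under the two distributions.
gap2 = abs(two_layer(x_mu, W1, b1, w2).mean()
           - two_layer(x_nu, W1, b1, w2).mean())
gap3 = abs(three_layer(x_mu, W1, b1, W2, b2, w3).mean()
           - three_layer(x_nu, W1, b1, W2, b2, w3).mean())
print(f"two-layer gap:   {gap2:.4f}")
print(f"three-layer gap: {gap3:.4f}")
```

For the paper's separation, one would replace the placeholder Gaussians with the constructed $\mu_d, \nu_d$ and compare how the two gaps scale as $d$ grows, rather than at a single fixed $d$.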