On the Generalization Power of Overfitted Two-Layer Neural Tangent Kernel Models (2103.05243v3)

Published 9 Mar 2021 in cs.LG, math.ST, stat.ML, and stat.TH

Abstract: In this paper, we study the generalization performance of min $\ell_2$-norm overfitting solutions for the neural tangent kernel (NTK) model of a two-layer neural network with ReLU activation that has no bias term. We show that, depending on the ground-truth function, the test error of overfitted NTK models exhibits characteristics that are different from the "double-descent" of other overparameterized linear models with simple Fourier or Gaussian features. Specifically, for a class of learnable functions, we provide a new upper bound of the generalization error that approaches a small limiting value, even when the number of neurons $p$ approaches infinity. This limiting value further decreases with the number of training samples $n$. For functions outside of this class, we provide a lower bound on the generalization error that does not diminish to zero even when $n$ and $p$ are both large.
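To make the setting concrete, below is a minimal NumPy sketch of the object the abstract studies: the minimum $\ell_2$-norm interpolating ("overfitted") solution of the NTK model induced by a two-layer ReLU network with no bias. The dimensions, the ground-truth target, and all numerical values are illustrative placeholders, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, p = 50, 10, 2000  # samples, input dim, neurons; p*d >> n (overparameterized)

# Training inputs on the unit sphere; a hypothetical ground-truth target
X = rng.standard_normal((n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)
y = X[:, 0] * X[:, 1]  # placeholder target, not from the paper

# Random first-layer weights of a two-layer ReLU network without bias.
# (Fixed +/-1 second-layer weights are absorbed: they square to 1 in the kernel.)
W = rng.standard_normal((p, d))

# NTK (gradient) features w.r.t. the first-layer weights:
# phi(x) = (1/sqrt(p)) * [1{w_j . x > 0} * x]_{j=1..p}, flattened to length p*d
act = (X @ W.T > 0).astype(float)                              # n x p activation pattern
Phi = (act[:, :, None] * X[:, None, :]).reshape(n, p * d) / np.sqrt(p)

# Min l2-norm interpolating solution: delta = Phi^T (Phi Phi^T)^{-1} y
G = Phi @ Phi.T                                                 # n x n NTK Gram matrix
delta = Phi.T @ np.linalg.solve(G, y)

# The solution overfits, i.e., it interpolates the training data exactly
print(np.max(np.abs(Phi @ delta - y)))                          # ~1e-12
```

Since $pd > n$, the linear system is underdetermined and has infinitely many interpolating solutions; solving the $n \times n$ Gram system picks out the unique one of minimum $\ell_2$ norm (equivalent to applying the pseudoinverse of $\Phi$), which is the overfitted estimator whose generalization error the paper bounds.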

Authors (3)
  1. Peizhong Ju (17 papers)
  2. Xiaojun Lin (29 papers)
  3. Ness B. Shroff (88 papers)
Citations (9)
