Near-Interpolators: Rapid Norm Growth and the Trade-Off between Interpolation and Generalization (2403.07264v1)
Abstract: We study the generalization capability of nearly-interpolating linear regressors: $\boldsymbol{\beta}$'s whose training error $\tau$ is positive but small, i.e., below the noise floor. Under a random-matrix-theoretic assumption on the data distribution and an eigendecay assumption on the data covariance matrix $\boldsymbol{\Sigma}$, we demonstrate that any near-interpolator exhibits rapid norm growth: for $\tau$ fixed, $\boldsymbol{\beta}$ has squared $\ell_2$-norm $\mathbb{E}[\|\boldsymbol{\beta}\|_2^2] = \Omega(n^{\alpha})$, where $n$ is the number of samples and $\alpha > 1$ is the exponent of the eigendecay, i.e., $\lambda_i(\boldsymbol{\Sigma}) \sim i^{-\alpha}$. This implies that existing data-independent norm-based bounds are necessarily loose. On the other hand, in the same regime we precisely characterize the asymptotic trade-off between interpolation and generalization. Our characterization reveals that larger norm scaling exponents $\alpha$ correspond to worse trade-offs between interpolation and generalization. We verify empirically that a similar phenomenon holds for nearly-interpolating shallow neural networks.
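The norm-growth claim lends itself to a quick numerical check. The sketch below is a minimal illustration, not the paper's experimental code: it draws Gaussian data whose covariance has power-law spectrum $\lambda_i = i^{-\alpha}$, tunes a ridge penalty so that the training MSE matches a fixed $\tau$ below the noise floor, and prints the squared norm of the resulting regressor as $n$ grows. All constants ($\alpha = 2$, $\tau = 0.05$, $d = 4000$) and the planted signal are illustrative assumptions, not values from the paper.

```python
# Minimal sketch of the norm-growth phenomenon for near-interpolators.
# Assumptions: alpha, tau, d, and the planted signal are arbitrary illustrative choices.
import numpy as np

rng = np.random.default_rng(0)

alpha, tau, d = 2.0, 0.05, 4000                      # eigendecay exponent, target training MSE, ambient dim
lam = np.arange(1, d + 1, dtype=float) ** (-alpha)   # covariance spectrum lambda_i = i^{-alpha}

def near_interpolator_sq_norm(n):
    """Squared l2-norm of the minimum-norm beta whose training MSE equals tau."""
    X = rng.standard_normal((n, d)) * np.sqrt(lam)   # rows ~ N(0, Sigma), Sigma = diag(lam)
    y = X @ np.sqrt(lam) + rng.standard_normal(n)    # arbitrary planted signal + unit noise
    G = X @ X.T                                      # n x n Gram matrix (dual/kernel form)
    eye = np.eye(n)
    # Ridge solutions are the minimum-l2-norm regressors at their training-error level,
    # so binary-search the ridge penalty until the training MSE matches tau.
    lo, hi = 1e-12, 1e6
    for _ in range(60):
        ridge = np.sqrt(lo * hi)                     # geometric midpoint
        a = np.linalg.solve(G + n * ridge * eye, y)  # dual coefficients of ridge solution
        train_mse = np.mean((G @ a - y) ** 2)
        if train_mse > tau:
            hi = ridge                               # over-regularized: shrink the penalty
        else:
            lo = ridge                               # below target error: can afford more penalty
    beta = X.T @ a                                   # primal ridge solution
    return beta @ beta

for n in [100, 200, 400, 800]:
    print(f"n={n:4d}  ||beta||^2 ~ {near_interpolator_sq_norm(n):.1f}")
```

Under these assumptions the printed squared norms should grow super-linearly in $n$, consistent with the $\Omega(n^{\alpha})$ lower bound up to constants and finite-size effects.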