General conditions for Gaussian equivalence in kernel methods

Establish general, verifiable conditions on data distributions and kernels under which Gaussian equivalence holds for kernel methods, meaning that the generalization error and deterministic equivalents for non-Gaussian covariates coincide with those derived for Gaussian covariates with matched first and second moments. Characterize regimes and specific kernel–data pairs where Gaussian equivalence fails.
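One way to write the equivalence statement down is sketched here. The notation is assumed for illustration and does not come from the source: $\psi$ is a feature map with $K(x, x') = \langle \psi(x), \psi(x') \rangle$, and $\mathcal{E}_n(\cdot)$ is the generalization error of the kernel estimator trained on $n$ samples whose features follow the given law.

```latex
\[
\mu = \mathbb{E}\,[\psi(x)], \qquad
\Sigma = \operatorname{Cov}[\psi(x)], \qquad
g \sim \mathcal{N}(\mu, \Sigma),
\]
\[
\bigl|\, \mathcal{E}_n\bigl(\psi(x)\bigr) - \mathcal{E}_n(g) \,\bigr|
\;\longrightarrow\; 0
\quad \text{in probability, as } n \to \infty
\text{ in the scaling regime considered.}
\]
```

In this reading, the open problem asks which pairs $(K, P_x)$ satisfy the second display and which do not.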

Background

The source paper (Atanasov et al., 2024) leverages Gaussian equivalence to extend linear regression results to kernel regression: in high-dimensional proportional limits, non-Gaussian data often behave like Gaussian data with matched covariance. Rigorous results exist for specific settings, such as dot-product kernels and particular input distributions, but it remains uncertain how far they extend across kernels, data distributions, and regimes.
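The matched-covariance claim can be probed numerically in the simplest case, ridge regression with a linear kernel. The sketch below is illustrative only: the function names, the power-law spectrum, the Rademacher covariates, and all parameter values are assumptions, not taken from the source paper. It draws Gaussian and Rademacher covariates with the same mean and covariance, fits ridge regression on each, and returns both test errors.

```python
import numpy as np


def ridge_test_error(X_train, y_train, X_test, y_test, lam):
    """Fit ridge regression and return the mean squared test error."""
    n, d = X_train.shape
    gram = X_train.T @ X_train / n + lam * np.eye(d)
    w_hat = np.linalg.solve(gram, X_train.T @ y_train / n)
    return float(np.mean((X_test @ w_hat - y_test) ** 2))


def compare_gaussian_vs_matched(n=400, d=200, n_test=2000,
                                lam=0.1, noise=0.5, seed=0):
    """Compare ridge test error on Gaussian vs. Rademacher covariates.

    Both covariate laws have zero mean and covariance diag(scales**2),
    so they agree in their first and second moments (hypothetical setup).
    """
    rng = np.random.default_rng(seed)
    scales = np.arange(1, d + 1) ** -0.5   # per-coordinate standard deviations
    w_star = rng.standard_normal(d)        # shared ground-truth weights

    def sample(kind, m):
        if kind == "gaussian":
            z = rng.standard_normal((m, d))
        else:
            # Rademacher entries: mean 0, variance 1, but non-Gaussian
            z = rng.choice([-1.0, 1.0], size=(m, d))
        return z * scales

    errors = {}
    for kind in ("gaussian", "rademacher"):
        X_tr, X_te = sample(kind, n), sample(kind, n_test)
        y_tr = X_tr @ w_star + noise * rng.standard_normal(n)
        y_te = X_te @ w_star + noise * rng.standard_normal(n_test)
        errors[kind] = ridge_test_error(X_tr, y_tr, X_te, y_te, lam)
    return errors


if __name__ == "__main__":
    print(compare_gaussian_vs_matched())
```

If the equivalence holds in this setting, the two errors should agree up to finite-size fluctuations. The open question concerns whether the same agreement survives when the covariates are nonlinear kernel features of structured or low-dimensional data.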

A comprehensive characterization of when Gaussian equivalence holds would unify disparate results and strengthen the theoretical foundations of kernel methods and random feature models. It would also delineate low-dimensional or structured scenarios where equivalence breaks, informing both model design and analysis.

References

It is not obvious when Gaussian equivalence should hold for general kernel methods; some sufficient conditions are obtained in very recent work of Misiakiewicz and Saeed.

— Scaling and renormalization in high-dimensional regression (2405.00592 - Atanasov et al., 2024) in Section 5.2 Connection to Kernel Regression via Gaussian Universality
