Finite-size correction δ in the NW estimator’s normalized mean-squared generalization error

Determine the correction term δ(α, β) that captures finite-size effects in the normalized mean-squared generalization error of the Nadaraya–Watson kernel smoothing estimator with spherical data, radial basis kernel k(x, x') = exp(β⟨x, x'⟩), and single-index targets f(x) = g(⟨w, x⟩/d) where g is positive-homogeneous of degree k, in the regime n = exp(α d) as d → ∞. Specifically, establish an explicit formula or rigorous bounds for δ(α, β) in the relation E_{D} E_{x}[(f(x) − f_{D}(x))^2] / E_{x}[f(x)^2] ∼ (1 − r_{*}(α, β)^k)^2 + δ(α, β), where r_{*}(α, β) maximizes φ(r) = α + β r + (1/2) log(1 − r^2) over r ∈ [0, sqrt(1 − e^{−2α})], yielding r_{*} = (sqrt(1 + 4β^2) − 1)/(2β) for β ≤ e^{2α} sqrt(1 − e^{−2α}) and r_{*} = sqrt(1 − e^{−2α}) otherwise.
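As a rough numerical illustration of the closed form above, the sketch below evaluates r_{*}(α, β) and the leading bias term (1 − r_{*}^k)^2. The function names `r_star` and `leading_bias` are hypothetical and are not taken from the paper. The code uses the fact that r/(1 − r^2) is increasing on [0, 1), so the piecewise rule is equivalent to clipping the unconstrained maximizer at sqrt(1 − e^{−2α}). The correction δ(α, β) is not modeled, since its form is the open question.

```python
import math


def r_star(alpha: float, beta: float) -> float:
    """Maximizer of phi(r) = alpha + beta*r + 0.5*log(1 - r^2) on [0, r_max].

    Hypothetical helper: assumes alpha > 0 and beta >= 0.
    """
    # Upper end of the admissible interval, r_max = sqrt(1 - e^{-2 alpha}).
    r_max = math.sqrt(1.0 - math.exp(-2.0 * alpha))
    # Stationary point of phi: beta*(1 - r^2) = r. This is the same value as
    # (sqrt(1 + 4 beta^2) - 1) / (2 beta), written so that beta = 0 is safe.
    r_free = 2.0 * beta / (1.0 + math.sqrt(1.0 + 4.0 * beta**2))
    # phi is concave, so the constrained maximizer is the clipped stationary point.
    return min(r_free, r_max)


def leading_bias(alpha: float, beta: float, k: float) -> float:
    """Leading term (1 - r_*^k)^2 of the normalized error; delta is NOT included."""
    return (1.0 - r_star(alpha, beta) ** k) ** 2


if __name__ == "__main__":
    for alpha, beta in [(10.0, 1.0), (0.1, 5.0), (1.0, 0.5)]:
        print(alpha, beta, r_star(alpha, beta), leading_bias(alpha, beta, k=1))
```

For large α the constraint is inactive and r_{*} depends on β alone; for small α and large β the maximizer sits at the boundary sqrt(1 − e^{−2α}).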

Background

The paper analyzes Nadaraya–Watson kernel smoothing in high dimensions using a Random Energy Model (REM) analogy for spherical data, with a radial basis kernel k(x, x') = exp(β⟨x, x'⟩) and single-index targets f(x) = g(⟨w, x⟩/d). A key result is a pointwise asymptotic prediction f_{D}(x) ∼ g(ρ r_{*}), where ρ = ⟨w, x⟩/d is the overlap of the test point with w and r_{*} = r_{*}(α, β) arises from a condensation-type REM analysis in the regime n = exp(α d).
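To make the setup concrete, here is a minimal sketch of the Nadaraya–Watson estimator with the kernel exp(β⟨x, x'⟩) on synthetic spherical data. Several choices are assumptions made only for illustration: points and w are drawn uniformly on the sphere of radius sqrt(d), g is taken to be the absolute value (positive-homogeneous of degree 1), and the helper names `sample_sphere` and `nw_predict` are hypothetical. At such a small d the asymptotic prediction need not hold closely, so no numerical agreement is claimed.

```python
import numpy as np


def sample_sphere(n, d, rng):
    """Draw n points uniformly on the sphere of radius sqrt(d) (assumed convention)."""
    z = rng.standard_normal((n, d))
    return np.sqrt(d) * z / np.linalg.norm(z, axis=1, keepdims=True)


def nw_predict(X, y, x, beta):
    """Nadaraya-Watson prediction at x with kernel k(x, x') = exp(beta * <x, x'>)."""
    logits = beta * (X @ x)
    logits -= logits.max()  # subtract the max for numerical stability
    weights = np.exp(logits)
    return float(weights @ y / weights.sum())


rng = np.random.default_rng(0)
d, alpha, beta = 12, 0.5, 1.0
n = int(np.exp(alpha * d))          # sample size in the regime n = exp(alpha * d)

X = sample_sphere(n, d, rng)        # training inputs
w_dir = sample_sphere(1, d, rng)[0] # single-index direction (assumed norm sqrt(d))
g = np.abs                          # example link, homogeneous of degree k = 1
y = g(X @ w_dir / d)                # targets f(x) = g(<w, x> / d)

x_test = sample_sphere(1, d, rng)[0]
pred = nw_predict(X, y, x_test, beta)
print(n, pred, float(g(x_test @ w_dir / d)))
```

Because the kernel weights are positive and normalized, the prediction is a convex combination of the training targets. As β grows the weights concentrate on the training points with the largest overlap with the test point, which is the condensation effect the REM analysis describes.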

To assess generalization, the authors consider the mean-squared error normalized by E_{x}[f(x)^2], a scale set by the distribution of overlaps ρ. For positive-homogeneous g of degree k, the pointwise prediction gives g(ρ r_{*}) = r_{*}^k g(ρ), so they propose that the leading contribution is a bias term (1 − r_{*}^k)^2. They introduce an additional term δ(α, β) to account for finite-size corrections at the normalization scale, but they do not derive δ and explicitly mark the result as conjectural, leaving its precise characterization open.

References

Accurately determining δ requires a more careful analysis than the rather naïve approach followed here, and thus lies outside the scope of this note. We thus leave this result as a conjecture.

Nadaraya-Watson kernel smoothing as a random energy model (2408.03769 - Zavatone-Veth et al., 7 Aug 2024) in Section “Mean-squared generalization error”