Finite-size correction δ in the NW estimator’s normalized mean-squared generalization error
Determine the correction term δ(α, β) that captures finite-size effects in the normalized mean-squared generalization error of the Nadaraya–Watson (NW) kernel smoothing estimator, for spherical data, the radial basis kernel k(x, x') = exp(β⟨x, x'⟩), and single-index targets f(x) = g(⟨w, x⟩/d) with g positive-homogeneous of degree k, in the regime n = exp(α d) as d → ∞. Specifically, establish an explicit formula or rigorous bounds for δ(α, β) in the relation E_{D} E_{x}[(f(x) − f_{D}(x))^2] / E_{x}[f(x)^2] ∼ (1 − r_{*}(α, β)^k)^2 + δ(α, β). Here f_{D} is the NW estimator built from the training set D of n samples, and r_{*}(α, β) maximizes φ(r) = α + β r + (1/2) log(1 − r^2) over r ∈ [0, sqrt(1 − e^{−2α})], which gives r_{*} = (sqrt(1 + 4β^2) − 1)/(2β) for β ≤ e^{2α} sqrt(1 − e^{−2α}) and r_{*} = sqrt(1 − e^{−2α}) otherwise.
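As a rough illustration of the known part of the relation, the following Python sketch evaluates r_{*}(α, β) and the leading term (1 − r_{*}^k)^2. It says nothing about δ, which remains open. The function names `r_star` and `leading_error` are hypothetical, and the sketch assumes α > 0 and β ≥ 0. It uses the fact that φ is concave on [0, 1), so clipping the interior stationary point at the upper boundary reproduces the two cases stated above.

```python
import math


def r_star(alpha: float, beta: float) -> float:
    """Constrained maximizer of phi(r) = alpha + beta*r + 0.5*log(1 - r**2)
    over r in [0, sqrt(1 - exp(-2*alpha))], assuming alpha > 0 and beta >= 0."""
    r_max = math.sqrt(1.0 - math.exp(-2.0 * alpha))
    if beta == 0.0:
        return 0.0  # phi is decreasing in r when beta = 0
    # Interior stationary point: the root of beta*(1 - r**2) = r lying in [0, 1).
    r_interior = (math.sqrt(1.0 + 4.0 * beta**2) - 1.0) / (2.0 * beta)
    # phi is concave on [0, 1), so clipping at the boundary gives the maximizer.
    return min(r_interior, r_max)


def leading_error(alpha: float, beta: float, k: float) -> float:
    """Leading term (1 - r_*^k)^2 of the normalized error; delta is NOT included."""
    return (1.0 - r_star(alpha, beta) ** k) ** 2


if __name__ == "__main__":
    # One interior case and one boundary case (illustrative parameter values).
    for alpha, beta in [(1.0, 1.0), (0.1, 10.0)]:
        print(alpha, beta, r_star(alpha, beta), leading_error(alpha, beta, k=1))
```

For (α, β) = (1, 1) the interior formula applies, since β is below the threshold e^{2α} sqrt(1 − e^{−2α}). For (α, β) = (0.1, 10) the constraint is active and r_{*} equals the boundary value sqrt(1 − e^{−2α}).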
References
Accurately determining δ requires a more careful analysis than the rather naïve approach followed here, and thus lies outside the scope of this note. We thus leave this result as a conjecture.