Non-Stationary Isotropic Kernels
- Non-stationary isotropic kernels are covariance functions on Euclidean space that combine rotational invariance with location-dependent flexibility.
- They generalize stationary, dot product, and infinite-width neural network kernels by capturing even and odd geometric modes via radial functions and Gegenbauer polynomials.
- Their design supports practical applications in Gaussian process regression, spatial statistics, and deep learning while ensuring strict positive definiteness through careful function construction.
Non-stationary isotropic kernels are covariance functions defined on Euclidean space that are invariant under the action of the orthogonal group but whose dependence on location is not restricted to stationary differences (i.e., they are not translation invariant). Such kernels combine the geometric invariance of isotropy—covariance functions that depend only on the “shape” of pairs modulo rotations—with the modeling flexibility of non-stationarity, permitting spatially heterogeneous scaling, amplitude, and regularity. This framework encompasses and unifies classical stationary isotropic kernels, dot product kernels, and their non-stationary generalizations, and includes prominent machine learning objects such as infinite-width neural network kernels.
1. Mathematical Characterization of Non-Stationary Isotropic Kernels
Let $k:\mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$ be a continuous kernel satisfying $k(Qx, Qy) = k(x, y)$ for all $Q \in O(d)$ and all $x, y \in \mathbb{R}^d$. The principal result is that any such kernel admits an expansion of the form

$$k(x, y) = \sum_{n=0}^{\infty} k_n\big(\|x\|, \|y\|\big)\, c_n^{\lambda}\!\left(\frac{\langle x, y \rangle}{\|x\|\,\|y\|}\right), \qquad \lambda = \frac{d-2}{2},$$

where
- $k_n : [0, \infty) \times [0, \infty) \to \mathbb{R}$ are continuous positive semi-definite “radial” kernels, uniquely determined by $k$, and must be such that for all $r \ge 0$, $\sum_{n=0}^{\infty} k_n(r, r) < \infty$,
- $c_n^{\lambda}$ is the degree-$n$ Gegenbauer polynomial with index $\lambda = (d-2)/2$, normalized so that $c_n^{\lambda}(1) = 1$.
This structure generalizes Schoenberg’s classical result for inner-product–dependent kernels on the sphere (where the $k_n$ reduce to constants). When $k$ is stationary isotropic, the dependence on $(\|x\|, \|y\|, \langle x, y \rangle)$ collapses to a function of $\|x - y\|$, while restriction to the unit sphere ($\|x\| = \|y\| = 1$) recovers dot product kernels (Benning et al., 27 Jun 2025).
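The expansion can be evaluated directly once the radial kernels are chosen. The sketch below is not from the paper: it truncates the series at a finite degree, uses `scipy.special.eval_gegenbauer` for the Gegenbauer polynomials, and takes the hypothetical radial kernels $k_n(r, s) = \rho^n (1 + r)(1 + s)\, e^{-(r - s)^2}$ purely for illustration.

```python
# Minimal sketch (own illustration, not the paper's construction): evaluate a
# truncated expansion k(x, y) = sum_n k_n(||x||, ||y||) c_n(<x, y>/(||x|| ||y||)).
import numpy as np
from scipy.special import eval_gegenbauer

def normalized_gegenbauer(n, lam, t):
    """Gegenbauer polynomial of degree n and index lam, normalized to equal 1 at t = 1."""
    return eval_gegenbauer(n, lam, t) / eval_gegenbauer(n, lam, 1.0)

def radial_kernel(n, r, s, rho=0.5):
    """Hypothetical PSD radial kernel k_n(r, s); the factor (1 + r)(1 + s) gives a
    norm-dependent amplitude, i.e. non-stationarity."""
    return rho**n * (1.0 + r) * (1.0 + s) * np.exp(-(r - s) ** 2)

def nonstationary_isotropic_kernel(x, y, d=3, n_max=30):
    """Truncated Schoenberg-type expansion with lam = (d - 2) / 2 (requires d >= 3)."""
    r, s = np.linalg.norm(x), np.linalg.norm(y)
    lam = (d - 2) / 2.0
    if r == 0.0 or s == 0.0:
        # Convention used only in this sketch: keep the degree-0 term at the origin.
        return radial_kernel(0, r, s)
    t = np.clip(np.dot(x, y) / (r * s), -1.0, 1.0)
    return sum(radial_kernel(n, r, s) * normalized_gegenbauer(n, lam, t)
               for n in range(n_max + 1))

x, y = np.array([1.0, 0.0, 0.5]), np.array([0.3, 0.8, -0.2])
print(nonstationary_isotropic_kernel(x, y))
```

In this toy choice the weights $\rho^n$ form a summable sequence, so the convergence condition $\sum_n k_n(r, r) < \infty$ holds for every radius $r$.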
2. Strict Positive Definiteness
For $k$ as above to be strictly positive definite, two requirements must be met:
- The “base” kernel must be positive at the origin: $k_0(0, 0) > 0$.
- For any finite set of distinct norm values $r_1, \dots, r_m \in (0, \infty)$ and any nonzero coefficient vector $c \in \mathbb{R}^m$, the quadratic forms $\sum_{i,j=1}^{m} c_i c_j\, k_n(r_i, r_j)$ are nonzero for infinitely many even and infinitely many odd degrees $n$.
This ensures that the kernel captures a sufficiently rich set of geometric “modes” (even and odd degree components) on any set of distinct radial shells. Specializing to dot product kernels (with $\|x\| = \|y\| = 1$, so that the $k_n(1, 1) = b_n$ are constants), one recovers the classical requirement that infinitely many even and infinitely many odd coefficients $b_n$ must be positive for strict positive definiteness (Benning et al., 27 Jun 2025).
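A rough numerical probe of the second condition (our own illustration, reusing the hypothetical radial kernels from the sketch above): evaluate the quadratic forms on a few distinct radii and a fixed nonzero coefficient vector, and check that both parity sectors contribute.

```python
# Probe of the strict positive definiteness criterion (own illustration): for distinct
# radii r_i and a nonzero c, compute Q_n = sum_{i,j} c_i c_j k_n(r_i, r_j) for several n.
import numpy as np

def radial_kernel(n, r, s, rho=0.5):
    # Same hypothetical k_n as in the sketch of Section 1.
    return rho**n * (1.0 + r) * (1.0 + s) * np.exp(-(r - s) ** 2)

radii = np.array([0.3, 1.0, 2.5])        # distinct norm values
c = np.array([1.0, -2.0, 0.5])           # arbitrary nonzero coefficient vector
for n in range(6):
    K_n = radial_kernel(n, radii[:, None], radii[None, :])
    Q_n = c @ K_n @ c
    print(f"n={n} ({'even' if n % 2 == 0 else 'odd'}): Q_n = {Q_n:.4f}")
# Every Q_n is strictly positive here, so both the even and the odd sector are
# nontrivial on these radial shells.
```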
3. Unification of Stationary, Dot Product, and Neural Network Kernels
The general expansion simultaneously captures:
- Stationary Isotropic Kernels: If $k$ is additionally translation invariant, then $k(x, y)$ simplifies to a function of $\|x - y\|$, and the kernel depends on $x$ and $y$ only via their separation.
- Dot Product Kernels: When restricted to the unit sphere $\mathbb{S}^{d-1}$, the expansion becomes $k(x, y) = \sum_{n=0}^{\infty} b_n\, c_n^{\lambda}(\langle x, y \rangle)$ with $b_n \ge 0$ (Schoenberg’s sphere theorem).
- Infinite-Width Neural Network Kernels: For many architectures (e.g., NNGP, NTK), the covariance kernel between network outputs is of the form

$$k(x, y) = \kappa\!\left(\|x\|, \|y\|, \frac{\langle x, y \rangle}{\|x\|\,\|y\|}\right),$$

where $\kappa$ is built from recursively computable functions derived from the activation and initialization in the neural network. These kernels are non-stationary because they depend separately on $\|x\|$ and $\|y\|$, but isotropic because the only geometric dependence is via the normalized inner product $\langle x, y \rangle / (\|x\|\,\|y\|)$ (Benning et al., 27 Jun 2025).
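A concrete instance is the order-1 arc-cosine kernel of Cho and Saul, the NNGP covariance of a single-hidden-layer ReLU network; the short script below (our own code) shows that it factors into the product of the norms and a function of the angle alone, hence non-stationary but isotropic.

```python
# Order-1 arc-cosine kernel (Cho & Saul): NNGP covariance of a one-hidden-layer ReLU
# network. It equals ||x|| ||y|| times a function of the normalized inner product.
import numpy as np

def arccos_kernel_relu(x, y):
    r, s = np.linalg.norm(x), np.linalg.norm(y)
    t = np.clip(np.dot(x, y) / (r * s), -1.0, 1.0)   # normalized inner product
    theta = np.arccos(t)
    return (r * s / np.pi) * (np.sin(theta) + (np.pi - theta) * np.cos(theta))

x, y = np.array([1.0, 2.0, 0.0]), np.array([0.5, -1.0, 1.5])
print(arccos_kernel_relu(x, y))       # depends on ||x||, ||y|| and the angle
print(arccos_kernel_relu(2 * x, y))   # rescaling x changes the value: non-stationary
```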
4. Connection to Spatial Statistics and Machine Learning
This unification brings fundamental insight into the design and analysis of kernels in several domains:
- Gaussian Process Regression and Kernel Methods: The expansion gives a systematic template for constructing kernels able to express locally adaptive variance and geometry, beyond the restrictions of translation invariance (a usage sketch follows this list).
- Spatial Statistics: Non-stationary isotropic kernels are essential for modeling spatially heterogeneous media, for example where correlation length or amplitude varies with location, but where rotational symmetry is natural.
- Neural Architecture Analysis: The realization that wide neural networks induce non-stationary isotropic kernels links deep learning theory with kernel methods, enabling cross-fertilization via harmonic analysis and classical positive-definiteness criteria.
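As a usage sketch (our own construction, assuming the `nonstationary_isotropic_kernel` function from the Section 1 sketch is in scope), such a kernel plugs directly into the standard GP posterior-mean formula:

```python
# GP regression with a non-stationary isotropic kernel (own sketch); assumes
# nonstationary_isotropic_kernel(x, y) from the Section 1 example is defined.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))                      # training inputs in R^3
y = np.sin(np.linalg.norm(X, axis=1))             # toy targets
K = np.array([[nonstationary_isotropic_kernel(a, b) for b in X] for a in X])
alpha = np.linalg.solve(K + 1e-6 * np.eye(len(X)), y)   # small jitter for conditioning

x_star = np.array([0.2, -0.4, 1.0])
k_star = np.array([nonstationary_isotropic_kernel(x_star, b) for b in X])
print("posterior mean at x_star:", k_star @ alpha)
```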
5. High-Dimensional and Infinite-Dimensional Regime
The characterization is stable as $d \to \infty$, since the normalized Gegenbauer polynomials asymptotically become monomials ($c_n^{\lambda}(t) \to t^n$ as $\lambda \to \infty$) and the series expansion remains well-defined. This property is critical for high-dimensional function analysis and for understanding the geometry of function spaces associated with neural tangent kernels (NTK) and related GP priors (Benning et al., 27 Jun 2025).
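A quick numerical check of this limit (our own illustration): the normalized Gegenbauer polynomial approaches the monomial $t^n$ as the index $\lambda = (d-2)/2$ grows.

```python
# Normalized Gegenbauer polynomials tend to monomials as lam -> infinity (own check).
from scipy.special import eval_gegenbauer

t, n = 0.7, 4
for lam in [1.0, 10.0, 100.0, 1000.0]:
    val = eval_gegenbauer(n, lam, t) / eval_gegenbauer(n, lam, 1.0)
    print(f"lam = {lam:7.1f}: normalized C_n(t) = {val:.6f}  vs  t**n = {t**n:.6f}")
```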
6. Practical Guidance for Construction and Use
The expansion provides a concrete recipe:
- One designs or learns the radial functions $k_n$ to encode domain knowledge or fit data, for example adapting to spatial inhomogeneity, encoding input-dependent scaling, or capturing the effects of deep learning feature spaces.
- For strict positive definiteness, care must be taken to ensure that both even and odd degree sectors are nontrivial across all radial slices under consideration.
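As a crude diagnostic (our own, reusing the truncated kernel from the Section 1 sketch), one can verify numerically that the Gram matrix on a sample of points with distinct norms is positive definite; a finite truncation cannot certify strict positive definiteness, but such a check flags gross failures.

```python
# Sanity check (own illustration): smallest eigenvalue of the Gram matrix of the
# truncated kernel on random points; assumes nonstationary_isotropic_kernel from
# the Section 1 sketch is in scope. Expected to be strictly positive.
import numpy as np

rng = np.random.default_rng(1)
Z = rng.normal(size=(15, 3))                      # 15 points with distinct norms a.s.
G = np.array([[nonstationary_isotropic_kernel(a, b) for b in Z] for a in Z])
print("smallest eigenvalue:", np.linalg.eigvalsh(G).min())
```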
In summary, the Schoenberg-type characterization synthesizes stationary, dot product, and non-stationary isotropic kernel classes into a single positive-definite framework. It enables principled generalizations for spatial statistics, kernel learning, and high-dimensional inference, and precisely delineates the interplay between rotational invariance and local adaptability in non-stationary covariance modeling (Benning et al., 27 Jun 2025).