Quantum Kernels and RKHS
- Quantum Kernels and RKHS are frameworks that embed classical data into quantum Hilbert spaces using inner-product based kernels for efficient representation.
- They leverage measurement protocols like the SWAP test to estimate positive-definite kernels, enabling convex optimization in quantum machine learning.
- RKHS properties such as Mercer decomposition and the representer theorem ensure efficient learning while elucidating conditions for achieving quantum advantage.
Quantum kernels are positive-definite functions derived from inner products of quantum states, providing a means of embedding classical data into high-dimensional Hilbert spaces. The theory and practice of quantum kernels are intrinsically linked to the framework of reproducing kernel Hilbert spaces (RKHS), which provide the mathematical and statistical infrastructure for kernel-based inference. In quantum machine learning, quantum feature maps, quantum kernels, and their associated RKHSs facilitate convex optimization methods such as support vector machines (SVMs), enable analysis of generalization properties, and illuminate regimes of possible quantum advantage and their limitations.
1. Quantum Feature Maps and Quantum Kernels
A quantum feature map is constructed as a classical-to-quantum encoding: for classical input $x \in \mathcal{X}$ and a quantum device ($n$-qubit or bosonic), a unitary circuit $U(x)$ prepares a quantum state $|\phi(x)\rangle = U(x)|0\rangle$ or, more generally, a mixed state $\rho(x)$ (Schuld, 2021, Kübler et al., 2021, Bossens et al., 2024, Wood et al., 2024).
The quantum kernel is defined using the Hilbert--Schmidt inner product:

$$\kappa(x, x') = \mathrm{Tr}\!\left[\rho(x)\,\rho(x')\right].$$
For pure state encodings, this reduces to the squared modulus of the quantum state overlap, $\kappa(x, x') = |\langle \phi(x) | \phi(x') \rangle|^2$. In continuous-variable (CV) platforms (e.g., Kerr machines, coherent-state kernels), the kernel takes the form $\langle \alpha(x) | \alpha(x') \rangle$ or its modulus squared, and can often be related to classical RBF/Gaussian or more exotic nonclassical kernels (Wood et al., 2024, Chatterjee et al., 2016).
These quantum feature maps allow for the construction of positive-definite kernels that can be accessed via physical quantum measurement, such as the SWAP test, interferometric protocols, or bosonic parity measurements (Schuld, 2021, Chatterjee et al., 2016).
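As a concrete illustration of the pure-state case, the overlap kernel can be simulated classically for a toy encoding. The single-qubit angle-encoding feature map below is a hypothetical minimal example chosen for brevity, not a construction from the cited works; it shows that the resulting Gram matrix is symmetric and positive semidefinite, as required of a kernel.

```python
import numpy as np

def feature_map(x):
    # Hypothetical single-qubit angle encoding: |phi(x)> = RY(x)|0>
    return np.array([np.cos(x / 2), np.sin(x / 2)])

def quantum_kernel(x, xp):
    # Pure-state quantum kernel: squared modulus of the state overlap
    return abs(np.vdot(feature_map(x), feature_map(xp))) ** 2

X = np.array([0.0, 0.5, 1.0])
gram = np.array([[quantum_kernel(a, b) for b in X] for a in X])
print(gram.round(3))
```

On hardware, each entry of `gram` would instead be estimated from repeated SWAP-test measurements rather than computed from the statevector.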
2. Reproducing Kernel Hilbert Spaces: Mathematical Foundations
Given any positive-definite kernel $\kappa : \mathcal{X} \times \mathcal{X} \to \mathbb{C}$, one constructs an associated RKHS $\mathcal{H}_\kappa$ as the unique Hilbert space of functions on $\mathcal{X}$ where $\kappa$ enjoys the reproducing property (Manton et al., 2014, Aubin-Frankowski, 2018):

$$f(x) = \langle f, \kappa(x, \cdot) \rangle_{\mathcal{H}_\kappa} \quad \text{for all } f \in \mathcal{H}_\kappa.$$
This follows from the Moore--Aronszajn theorem, which guarantees that every such kernel specifies a unique RKHS. If $\kappa$ arises from a quantum encoding, the corresponding RKHS consists of functions built from quantum measurements/outcome probabilities.
Key structural results:
- Representer theorem: Solutions to Tikhonov-regularized risk minimization in $\mathcal{H}_\kappa$ take the finite sum form $f^*(x) = \sum_{i=1}^{N} \alpha_i\, \kappa(x_i, x)$. This reduces otherwise infinite-dimensional learning to convex optimization over the sample kernel Gram matrix (Schuld, 2021, Kübler et al., 2021, Manton et al., 2014).
- Mercer decomposition: For continuous, positive-definite $\kappa$ on a compact domain, there exists an eigen-expansion
$$\kappa(x, x') = \sum_{j} \lambda_j\, \phi_j(x)\, \phi_j(x'), \qquad \lambda_1 \ge \lambda_2 \ge \cdots \ge 0,$$
with $\{\phi_j\}$ orthonormal in $L^2$, encoding the "effective dimension" and generalization potential of $\mathcal{H}_\kappa$ (Kübler et al., 2021, Manton et al., 2014).
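The representer theorem can be made concrete with a small regression sketch: a Tikhonov-regularized (kernel ridge) problem is solved entirely through the sample Gram matrix, and the learned function is a finite kernel expansion. The Gaussian kernel and toy sine data below are illustrative assumptions.

```python
import numpy as np

def kernel(x, xp, gamma=1.0):
    # Gaussian (RBF) kernel as a stand-in positive-definite kernel
    return np.exp(-gamma * (x - xp) ** 2)

# Toy regression data
X = np.linspace(0, 1, 20)
y = np.sin(2 * np.pi * X)

# Tikhonov-regularized solution: by the representer theorem,
# f*(x) = sum_i alpha_i k(x_i, x) with alpha = (K + lam I)^{-1} y
K = kernel(X[:, None], X[None, :])
lam = 1e-6
alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)

def f_star(x):
    # Finite kernel expansion over the training points
    return kernel(x, X) @ alpha
```

The infinite-dimensional search over $\mathcal{H}_\kappa$ has collapsed to solving one $20 \times 20$ linear system, exactly the reduction the representer theorem guarantees.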
3. Quantum Kernels, RKHS Structure, and Learning
Quantum kernels embed classical data into high (often exponential)-dimensional quantum Hilbert spaces. The corresponding RKHS is generally a space of functions $f(x) = \mathrm{Tr}[\rho(x) M]$, where $M$ is a Hermitian measurement operator, and these functions can be interpreted as quantum models parameterized by $M$ in the Hilbert--Schmidt space (Schuld, 2021).
Quantum kernel SVMs are mathematically identical—apart from the origin of the kernel—to classical kernel SVMs: the solution for a margin-maximizing classifier is expressible via dual variables $\{\alpha_i\}$ and the quantum kernel Gram matrix $K_{ij} = \kappa(x_i, x_j)$. All classical RKHS theory, including the representer theorem and the convexity of the optimization, carries over directly (Schuld, 2021, Manton et al., 2014).
Importantly, quantum kernel-based learning is always convex once the kernel is estimated, in contrast to variational quantum circuit learning, which is non-convex and prone to local optima and barren plateaus. The RKHS framework ensures that the kernel-based optimum is globally optimal within the span of all available quantum features; any variational measurement ansatz that fails to cover this span is suboptimal (Schuld, 2021).
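A minimal sketch of the "kernel origin is irrelevant" point: a precomputed quantum Gram matrix slots into standard convex kernel machinery unchanged. To keep the example dependency-free, regularized least squares stands in below for the SVM hinge loss (both problems are convex given the Gram matrix), and the single-qubit encoding is an illustrative assumption.

```python
import numpy as np

def qkernel(x, xp):
    # Illustrative pure-state quantum kernel (single-qubit angle encoding)
    phi = lambda t: np.array([np.cos(t / 2), np.sin(t / 2)])
    return abs(np.vdot(phi(x), phi(xp))) ** 2

# Toy binary classification data: two well-separated clusters
X = np.array([0.1, 0.2, 0.3, 2.0, 2.2, 2.4])
y = np.array([-1, -1, -1, 1, 1, 1])

# Precompute the quantum Gram matrix, then solve a convex regularized
# dual problem (least-squares surrogate for the SVM objective)
K = np.array([[qkernel(a, b) for b in X] for a in X])
alpha = np.linalg.solve(K + 1e-3 * np.eye(len(X)), y.astype(float))

def predict(x):
    # Representer-form classifier: sign of a kernel expansion
    return np.sign(sum(a * qkernel(xi, x) for a, xi in zip(alpha, X)))

preds = [predict(x) for x in X]
```

Swapping in an SVM solver with a precomputed kernel would change only the loss, not the structure: the quantum device contributes the Gram matrix, and everything downstream is classical convex optimization.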
4. RKHS Spectral Properties, Generalization, and Quantum Advantage
The spectral properties of the kernel integral operator are central to the generalization properties of quantum kernel methods. Mercer decomposition yields eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge 0$, and an effective dimension computed from this spectrum quantifies learnability with finite data (Kübler et al., 2021).
Quantum kernels induced by product-state encodings in large-qubit Hilbert spaces generically lead to RKHSs of extremely high (often exponential) dimension, with most eigenvalues exponentially small. In such regimes, generalization from polynomially many samples is impossible, as nearly all nontrivial functions are essentially orthogonal and unlearnable with feasible data (Kübler et al., 2021).
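The contrast between decaying and flat spectra can be sketched with a simple spectral proxy. The participation ratio used below is one common surrogate for effective dimension, not the precise quantity defined by Kübler et al.; the two spectra are illustrative assumptions.

```python
import numpy as np

def effective_dimension(eigenvalues):
    # Participation-ratio proxy: (sum lambda)^2 / sum lambda^2
    # (a common surrogate; Kübler et al. use a related spectral measure)
    lam = np.asarray(eigenvalues, dtype=float)
    return lam.sum() ** 2 / (lam ** 2).sum()

# Rapidly decaying spectrum: a few eigenvalues dominate, so the
# effective dimension is small and generalization from few samples is feasible
decaying = 2.0 ** -np.arange(20)

# Flat spectrum over an exponentially large space: the effective dimension
# equals the full dimension, so feasible sample sizes cannot generalize
flat = np.ones(2 ** 10) / 2 ** 10

print(effective_dimension(decaying))
print(effective_dimension(flat))
```

The flat case models the generic product-state encoding described above: exponentially many exponentially small eigenvalues, hence an exponentially large effective dimension.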
Quantum advantage through kernels therefore depends crucially on engineering a quantum kernel whose RKHS is both low-dimensional (so generalization is feasible) and contains functions that are hard to simulate classically. Projected or "biased" quantum kernels—where reduced density matrices over a small subsystem are compared—can engineer such low-dimensional, classically intractable function spaces, enabling an inductive bias inaccessible by classical kernels. However, the measurement overhead for estimating these kernels can become prohibitive, especially in near-term devices.
A quantum learning advantage emerges only when:
- The RKHS is low-dimensional.
- The function space contains classically intractable functions.
- The kernel matrix can be estimated efficiently on hardware (Kübler et al., 2021).
5. Quantum Kernel Estimation and Implementation
Estimating quantum kernels on hardware is typically performed via measurement protocols such as the SWAP test (for qubits) or optical interference and photon counting (for CV devices). Each kernel entry is estimated by repeated, independent measurements, with variance $O(1/M)$ for $M$ repetitions, so achieving precision $\epsilon$ requires $O(1/\epsilon^2)$ shots per entry (Schuld, 2021, Chatterjee et al., 2016, Wood et al., 2024).
For $N$ data points, constructing the full quantum kernel Gram matrix requires $O(N^2)$ circuit executions. For "biased" projected kernels, measurement complexity can be exponential in subsystem size due to state tomography requirements, creating a bottleneck in practice for obtaining a quantum advantage (Kübler et al., 2021).
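The shot-noise scaling of kernel estimation can be checked in simulation by sampling SWAP-test outcomes. The code below uses the standard SWAP-test relation (the ancilla reads 0 with probability $(1 + \text{overlap})/2$) with an arbitrarily assumed target overlap, and verifies that 100x more shots tightens the estimate by roughly 10x.

```python
import numpy as np

rng = np.random.default_rng(42)

true_overlap = 0.7  # assumed |<phi(x)|phi(x')>|^2 to be estimated

def swap_test_estimate(shots):
    # SWAP test: ancilla measures 0 with probability (1 + overlap) / 2;
    # invert that relation on the empirical frequency
    p0 = (1 + true_overlap) / 2
    hits = rng.binomial(shots, p0)
    return 2 * hits / shots - 1

# Standard error shrinks as 1/sqrt(M): 100x more shots -> ~10x tighter
errs_small = [abs(swap_test_estimate(100) - true_overlap) for _ in range(2000)]
errs_large = [abs(swap_test_estimate(10_000) - true_overlap) for _ in range(2000)]
ratio = np.mean(errs_small) / np.mean(errs_large)
print(ratio)
```

Inverting the relation, precision $\epsilon$ on the overlap demands $O(1/\epsilon^2)$ shots, which multiplied over all $O(N^2)$ Gram entries is the practical bottleneck noted above.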
The table below summarizes kernel estimation protocols and complexity:
| Protocol type | Physical platform | Kernel estimation scaling |
|---|---|---|
| SWAP test (qubits) | Gate-based quantum hardware | $O(1/\epsilon^2)$ shots per entry |
| Parity measurement (bosons) | CV/Kerr hardware | $O(1/\epsilon^2)$ shots per entry |
| Optical POVM (coherent states) | Quantum optics | $O(1/\epsilon^2)$ shots per probability |
6. Specialized Quantum Kernels and RKHSs
Quantum kernels can be engineered to encode various required symmetries or function classes, leading to specialized RKHS structures:
- Symmetric/Antisymmetric kernels: By symmetrizing or antisymmetrizing base kernels over the permutation group, one directly constructs RKHSs of symmetric (bosonic) or antisymmetric (fermionic) functions, critical in quantum chemistry and physics. For Gaussian kernels, the RKHS remains universal (dense in the space of symmetric/antisymmetric continuous functions), while for polynomial kernels, the feature-space dimension is dramatically reduced, reflecting the physical symmetry constraints. The antisymmetric Gaussian kernel can be written as a Slater determinant for efficient evaluation (Klus et al., 2021).
- Noncommutative quantum spaces: Paragrassmann algebras provide examples of quantum (noncommutative) spaces where subalgebras form finite-dimensional RKHSs, even though evaluation at a "point" is replaced with noncommutative dualities and Berezin integration. RKHS theory extends, with careful tracking of positivity and the reproducing property (Sontz, 2012).
- Quantum propagators as kernels: The Feynman propagator, for both single- and multi-particle quantum systems, functions as a reproducing kernel, inducing an RKHS where quantum amplitudes and Fock space construction become transparent in kernel-theoretic terms (Aubin-Frankowski, 2018).
- Continuous-variable and coherent-state kernels: Canonical and generalized coherent states give rise to kernels of the form $\langle \alpha(x) | \alpha(x') \rangle$ or their modulus squared, which correspond to RBF/Gaussian (for canonical states) or more exotic radial kernels (for SU(1,1) or Pöschl–Teller states). Overlaps measured via quantum optical POVMs enable fast kernel estimation in high dimensions; quantum optical schemes may allow efficient implementations difficult to mimic classically (Chatterjee et al., 2016, Wood et al., 2024).
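For canonical coherent states the modulus-squared kernel reduces exactly to a Gaussian RBF in the complex plane, $|\langle \alpha | \beta \rangle|^2 = e^{-|\alpha - \beta|^2}$, which the following sketch verifies numerically from the standard coherent-state overlap formula.

```python
import numpy as np

def coherent_overlap(a, b):
    # <alpha|beta> for canonical coherent states (a, b complex amplitudes):
    # exp(-|a|^2/2 - |b|^2/2 + conj(a) b)
    return np.exp(-abs(a) ** 2 / 2 - abs(b) ** 2 / 2 + np.conj(a) * b)

def coherent_kernel(a, b):
    # Modulus-squared kernel |<alpha|beta>|^2
    return abs(coherent_overlap(a, b)) ** 2

# The modulus-squared kernel equals a Gaussian RBF: exp(-|a - b|^2)
a, b = 0.3 + 0.4j, -0.1 + 1.2j
print(coherent_kernel(a, b), np.exp(-abs(a - b) ** 2))
```

This identity is why canonical coherent-state kernels recover the classical Gaussian kernel, while generalized (e.g., SU(1,1)) coherent states yield radial kernels without a simple classical counterpart.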
7. Extensions: Quantum Kernels in Reinforcement Learning and Operator-Valued RKHSs
Quantum kernel methods extend beyond supervised learning to reinforcement learning (RL) via policy-gradient and actor-critic frameworks in quantum environments (Bossens et al., 2024). In these settings:
- Quantum policies are constructed as representer expansions in the quantum kernel RKHS.
- Both analytic and numerical quantum gradients benefit from the closed-form expressions enabled by RKHS theory, avoiding costly parameter-shift rules or numerical approximations.
- Nonparametric Gaussian quantum kernel policies support vector-valued actions, with the RKHS structure naturally extending to operator-valued (vector-valued) settings.
- Sample-complexity analysis reveals quadratic reductions in queries to the quantum environment compared to classical RL, under idealized assumptions of quantum oracle access.
Despite the architectural benefits and sample complexity reductions, practical use is limited by oracle assumptions, circuit depth for state preparation, and computational overheads for large Gram matrices in high-dimensional spaces (Bossens et al., 2024).
In summary, quantum kernels and their induced reproducing kernel Hilbert spaces constitute the mathematical core of quantum kernel-based learning. They preserve all foundational structure of classical RKHS theory while leveraging quantum encodings to potentially access function spaces or symmetries that are inaccessible in classical settings. Genuine quantum advantage depends on a conjunction of careful kernel engineering (to ensure both statistical learnability and classical intractability) and the ability to estimate kernel entries efficiently with available quantum devices. The RKHS picture further unifies supervised and reinforcement learning, classical and quantum, under a common mathematical lens.