
Quantum Neural Tangent Kernel-UCB

Updated 7 January 2026
  • QNTK-UCB is a kernel-based online learning algorithm that uses quantum circuits to efficiently tackle sequential decision-making with sublinear regret.
  • The method leverages a static Quantum Neural Tangent Kernel, benefiting from rapid eigenvalue decay for implicit regularization and improved parameter efficiency.
  • Empirical results show that QNTK-UCB outperforms classical baselines in sample complexity and exploration efficiency on quantum-native decision tasks.

The Quantum Neural Tangent Kernel-Upper Confidence Bound (QNTK-UCB) is a kernel-based online learning algorithm designed for sequential decision-making in bandit and Bayesian optimization settings, leveraging parameterized quantum circuits. QNTK-UCB harnesses the properties of the Quantum Neural Tangent Kernel (QNTK), a quantum analogue of the classical neural tangent kernel, to achieve improved parameter efficiency, stability, and inductive bias compared to both classical and standard quantum kernel methods. Its theoretical and empirical properties allow for sublinear regret in contextual bandits, with explicit advantages in sample complexity and effective dimensionality in regimes natural to quantum hardware (Shirai et al., 2021; Huang et al., 6 Jan 2026).

1. Definition and Construction of the Quantum Neural Tangent Kernel

The QNTK is derived by considering a parameterized quantum circuit (QNN) acting on $m$ qubits, with trainable parameters $\bm\theta \in \mathbb{R}^p$ and classical input $x \in \mathcal{X} \subset \mathbb{R}^d$. The model computes

$$f(x; \bm\theta) = \frac{1}{N(m)} \sum_{k=1}^m \left\langle 0^m \right| U^\dagger(\bm\theta, x)\, \mathcal{O}_k\, U(\bm\theta, x) \left| 0^m \right\rangle,$$

where $\mathcal{O} = \sum_{k=1}^m \mathcal{O}_k$ is a sum of local observables, each $\mathcal{O}_k$ traceless with eigenvalues $\pm 1$, and $N(m)$ is a normalization factor. Training is initialized at a random parameter vector $\bm\theta_0$.
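As a concrete illustration, the following sketch simulates a small QNN of this form by statevector arithmetic: RY data-encoding and trainable rotations interleaved with CZ entanglers, with local observables $\mathcal{O}_k = Z_k$ (traceless, eigenvalues $\pm 1$) and $N(m) = m$. The circuit layout is a hypothetical hardware-efficient ansatz chosen for the example, not the specific architecture of the cited papers.

```python
import numpy as np

def ry(angle):
    """Single-qubit RY rotation matrix."""
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -s], [s, c]])

def apply_1q(state, gate, qubit, m):
    """Apply a 1-qubit gate to the given qubit of a 2^m statevector."""
    psi = state.reshape([2] * m)
    psi = np.tensordot(gate, psi, axes=([1], [qubit]))
    psi = np.moveaxis(psi, 0, qubit)
    return psi.reshape(-1)

def apply_cz(state, q1, q2, m):
    """Apply a controlled-Z entangler between two qubits."""
    psi = state.reshape([2] * m).copy()
    idx = [slice(None)] * m
    idx[q1], idx[q2] = 1, 1
    psi[tuple(idx)] *= -1
    return psi.reshape(-1)

def qnn_output(x, theta, m, layers):
    """f(x; theta) = (1/m) sum_k <0^m| U^dag Z_k U |0^m> for the sketch ansatz."""
    state = np.zeros(2 ** m); state[0] = 1.0
    for l in range(layers):
        for q in range(m):
            state = apply_1q(state, ry(x[q % len(x)]), q, m)     # data encoding
            state = apply_1q(state, ry(theta[l * m + q]), q, m)  # trainable layer
        for q in range(m - 1):
            state = apply_cz(state, q, q + 1, m)                 # entanglers
    probs = np.abs(state) ** 2
    total = 0.0
    for k in range(m):  # local Pauli-Z expectation values
        z = np.array([1 if (i >> (m - 1 - k)) & 1 == 0 else -1
                      for i in range(2 ** m)])
        total += probs @ z
    return total / m   # N(m) = m normalization
```

Since each $\mathcal{O}_k$ has eigenvalues $\pm 1$, the output is bounded in $[-1, 1]$ regardless of parameters.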

The QNTK arises from a first-order ("tangent") expansion about this initialization. Defining the quantum feature map

$$\phi(x) = \frac{1}{\sqrt{N_K(m)}}\, \nabla_{\bm\theta} f(x; \bm\theta_0) \in \mathbb{R}^p,$$

the empirical kernel is

$$\hat K(x, x') = \phi(x)^\top \phi(x').$$

As $m \to \infty$, $\hat K(x, x')$ concentrates around a deterministic analytic kernel $K(x, x') = \mathbb{E}_{\bm\theta_0}[\hat K(x, x')]$ (Shirai et al., 2021; Huang et al., 6 Jan 2026).
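A minimal sketch of assembling the empirical kernel from the tangent features, assuming the gradient is taken by central finite differences in simulation (on hardware one would use the parameter-shift rule) and taking $N_K(m) = p$ as an illustrative normalization:

```python
import numpy as np

def feature_map(f, x, theta0, eps=1e-5):
    """phi(x) = grad_theta f(x; theta0) / sqrt(p), via finite differences.
    `f` is any callable f(x, theta); N_K = p is an assumed normalization."""
    p = len(theta0)
    phi = np.empty(p)
    for i in range(p):
        e = np.zeros(p); e[i] = eps
        phi[i] = (f(x, theta0 + e) - f(x, theta0 - e)) / (2 * eps)
    return phi / np.sqrt(p)

def empirical_qntk(f, xs, theta0):
    """Gram matrix K_hat[i, j] = phi(x_i)^T phi(x_j) over a dataset xs."""
    Phi = np.stack([feature_map(f, x, theta0) for x in xs])
    return Phi @ Phi.T
```

By construction the resulting Gram matrix is symmetric and positive semidefinite, since it is an outer product of the feature matrix with itself.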

2. Lazy Training and Kernel Concentration Regime

In the overparameterized ("lazy") regime, the parameter updates during training remain small: $\|\bm\theta_t - \bm\theta_0\| = O(T/p) \ll 1$ for $T$ steps and $p$ sufficiently large. This regime justifies the use of the tangent kernel: the kernel computed at initialization remains stable, so subsequent optimization can be replaced by static kernel methods (Shirai et al., 2021).

QNTK concentration is induced by selecting $p = \widetilde{\Omega}((TK)^3)$, which guarantees convergence of the empirical kernel to its expectation in spectral norm. This is substantially more efficient than the classical NTK, where the parameter count required for comparable guarantees is $p_{\rm classical} = \widetilde{\Omega}((TK)^8)$ (Huang et al., 6 Jan 2026).

3. QNTK-UCB Algorithmic Framework

QNTK-UCB leverages the static QNTK as the kernel for kernelized bandit/RL inference. At each round $t$ (having observed rewards $r_\tau$ for $\tau < t$), it maintains

$$Z_{t-1} = \lambda I + \sum_{\tau=1}^{t-1} \phi(x_{\tau, a_\tau})\, \phi(x_{\tau, a_\tau})^\top, \qquad b_{t-1} = \sum_{\tau=1}^{t-1} r_\tau\, \phi(x_{\tau, a_\tau}).$$

The ridge-regression estimate is $\hat\alpha_{t-1} = Z_{t-1}^{-1} b_{t-1}$. For each action $a$, the UCB is

$$\mathrm{UCB}_{t,a} = \phi(x_{t,a})^\top \hat\alpha_{t-1} + \beta_{t-1} \sqrt{\phi(x_{t,a})^\top Z_{t-1}^{-1}\, \phi(x_{t,a})},$$

where the exploration parameter is set via martingale concentration bounds as $\beta_{t-1} = \nu \sqrt{\log \frac{\det Z_{t-1}}{\det(\lambda I)} + 2 \log \frac{1}{\delta}} + \sqrt{\lambda}\, S$, with $\nu$ and $S$ controlling the sub-Gaussian noise level and the RKHS norm of the reward, respectively. The action maximizing $\mathrm{UCB}_{t,a}$ is selected, the reward is observed, and the statistics are updated. The full procedure mirrors kernelized linear UCB, but in the QNTK feature space (Huang et al., 6 Jan 2026).
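The round structure above can be sketched as kernelized LinUCB in a fixed feature map. The constant exploration coefficient `beta` here is a simplification of the martingale-based $\beta_{t-1}$, and `phi`, `contexts`, and `reward_fn` are hypothetical stand-ins supplied by the caller:

```python
import numpy as np

def qntk_ucb(phi, contexts, reward_fn, T, lam=1.0, beta=1.0, seed=0):
    """UCB loop in a fixed (e.g. QNTK) feature space.
    phi: maps a context-action input to a vector in R^p.
    contexts: list of rounds, each a list of per-arm inputs.
    Returns the history of (chosen arm, observed reward)."""
    rng = np.random.default_rng(seed)
    p = phi(contexts[0][0]).shape[0]
    Z = lam * np.eye(p)   # Z_{t-1} = lam*I + sum_tau phi phi^T
    b = np.zeros(p)       # b_{t-1} = sum_tau r_tau phi
    history = []
    for t in range(T):
        arms = contexts[t % len(contexts)]
        Zinv = np.linalg.inv(Z)
        alpha = Zinv @ b  # ridge-regression estimate alpha_{t-1}
        ucbs = []
        for xa in arms:
            v = phi(xa)   # mean estimate + exploration bonus
            ucbs.append(v @ alpha + beta * np.sqrt(v @ Zinv @ v))
        a_star = int(np.argmax(ucbs))
        v = phi(arms[a_star])
        r = reward_fn(arms[a_star]) + 0.01 * rng.normal()  # noisy reward
        Z += np.outer(v, v)
        b += r * v
        history.append((a_star, r))
    return history
```

With an identity feature map and a linear reward, the loop quickly concentrates its pulls on the best arm, as the bonus term shrinks for well-explored directions.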

4. Regret Analysis and Parameter Scaling

Under standard boundedness and realizability assumptions (the reward function $h(x)$ lies in the RKHS of the QNTK), regret is bounded by

$$R_T = \tilde O\!\left(\tilde d_q \sqrt{T}\right),$$

where the quantum effective dimension is

$$\tilde d_q(\lambda) = \frac{\log\det(I + \bar K/\lambda)}{\log(1 + TK/\lambda)},$$

with $\bar K$ the limiting QNTK Gram matrix. This effective dimension is controlled by the spectral decay of the QNTK eigenvalues $\{\lambda_i\}$ through the information gain $\gamma_T^{(q)} = \sum_i \log(1 + \lambda_i/\lambda)$. Sharper decay, common in the QNTK compared to classical kernels, yields smaller $\gamma_T^{(q)}$ and thus lower regret (Huang et al., 6 Jan 2026).
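Both quantities can be computed directly from a Gram matrix or its spectrum; this is a generic numerical sketch of the formulas above, not code from the cited work:

```python
import numpy as np

def effective_dimension(K_bar, lam, T, K=1.0):
    """d_q(lam) = log det(I + K_bar/lam) / log(1 + T*K/lam)."""
    n = K_bar.shape[0]
    _, logdet = np.linalg.slogdet(np.eye(n) + K_bar / lam)
    return logdet / np.log(1 + T * K / lam)

def information_gain(eigvals, lam):
    """gamma_T = sum_i log(1 + lambda_i / lam)."""
    return float(np.sum(np.log1p(np.asarray(eigvals) / lam)))
```

For example, a geometrically decaying spectrum yields a smaller information gain than a polynomially decaying one at the same leading eigenvalue, which is the mechanism by which sharper QNTK decay lowers regret.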

In contrast, classical NeuralUCB requires parameter scaling $p_{\rm classical} = \widetilde{\Omega}((TK)^8)$ to maintain NTK concentration, a significantly higher complexity (Huang et al., 6 Jan 2026). QNTK-UCB therefore operates in a parameter-efficient regime.

5. Spectral Properties and Implicit Regularization

The QNTK's eigenvalues typically decay more rapidly than those of classical NTKs or RBF kernels, reflecting a strong spectral bias. This phenomenon, related to the barren-plateau effect (gradient concentration near zero for deep quantum circuits), results in a lower effective dimension $\tilde d_q$ and smaller information gain, reducing the exploration cost in bandit tasks (Shirai et al., 2021; Huang et al., 6 Jan 2026).

While uniform eigenvalue shrinkage would impair representational power, the QNTK often concentrates its spectral mass on a low-rank subspace matched to quantum-native reward functions. For shallow entangling circuits, most variance is explained by a small number of leading eigenvalues: $\lambda_1 \gg \lambda_2 \gg \cdots$, with $\sum_{i>r} \lambda_i$ small for moderate $r$. This sharp concentration acts as an implicit regularizer, reducing noise propagation and enabling efficient exploration without loss of alignment to the relevant signal subspace (Huang et al., 6 Jan 2026).

6. Practical Implementation and Evaluation

Key practical insights include:

  • Kernel evaluation cost: QNTK-UCB requires $O(p)$ circuit evaluations per Gram-matrix entry, using the parameter-shift rule for each $\nabla_{\bm\theta} f(x; \bm\theta_0)$; Gram-matrix assembly is $O(n^2 p)$ for $n$ data points.
  • Parameter scaling: For feature dimension $p = O(Lm^2)$ with two-qubit entanglers (layer count $L$, qubit count $m$), the QNTK scales efficiently up to the thresholds required for kernel concentration.
  • Noise robustness: Since parameter training is not performed on device, QNTK-UCB isolates inference from device-induced noise; noisy gradient estimation only affects the GP posterior variance $\sigma_n^2$, which is handled by Bayesian updates (Shirai et al., 2021).
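The $O(p)$ gradient cost per entry comes from the parameter-shift rule: for gates generated by Pauli operators, the exact derivative follows from two shifted circuit evaluations per parameter. A generic sketch, where `f` stands in for any circuit expectation value:

```python
import numpy as np

def parameter_shift_grad(f, x, theta, shift=np.pi / 2):
    """Exact gradient of f(x; theta) for Pauli-rotation gates:
    df/dtheta_i = [f(theta_i + s) - f(theta_i - s)] / (2 sin s),
    i.e. two circuit evaluations per parameter -> O(p) total."""
    p = len(theta)
    grad = np.empty(p)
    for i in range(p):
        e = np.zeros(p); e[i] = shift
        grad[i] = (f(x, theta + e) - f(x, theta - e)) / (2 * np.sin(shift))
    return grad
```

Unlike finite differences, the rule is exact (not an approximation in the shift), which is why it is the standard choice for gradient estimation on hardware.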

Empirical assessments validate these properties:

  • On synthetic Gaussian-quantile classification bandits, QNTK-UCB achieves lower regret than NeuralUCB, NTK-UCB, and RBF-UCB at equal parameter counts, particularly in low-data and overparameterized regimes.
  • On quantum-native tasks, such as variational quantum eigensolver (VQE) recommender bandits, QNTK-UCB outperforms classical baselines, capturing Hilbert-space correlations inaccessible to classical kernels.
  • Empirically, the QNTK's effective dimension saturates or decreases with increasing $m$, while classical NTK dimensions diverge, confirming superior capacity control (Huang et al., 6 Jan 2026).

7. Quantum Advantage, Open Directions, and Limitations

QNTK-UCB realizes quantum advantage by matching quantum-native inductive biases with effective parameter counts $p = \tilde O((TK)^3)$ and leveraging spectral bias to reduce the required exploration. Freezing the QNN at initialization bypasses barren-plateau training pathologies, enabling scalable and stable inference with provable regret guarantees in online settings (Huang et al., 6 Jan 2026).

Open problems and future directions include:

  • Characterizing function classes and quantum-circuit tasks for which QNTK-UCB exceeds any classical kernel.
  • Exploring hybrid quantum–classical architectures that allow limited trainability for tasks with greater reward function complexity.
  • Investigating deeper and more non-local circuits to amplify quantum advantage, balanced by kernel concentration trade-offs.
  • Formalizing the precise relationship between circuit architecture, spectral bias, and achievable effective dimension.

QNTK-UCB establishes a blueprint for exploiting quantum inductive bias in online learning, substantially lowering sample and parameter complexity in low-data and quantum-native regimes (Shirai et al., 2021; Huang et al., 6 Jan 2026).
