
Operator-based KKL (Quantum KL Divergence)

Updated 8 March 2026
  • Operator-based KKL is a divergence measure for density operators and kernel embeddings that extends classical KL divergence using operator convexity.
  • It employs variational duality and supremum representations to robustly quantify discrepancies in quantum state distinguishability and nonparametric statistics.
  • The framework underpins applications in quantum hypothesis testing, resource theory protocols, and information geometry with efficient quantum algorithmic implementations.

The operator-based Kullback–Leibler (quantum Kullback–Leibler) divergence generalizes the classical KL measure of discrepancy between probability distributions to settings where the objects of interest are operators, particularly density operators in quantum theory and positive semidefinite operators arising from kernel embeddings. Two closely related families arise: the quantum relative entropy for density matrices, including the maximal (Belavkin–Staszewski) variant, and the kernel KL (KKL) divergence on operator embeddings. Both exploit operator convexity and variational duality, and both serve as foundational divergences in quantum information, nonparametric statistics, and information geometry.

1. Foundational Definitions and Operator-based Formulations

For quantum states (density operators) $\rho,\sigma$ on a finite-dimensional Hilbert space, the canonical operator-based KL divergence is the quantum relative entropy

$$S(\rho\|\sigma) = \operatorname{Tr}\bigl[\rho(\log\rho - \log\sigma)\bigr].$$

This reduces to the classical KL divergence when $\rho$ and $\sigma$ commute. It quantifies the distinguishability of quantum states and is operationally characterized by the error exponents in quantum hypothesis testing and by rates in resource theory protocols (Felice et al., 2019, Matsumoto, 2013, Lu et al., 13 Jan 2025).
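As a minimal numerical sketch of this definition, the snippet below evaluates the quantum relative entropy through an eigendecomposition and checks that it matches the classical KL divergence for a commuting pair. It assumes full-rank, real symmetric density matrices and natural logarithms; the helper names `logm_psd`, `relative_entropy`, and `classical_kl` are illustrative and do not come from the cited papers.

```python
import numpy as np

def logm_psd(a):
    """Matrix logarithm of a symmetric positive-definite matrix via eigh."""
    vals, vecs = np.linalg.eigh(a)
    return (vecs * np.log(vals)) @ vecs.T

def relative_entropy(rho, sigma):
    """Tr[rho (log rho - log sigma)] for full-rank inputs (natural log)."""
    return float(np.trace(rho @ (logm_psd(rho) - logm_psd(sigma))))

def classical_kl(p, q):
    """Classical KL divergence sum_i p_i log(p_i / q_i)."""
    return float(np.sum(p * np.log(p / q)))

p = np.array([0.7, 0.2, 0.1])
q = np.array([0.5, 0.3, 0.2])

# Commuting case: both states are diagonal in the same orthonormal basis u.
rng = np.random.default_rng(0)
u, _ = np.linalg.qr(rng.normal(size=(3, 3)))
rho = u @ np.diag(p) @ u.T
sigma = u @ np.diag(q) @ u.T

s_quantum = relative_entropy(rho, sigma)
s_classical = classical_kl(p, q)
print(s_quantum, s_classical)  # the two values agree up to rounding
```

The eigendecomposition route avoids a dependency on a general matrix-logarithm routine, at the cost of requiring strictly positive eigenvalues.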

A maximal quantum $f$-divergence is defined as

$$D_f^{\max}(\rho\|\sigma) = \inf_{\text{reverse tests }(T,p,q)} D_f(p\|q),$$

where the infimum is over all reverse tests, that is, CPTP maps $T$ and classical distributions $p,q$ such that $T(p)=\rho$ and $T(q)=\sigma$ (Matsumoto, 2013). For $f(\lambda) = \lambda\log\lambda$, this recovers the operator-based Kullback–Leibler divergence

$$D_{KL}^{\max}(\rho\|\sigma) = \operatorname{Tr}\bigl[\rho\,\log\bigl(\rho^{1/2}\sigma^{-1}\rho^{1/2}\bigr)\bigr],$$

defined for invertible $\sigma$ and also known as the Belavkin–Staszewski entropy $S_{BS}(\rho\|\sigma)$ (Ortigueira et al., 28 Nov 2025).
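The following sketch evaluates the Belavkin–Staszewski entropy in its standard form $\operatorname{Tr}[\rho\log(\rho^{1/2}\sigma^{-1}\rho^{1/2})]$ and compares it with the closed form $\operatorname{Tr}[\sigma f(\sigma^{-1/2}\rho\sigma^{-1/2})]$, $f(\lambda)=\lambda\log\lambda$, which is the known expression for the maximal $f$-divergence with operator convex $f$. Full-rank real symmetric states are assumed, and the helpers `funm_psd`, `bs_entropy`, `bs_entropy_f_form`, and `random_state` are hypothetical names introduced only for this illustration.

```python
import numpy as np

def funm_psd(a, fn):
    """Apply a scalar function to a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(a)
    return (vecs * fn(vals)) @ vecs.T

def bs_entropy(rho, sigma):
    """Tr[rho log(rho^{1/2} sigma^{-1} rho^{1/2})] for full-rank inputs."""
    r_half = funm_psd(rho, np.sqrt)
    inner = r_half @ np.linalg.inv(sigma) @ r_half
    inner = (inner + inner.T) / 2  # symmetrize against rounding error
    return float(np.trace(rho @ funm_psd(inner, np.log)))

def bs_entropy_f_form(rho, sigma):
    """Tr[sigma f(sigma^{-1/2} rho sigma^{-1/2})] with f(x) = x log x."""
    s_inv_half = funm_psd(sigma, lambda v: v ** -0.5)
    x = s_inv_half @ rho @ s_inv_half
    x = (x + x.T) / 2
    return float(np.trace(sigma @ funm_psd(x, lambda v: v * np.log(v))))

def random_state(dim, rng):
    """Random full-rank real density matrix (demo helper)."""
    a = rng.normal(size=(dim, dim))
    m = a @ a.T + 0.1 * np.eye(dim)
    return m / np.trace(m)

rng = np.random.default_rng(1)
rho, sigma = random_state(3, rng), random_state(3, rng)
d_standard = bs_entropy(rho, sigma)
d_f_form = bs_entropy_f_form(rho, sigma)
print(d_standard, d_f_form)  # two expressions for the same quantity
```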

For kernel embeddings of distributions $P,Q$ via covariance operators $C_P,C_Q$ in some RKHS $\mathcal{H}$, the kernel KL divergence is

$$D_{KKL}(P\|Q) = \operatorname{tr}[C_P \log C_P - C_P\log C_Q].$$

This is structurally parallel to the quantum relative entropy, but applied to covariance operators of probability measures (Chazal et al., 2024).
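To make the kernel construction concrete, here is a rough sketch that estimates the KKL divergence from samples. It assumes a hypothetical finite-dimensional feature map (paired cosines and sines, normalized so that $\|\varphi(x)\|=1$ and hence $\operatorname{tr} C = 1$), empirical uncentered covariance operators, and a small eigenvalue floor as a numerical guard. The frequencies, sample sizes, and function names are illustrative choices, not the estimator of the cited paper.

```python
import numpy as np

def features(x, freqs):
    """Hypothetical feature map: unit-norm vector of cos/sin features."""
    angles = np.outer(x, freqs)
    return np.hstack([np.cos(angles), np.sin(angles)]) / np.sqrt(len(freqs))

def covariance_operator(x, freqs):
    """Empirical uncentered covariance C = (1/n) sum_i phi(x_i) phi(x_i)^T."""
    phi = features(x, freqs)
    return phi.T @ phi / len(x)

def logm_psd(a, floor=1e-12):
    """Matrix log via eigh; eigenvalues are floored (ad hoc numerical guard)."""
    vals, vecs = np.linalg.eigh(a)
    return (vecs * np.log(np.maximum(vals, floor))) @ vecs.T

def kkl(c_p, c_q):
    """tr[C_P log C_P - C_P log C_Q] for unit-trace covariance operators."""
    return float(np.trace(c_p @ (logm_psd(c_p) - logm_psd(c_q))))

rng = np.random.default_rng(0)
freqs = np.array([0.5, 1.5])                    # assumed frequencies
x_p = rng.normal(loc=0.0, scale=1.0, size=500)  # samples from P
x_q = rng.normal(loc=1.0, scale=1.5, size=500)  # samples from Q

c_p = covariance_operator(x_p, freqs)
c_q = covariance_operator(x_q, freqs)
kkl_pq = kkl(c_p, c_q)
kkl_pp = kkl(c_p, c_p)
print(kkl_pq, kkl_pp)
```

Because both operators have unit trace, Klein's inequality guarantees a nonnegative value, with zero only when the two operators coincide.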

2. Variational and Supremum Representations

Operator KL divergences admit variational dual forms. For $D_f^{\max}$, there is a supremum over pairs of Hermitian operators $(W_1, W_2)$ determined by the operator convex constraint

$$r W_1 + W_2 \leq f(r), \quad \forall r \geq 0,$$

such that

$$D_f^{\max}(\rho\|\sigma) = \sup_{(W_1,W_2) \in W_{\max}(H)} \{\operatorname{Tr}[\rho W_1] + \operatorname{Tr}[\sigma W_2]\},$$

with $f(\lambda)=\lambda\log\lambda$ for the KL case (Matsumoto, 2013). For quantum $f$-divergence estimation on hardware, this variational structure is essential: one reduces $-\log$ to a quadrature over simple $f_t$-divergences, each admitting a variational form whose minima correspond to polynomial operator expectations implementable on NISQ devices (Lu et al., 13 Jan 2025).
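The dual form is easiest to see in the commuting (classical) special case, where $W_1, W_2$ reduce to vectors $w_1, w_2$ and the constraint says that each line $r \mapsto r\,w_{1,i} + w_{2,i}$ lies below $f$. The sketch below, which covers only this special case, uses tangent lines of $f(r) = r\log r$ as feasible points and checks that the tangent at $r_i = p_i/q_i$ attains the KL divergence while other tangents give smaller values. The function name `dual_value` and the random search are illustrative.

```python
import numpy as np

p = np.array([0.7, 0.2, 0.1])
q = np.array([0.5, 0.3, 0.2])
kl = float(np.sum(p * np.log(p / q)))

def dual_value(r0):
    """sum_i p_i w1_i + q_i w2_i for tangent lines of f(r) = r log r at r0_i.

    By convexity, each tangent line satisfies r*w1 + w2 <= f(r) for all r >= 0,
    so every choice of r0 > 0 is feasible for the dual problem.
    """
    w1 = np.log(r0) + 1.0  # slope f'(r0)
    w2 = -r0               # intercept f(r0) - r0 * f'(r0)
    return float(np.sum(p * w1 + q * w2))

# The tangent at the likelihood ratio is optimal.
best = dual_value(p / q)

# Other feasible points never exceed the KL divergence.
rng = np.random.default_rng(0)
others = [dual_value(rng.uniform(0.1, 5.0, size=3)) for _ in range(1000)]
print(kl, best, max(others))
```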

3. Key Properties and Comparisons

Operator-based KL divergences possess a suite of crucial properties:

| Property | Petz/Umegaki $S(\rho\Vert\sigma)$ | Maximal $D^{\max}_{KL}(\rho\Vert\sigma)$ |
|---|---|---|
| Data-processing | Yes | Yes |
| Joint convexity | Yes | Yes |
| Equality on commuting | Yes | Yes |
| Additive on tensors | Yes | Yes |
| Monotonicity | Yes | Yes |
| Lower semicontinuity | Yes | Yes |
| Potential negativity | No | Yes (for $\sigma$ pure) |

The maximal divergence dominates the Umegaki one: $D_{KL}^{\max}(\rho\|\sigma)\ge S(\rho\|\sigma)$, with equality iff $[\rho,\sigma]=0$. For $\sigma=|\psi\rangle\langle\psi|$, $D_{KL}^{\max}(\rho\|\sigma)$ can be negative and is given by $\langle\psi|\rho|\psi\rangle\ln\langle\psi|\rho|\psi\rangle$ (Matsumoto, 2013, Ortigueira et al., 28 Nov 2025).
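The ordering between the two divergences can be checked numerically. The sketch below assumes full-rank real symmetric states, so it does not touch the pure-$\sigma$ case mentioned above, and uses the standard expression $\operatorname{Tr}[\rho\log(\rho^{1/2}\sigma^{-1}\rho^{1/2})]$ for the maximal divergence. All helper names are illustrative.

```python
import numpy as np

def funm_psd(a, fn):
    """Apply a scalar function to a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(a)
    return (vecs * fn(vals)) @ vecs.T

def umegaki(rho, sigma):
    """Tr[rho (log rho - log sigma)]."""
    return float(np.trace(rho @ (funm_psd(rho, np.log) - funm_psd(sigma, np.log))))

def belavkin_staszewski(rho, sigma):
    """Tr[rho log(rho^{1/2} sigma^{-1} rho^{1/2})]."""
    r_half = funm_psd(rho, np.sqrt)
    inner = r_half @ np.linalg.inv(sigma) @ r_half
    inner = (inner + inner.T) / 2  # symmetrize against rounding error
    return float(np.trace(rho @ funm_psd(inner, np.log)))

def random_state(dim, rng):
    """Random full-rank real density matrix (demo helper)."""
    a = rng.normal(size=(dim, dim))
    m = a @ a.T + 0.1 * np.eye(dim)
    return m / np.trace(m)

rng = np.random.default_rng(0)

# Generic (non-commuting) pairs: the gap should be nonnegative.
gaps = []
for _ in range(20):
    rho, sigma = random_state(3, rng), random_state(3, rng)
    gaps.append(belavkin_staszewski(rho, sigma) - umegaki(rho, sigma))

# Commuting pair: the gap should vanish.
u, _ = np.linalg.qr(rng.normal(size=(3, 3)))
rho_c = u @ np.diag([0.6, 0.3, 0.1]) @ u.T
sigma_c = u @ np.diag([0.2, 0.3, 0.5]) @ u.T
gap_commuting = belavkin_staszewski(rho_c, sigma_c) - umegaki(rho_c, sigma_c)
print(min(gaps), gap_commuting)
```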

4. Connections to Classical KL and Ensemble Realizations

The Belavkin–Staszewski entropy arises as the minimal KL divergence over all classical ensembles (unravelings) that realize $\rho$ and $\sigma$. If both are diagonal in a (possibly non-orthogonal) common basis $|\psi_i\rangle$, with $\rho = \sum_i p_i |\psi_i\rangle\langle\psi_i|$ and $\sigma = \sum_i q_i |\psi_i\rangle\langle\psi_i|$, then

$$S_{BS}(\rho\|\sigma) = \sum_i p_i \log(p_i/q_i) = D_{KL}(\mu_{CB}\|\nu_{CB}),$$

where $\mu_{CB},\nu_{CB}$ are atomic measures on the $|\psi_i\rangle$. This identification relates operator-based quantum divergences to classical measure-theoretic KL on the space of pure states and underpins large-deviation theory in quantum ensembles (Ortigueira et al., 28 Nov 2025).
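The identity for a common non-orthogonal basis can be verified on a small example. The sketch below assumes three linearly independent, non-orthogonal unit vectors in $\mathbb{R}^3$ (chosen arbitrarily for illustration) and full-rank weights, and compares the Belavkin–Staszewski entropy with the classical KL divergence of the weights. The Umegaki entropy is included for contrast.

```python
import numpy as np

def funm_psd(a, fn):
    """Apply a scalar function to a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(a)
    return (vecs * fn(vals)) @ vecs.T

def umegaki(rho, sigma):
    """Tr[rho (log rho - log sigma)]."""
    return float(np.trace(rho @ (funm_psd(rho, np.log) - funm_psd(sigma, np.log))))

def belavkin_staszewski(rho, sigma):
    """Tr[rho log(rho^{1/2} sigma^{-1} rho^{1/2})]."""
    r_half = funm_psd(rho, np.sqrt)
    inner = r_half @ np.linalg.inv(sigma) @ r_half
    inner = (inner + inner.T) / 2  # symmetrize against rounding error
    return float(np.trace(rho @ funm_psd(inner, np.log)))

# Columns are unit vectors: linearly independent but not orthogonal.
psi = np.array([[1.0, 1.0, 1.0],
                [0.0, 1.0, 1.0],
                [0.0, 0.0, 1.0]])
psi = psi / np.linalg.norm(psi, axis=0)

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.2, 0.3, 0.5])
rho = psi @ np.diag(p) @ psi.T    # sum_i p_i |psi_i><psi_i|
sigma = psi @ np.diag(q) @ psi.T  # sum_i q_i |psi_i><psi_i|

kl_ensemble = float(np.sum(p * np.log(p / q)))
s_bs = belavkin_staszewski(rho, sigma)
s_umegaki = umegaki(rho, sigma)
print(kl_ensemble, s_bs, s_umegaki)
```

In this example the ensemble KL matches the Belavkin–Staszewski value, while the Umegaki entropy of the same pair is no larger.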

5. Operator-based KL in RKHS: Kernel Kullback–Leibler (KKL) Divergence

The KKL divergence extends operator KL to kernel embeddings:

$$D_{KKL}(P\|Q) = \operatorname{tr}[C_P \log C_P - C_P\log C_Q], \qquad C_P = \int \varphi(x)\otimes \varphi(x)\,dP(x),$$

with $C_P$ acting on the RKHS. $D_{KKL}$ interpolates between classical KL and smoothed KL; it can be lower-bounded by kernel-smoothed KL. Notably, the unregularized KKL may be undefined when the supports are disjoint; introducing regularization or "skew" variants ensures well-definedness:

$$D_{KKL,\lambda}(P\|Q) = \operatorname{tr}[C_P(C_Q+\lambda I)^{-1}] - \log\det[C_P(C_Q+\lambda I)^{-1}] - (\text{n}_P - \text{n}_Q),$$

which is always finite for full-rank $C_Q+\lambda I$ (Chazal et al., 2024).
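The support problem can be illustrated in a toy setting. The sketch below assumes a hypothetical one-hot feature map on a four-letter alphabet, so the covariance operators are diagonal and KKL coincides with classical KL. It then applies one common way of skewing a divergence, replacing the second argument by the mixture $\alpha C_P + (1-\alpha) C_Q$. Whether this matches the skew construction of the cited paper is an assumption, and the regularized formula above is not implemented here.

```python
import numpy as np

def kkl(c_p, c_q, tol=1e-12):
    """tr[C_P log C_P - C_P log C_Q], with 0 log 0 = 0 and +inf when
    C_P has mass outside the range of C_Q."""
    vals_p = np.linalg.eigvalsh(c_p)
    vals_q, vecs_q = np.linalg.eigh(c_q)
    pos_p = vals_p[vals_p > tol]
    term_pp = float(np.sum(pos_p * np.log(pos_p)))
    # Weight of C_P along each eigenvector v_j of C_Q: v_j^T C_P v_j.
    weights = np.einsum("ij,ik,kj->j", vecs_q, c_p, vecs_q)
    in_kernel = vals_q <= tol
    if np.any(weights[in_kernel] > tol):
        return float("inf")
    term_pq = float(np.sum(weights[~in_kernel] * np.log(vals_q[~in_kernel])))
    return term_pp - term_pq

# One-hot features: covariance operators are diagonal probability vectors.
c_p = np.diag([0.5, 0.5, 0.0, 0.0])
c_q = np.diag([0.0, 0.0, 0.5, 0.5])  # support disjoint from P

alpha = 0.5  # assumed skew parameter
kkl_raw = kkl(c_p, c_q)
kkl_skew = kkl(c_p, alpha * c_p + (1 - alpha) * c_q)
print(kkl_raw, kkl_skew)
```

With these inputs the unskewed value is infinite, while the skewed value is finite and equals $\log 2$.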

6. Algorithmic Estimation on Quantum Hardware

Quantum algorithms for operator-based KL estimation (Lu et al., 13 Jan 2025) proceed by

  • decomposing $-\log$ via high-accuracy quadrature into $f_t$-divergences,
  • representing variational minima via parameterized Hermitian polynomials,
  • estimating trace functionals on quantum circuits using “extended SWAP-test” schemes,
  • assembling the final result through classical optimization.

This approach enables estimation using at most $2n+1$ qubits (for $n$-qubit inputs), supports distributed evaluation across hardware, and yields efficient $O(\mathrm{poly}(n))$ scaling. The error can be controlled directly through the number of quadrature nodes and the optimization precision.
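The quadrature step in the list above can be sketched classically for scalars. The example assumes the standard integral representation $-\log x = \int_0^1 \frac{1-x}{t + x(1-t)}\,dt$ and Gauss–Legendre nodes, so each node $t_k$ contributes a simple rational function $f_{t_k}(x)$. Whether this is the exact parametrization or quadrature rule used in the cited algorithm is an assumption, and the function names are illustrative.

```python
import numpy as np

def f_t(x, t):
    """Rational integrand (1 - x) / (t + x (1 - t)) for a node t in [0, 1]."""
    return (1.0 - x) / (t + x * (1.0 - t))

def neg_log_quadrature(x, num_nodes=20):
    """Approximate -log(x) as a weighted sum of f_t(x) over quadrature nodes."""
    nodes, weights = np.polynomial.legendre.leggauss(num_nodes)
    t = 0.5 * (nodes + 1.0)  # map Gauss-Legendre nodes from [-1, 1] to [0, 1]
    w = 0.5 * weights
    x = np.asarray(x, dtype=float)[..., None]
    return np.sum(w * f_t(x, t), axis=-1)

xs = np.array([0.1, 0.5, 1.0, 2.0, 10.0])
approx = neg_log_quadrature(xs)
exact = -np.log(xs)
max_error = float(np.max(np.abs(approx - exact)))
print(max_error)
```

Increasing `num_nodes` tightens the approximation, which mirrors how the number of quadrature nodes controls the error in the scheme described above.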

7. Geometric and Information-theoretic Interpretation

Quantum KL functions as the canonical divergence in the information geometry of density operators. On the manifold of quantum states endowed with the quantum Fisher metric, the operator-based divergence is the matrix Bregman divergence associated to free energy. In the kernel/RKHS setting, KKL inherits strict convexity and Bregman structure, enabling Wasserstein gradient flows with properties analogous to standard KL-based flows but with superior support-mismatch sensitivity compared to first-moment metrics like MMD. As a measure of complexity and many-body correlation, quantum KL encapsulates the divergence from exponential (Gibbs) families, aligning with projections in exponential families and providing a fundamental information-geometric measure for both quantum and classical systems (Felice et al., 2019, Chazal et al., 2024).
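The Bregman structure can be made explicit with a short worked derivation. The generator below is taken to be the negative von Neumann entropy $F(\rho) = \operatorname{Tr}[\rho\log\rho]$; this choice is an assumption made for illustration, since the text does not spell out the generator behind the free-energy phrasing.

```latex
\begin{aligned}
F(\rho) &= \operatorname{Tr}[\rho\log\rho], \qquad \nabla F(\sigma) = \log\sigma + I, \\
D_F(\rho,\sigma) &= F(\rho) - F(\sigma) - \operatorname{Tr}\bigl[\nabla F(\sigma)\,(\rho-\sigma)\bigr] \\
&= \operatorname{Tr}[\rho\log\rho] - \operatorname{Tr}[\sigma\log\sigma]
   - \operatorname{Tr}\bigl[(\log\sigma + I)(\rho-\sigma)\bigr] \\
&= \operatorname{Tr}\bigl[\rho(\log\rho-\log\sigma)\bigr] - \operatorname{Tr}\rho + \operatorname{Tr}\sigma \\
&= S(\rho\|\sigma) \quad \text{when } \operatorname{Tr}\rho = \operatorname{Tr}\sigma = 1 .
\end{aligned}
```

The trace terms cancel for normalized states, so under this assumption the Bregman divergence coincides with the quantum relative entropy.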


The theory of operator-based Kullback–Leibler divergence establishes a unified perspective on quantum information metrics across quantum physics and nonparametric statistics, centering on the variational, geometric, analytic, and algorithmic properties of operator KL measures. Its maximal and regularized variants grant operational flexibility and foundational robustness in quantum information tasks and modern machine learning with operator-valued data.
