
Regularized Geometric Jensen-Shannon Divergence

Updated 30 June 2025
  • Regularized Geometric Jensen-Shannon Divergence is a robust information-theoretic measure that extends the classical Jensen-Shannon divergence by using geometric means and regularization to ensure finiteness.
  • It leverages the geometric interpolation of Gaussian measures and regularized log-determinant divergences to overcome limitations in infinite-dimensional Hilbert spaces.
  • The divergence has practical applications in functional data analysis, quantum information, and machine learning by providing closed-form, metrically robust comparisons between complex probability distributions.

The Regularized Geometric Jensen-Shannon Divergence (RGJSD) is an information-theoretic measure that generalizes the Jensen-Shannon divergence by incorporating both geometric means and explicit regularization, designed to compare probability distributions—especially Gaussian measures—in infinite-dimensional Hilbert spaces and other complex settings. RGJSD addresses the limitations of the classical JS divergence by providing closed-form, well-defined, and metrically robust distances for a wide variety of domains, including functional data analysis, quantum information, and kernel-based machine learning.

1. Foundations and Motivation

RGJSD arises from the need to extend the Jensen-Shannon divergence beyond its classic definition, which involves arithmetic mixtures and is typically tractable only in finite dimensions or for absolutely continuous measures. For Gaussian measures, especially on infinite-dimensional Hilbert spaces (e.g., Gaussian processes, random fields), the classical JS divergence becomes ill-defined due to the mutual singularity of the measures or the divergence of log-determinant terms. The RGJSD replaces the arithmetic mixture with a geometric mean—aligning its construction with the geometry and information structure of exponential families—and utilizes regularization to ensure finiteness and general applicability.

This divergence builds on two major ingredients:

  • Geometric means of measures: For instance, the geometric mean between two Gaussians is itself a Gaussian with interpolated parameters (see the derivation sketch after this list).
  • Log-determinant divergences: Extending the log-determinant (matrix determinant) to infinite dimensions using trace-class operator theory, allowing divergence calculation for broad classes of Gaussian measures.
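As a brief illustration of the first ingredient (in the unregularized, finite-dimensional case), the normalized $\alpha$-geometric mean of two Gaussian densities is again Gaussian with precision-weighted parameters; this follows from completing the square:

$$p_0(x)^{1-\alpha}\, p_1(x)^{\alpha} \;\propto\; \exp\!\Big( -\tfrac{1}{2}\, x^\top \big[(1-\alpha)C_0^{-1} + \alpha C_1^{-1}\big]\, x \;+\; x^\top \big[(1-\alpha)C_0^{-1} m_0 + \alpha C_1^{-1} m_1\big] \Big),$$

so that, after normalization, the geometric mean is $N(m_\alpha, C_\alpha)$ with $C_\alpha = \big[(1-\alpha)C_0^{-1} + \alpha C_1^{-1}\big]^{-1}$ and $m_\alpha = C_\alpha\big[(1-\alpha)C_0^{-1} m_0 + \alpha C_1^{-1} m_1\big]$ — the same interpolation that appears, in regularized form, in Section 2.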

2. Mathematical Formulation

Let $H$ be a real separable Hilbert space, and let $\mu_0 = N(m_0, C_0)$ and $\mu_1 = N(m_1, C_1)$ be Gaussian measures on $H$. The geometric mean (interpolation) of their parameters is defined as:

  • $C_{\alpha,\gamma} = \left[ (1-\alpha)(C_0 + \gamma I)^{-1} + \alpha (C_1 + \gamma I)^{-1} \right]^{-1}$
  • $m_{\alpha,\gamma} = C_{\alpha,\gamma} \left[ (1-\alpha)(C_0 + \gamma I)^{-1} m_0 + \alpha (C_1 + \gamma I)^{-1} m_1 \right]$

Here, $\alpha \in [0,1]$ and $\gamma > 0$ is the regularization parameter.
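To make the regularized interpolation concrete, here is a minimal finite-dimensional NumPy sketch; the function name `regularized_geometric_interpolation` and the default values of `alpha` and `gamma` are illustrative choices, not from the source:

```python
import numpy as np

def regularized_geometric_interpolation(m0, C0, m1, C1, alpha=0.5, gamma=1e-3):
    """Finite-dimensional sketch of the regularized geometric interpolation,
    returning (m_{alpha,gamma}, C_{alpha,gamma}) for two Gaussian parameter pairs."""
    I = np.eye(C0.shape[0])
    P0 = np.linalg.inv(C0 + gamma * I)   # regularized precision of mu_0
    P1 = np.linalg.inv(C1 + gamma * I)   # regularized precision of mu_1
    C_ag = np.linalg.inv((1 - alpha) * P0 + alpha * P1)        # C_{alpha,gamma}
    m_ag = C_ag @ ((1 - alpha) * P0 @ m0 + alpha * P1 @ m1)    # m_{alpha,gamma}
    return m_ag, C_ag
```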

The regularized geometric Jensen-Shannon divergence is then defined as:

$$\begin{aligned} JS_{G_\alpha}^{\gamma}(\mu_0 \| \mu_1) =\;& \frac{1-\alpha}{2} \left\| C_{\alpha,\gamma}^{-1/2}\,(m_0 - m_{\alpha,\gamma}) \right\|^2 \\ &+ \frac{\alpha}{2} \left\| C_{\alpha,\gamma}^{-1/2}\,(m_1 - m_{\alpha,\gamma}) \right\|^2 \\ &+ \frac{1-\alpha}{2}\, d^1_{\log\det}\!\left(C_0 + \gamma I,\; C_{\alpha,\gamma}\right) \\ &+ \frac{\alpha}{2}\, d^1_{\log\det}\!\left(C_1 + \gamma I,\; C_{\alpha,\gamma}\right) \end{aligned}$$

where $d^1_{\log\det}(A, B)$ is the (regularized) log-determinant divergence between positive definite unitized trace-class operators,

$$d^1_{\log\det}(A, B) = \operatorname{tr}_X\!\left(B^{-1}A - I\right) - \log\det_X\!\left(B^{-1}A\right),$$

with $\operatorname{tr}_X$ the extended trace and $\log\det_X$ the infinite-dimensional determinant.
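For intuition, in the finite-dimensional case this is the familiar log-determinant divergence between symmetric positive definite matrices, and its nonnegativity follows from the scalar inequality $\lambda - 1 - \log\lambda \ge 0$ applied to the (positive) eigenvalues $\lambda_i$ of $B^{-1}A$:

$$d^1_{\log\det}(A, B) = \sum_i \big(\lambda_i - 1 - \log \lambda_i\big) \;\ge\; 0, \qquad \text{with equality iff } A = B.$$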

As $\gamma \rightarrow 0^+$, provided $\mu_0$ and $\mu_1$ are equivalent, this recovers the unregularized geometric Jensen-Shannon divergence.

In finite dimensions, this reduces to an explicit formula involving standard means, traces, and determinant terms:

$$\begin{aligned} JS_{G_\alpha}(\mu_0 \| \mu_1) =\;& \frac{1-\alpha}{2} \left\| C_\alpha^{-1/2}(m_0 - m_\alpha) \right\|^2 + \frac{\alpha}{2} \left\| C_\alpha^{-1/2}(m_1 - m_\alpha) \right\|^2 \\ &- \frac{1}{2} \log \frac{\det(C_0)^{1-\alpha}\,\det(C_1)^{\alpha}}{\det(C_\alpha)} + \frac{1}{2} \operatorname{tr}\!\left( C_\alpha^{-1}\big[(1-\alpha)C_0 + \alpha C_1\big] - I \right) \end{aligned}$$

with $C_\alpha = \big[(1-\alpha)C_0^{-1} + \alpha C_1^{-1}\big]^{-1}$ and $m_\alpha$ defined analogously.
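The finite-dimensional formula translates directly into code. The following is a minimal NumPy sketch, assuming strictly positive definite covariances; the function name `js_geometric` and the sanity checks are ours, not from the source:

```python
import numpy as np

def js_geometric(m0, C0, m1, C1, alpha=0.5):
    """Finite-dimensional geometric Jensen-Shannon divergence between
    N(m0, C0) and N(m1, C1), following the closed form above (illustrative)."""
    d = C0.shape[0]
    P0, P1 = np.linalg.inv(C0), np.linalg.inv(C1)
    C_a = np.linalg.inv((1 - alpha) * P0 + alpha * P1)        # C_alpha
    m_a = C_a @ ((1 - alpha) * P0 @ m0 + alpha * P1 @ m1)     # m_alpha
    P_a = np.linalg.inv(C_a)

    def maha(m):
        # || C_alpha^{-1/2} (m - m_alpha) ||^2
        diff = m - m_a
        return float(diff @ P_a @ diff)

    _, ld0 = np.linalg.slogdet(C0)
    _, ld1 = np.linalg.slogdet(C1)
    _, lda = np.linalg.slogdet(C_a)
    logdet_term = -0.5 * ((1 - alpha) * ld0 + alpha * ld1 - lda)
    trace_term = 0.5 * (np.trace(P_a @ ((1 - alpha) * C0 + alpha * C1)) - d)
    return (0.5 * (1 - alpha) * maha(m0) + 0.5 * alpha * maha(m1)
            + logdet_term + trace_term)

# Small sanity checks: symmetry at alpha = 1/2 and zero for identical Gaussians.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)); C0 = A @ A.T + np.eye(3)
B = rng.standard_normal((3, 3)); C1 = B @ B.T + np.eye(3)
m0, m1 = rng.standard_normal(3), rng.standard_normal(3)
assert np.isclose(js_geometric(m0, C0, m1, C1), js_geometric(m1, C1, m0, C0))
assert np.isclose(js_geometric(m0, C0, m0, C0), 0.0)
```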

3. Role of Regularization and Infinite-Dimensional Behavior

In infinite-dimensional Hilbert spaces, absolute continuity (equivalence) of Gaussian measures is rare, so many standard divergences diverge or lack meaning. The RGJSD addresses this by regularizing each covariance with $\gamma I$, which makes the relevant operators invertible and keeps the trace and log-determinant terms well-defined. The regularization parameter $\gamma$ thus plays three roles (a small numerical illustration follows the list):

  • Ensures the divergence is finite for all pairs of Gaussian measures,
  • Interpolates between a fully regularized (robust, smoothed) divergence at finite $\gamma$ and the exact geometric JS divergence as $\gamma \rightarrow 0^+$ for equivalent measures,
  • Controls stability and sensitivity in practical or numerical applications.
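A minimal numerical illustration of the first point, using a finite-dimensional stand-in for the operator setting (the example matrices are ours):

```python
import numpy as np

# Illustrative only: a rank-deficient covariance in R^3.
C0 = np.diag([1.0, 1.0, 0.0])             # singular: no inverse, log det = -inf
gamma = 1e-2

# np.linalg.inv(C0) would raise LinAlgError and slogdet(C0) gives -inf;
# the regularized covariance is well-behaved.
C0_reg = C0 + gamma * np.eye(3)
precision = np.linalg.inv(C0_reg)          # exists
sign, logdet = np.linalg.slogdet(C0_reg)   # finite, unlike log det C0 = -inf
print(logdet)
```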

The log-determinant terms employ extended (unitized) trace-class and Fredholm determinant theory, crucial for extending entropy-like divergences to the operator setting.

4. Applications and Theoretical Implications

RGJSD is widely applicable in problems involving Gaussian processes, random fields, and functional data:

  • Functional data analysis: Comparing (possibly infinite-dimensional) stochastic processes (e.g., comparing covariance structure of time series, spatial data, medical imaging signals).
  • Quantum information: Distance between quantum states (density operators) via symmetric, regularized, and metrically robust information geometry.
  • Machine learning: Kernel-based generative models and data comparison in high or infinite-dimensional feature spaces.
  • Bayesian inverse problems: Quantifying the information gain between priors and posteriors as Gaussian measures in infinite-dimensional settings.

The regularized nature of the divergence provides robustness against degeneracy and instability, with the guarantee that as regularization vanishes, classical finite-dimensional results are recovered.

5. Relation to the Geometric Jensen-Shannon Family

RGJSD is part of a family of geometric Jensen-Shannon divergences, which generalize JS by:

  • Replacing the arithmetic mean in mixture distributions with a geometric mean, particularly well-suited for exponential family or Gaussian settings (1904.04017; 2006.10599; 2506.10494).
  • Allowing for closed-form expressions even when classical JS divergence has none, especially for Gaussians where the arithmetic mixture is not Gaussian but the geometric mixture is.

RGJSD differs from standard geometric JS or Jensen-Tsallis divergences by being explicitly regularized at the operator level, ensuring universal applicability regardless of equivalence or singularity.

6. Properties and Metric Structure

  • Symmetry: For $\alpha = 1/2$ (the standard case), RGJSD is symmetric in its arguments, like all Jensen-Shannon-type divergences.
  • Finiteness: Regularization ensures RGJSD is finite and well-posed for every pair of Gaussian measures.
  • Metric property: Under suitable analytic conditions, the square root of RGJSD defines a metric on positive definite operators, extending results known for classical and quantum JS divergences (1911.02643).
  • Infinite-dimensional generalization: RGJSD robustly extends divergence-based geometry to infinite dimensions, essential for modern applications in stochastic modeling.

7. Summary Table: Principal Formulas

| Aspect | Regularized Geometric JS Divergence |
| --- | --- |
| General form | $(1-\alpha)\,KL(\mu_0 \,\|\, \mu_{\alpha}) + \alpha\,KL(\mu_1 \,\|\, \mu_{\alpha})$ |
| Regularized version (Hilbert space) | Full expression in Section 2, built from $d^1_{\log\det}$ terms |
| Limiting case ($\gamma \to 0$) | Recovers the classic geometric JS divergence for equivalent measures |
| Symmetry | Yes, for $\alpha = 1/2$ |
| Metric property ($\sqrt{\text{divergence}}$) | Yes, under analytic conditions via operator theory |
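The general form in the table connects to the Section 2 closed form through the standard KL divergence between finite-dimensional Gaussians,

$$\mathrm{KL}\big(N(m_0, C_0)\,\|\,N(m_1, C_1)\big) = \tfrac{1}{2}\Big(\operatorname{tr}(C_1^{-1}C_0) - d + \big\|C_1^{-1/2}(m_1 - m_0)\big\|^2 + \log\tfrac{\det C_1}{\det C_0}\Big);$$

substituting the geometric mean $\mu_\alpha = N(m_\alpha, C_\alpha)$ for the second argument and collecting terms yields the Mahalanobis, trace, and log-determinant contributions of the finite-dimensional formula in Section 2.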

8. Conclusion

The Regularized Geometric Jensen-Shannon Divergence is a robust, symmetric, and theoretically grounded generalization of the Jensen-Shannon divergence, designed for the comparison of Gaussian measures on both finite- and infinite-dimensional spaces. It overcomes the inapplicability of classical density-based divergences in infinite dimensions by interpolating geometrically through the regularized precision (inverse covariance) operators, yielding a finite, practical, and metrically valid formula that reduces to the classical case as the regularization vanishes. This construction has significant implications for geometric information theory, quantum information, functional data analysis, and operator-based kernel learning.