
Regularized Geometric Jensen-Shannon Divergence

Updated 30 June 2025
  • Regularized Geometric Jensen-Shannon Divergence is a robust information-theoretic measure that extends the classical Jensen-Shannon divergence by using geometric means and regularization to ensure finiteness.
  • It leverages the geometric interpolation of Gaussian measures and regularized log-determinant divergences to overcome limitations in infinite-dimensional Hilbert spaces.
  • The divergence has practical applications in functional data analysis, quantum information, and machine learning by providing closed-form, metrically robust comparisons between complex probability distributions.

The Regularized Geometric Jensen-Shannon Divergence (RGJSD) is an information-theoretic measure that generalizes the Jensen-Shannon divergence by incorporating both geometric means and explicit regularization, designed to compare probability distributions—especially Gaussian measures—in infinite-dimensional Hilbert spaces and other complex settings. RGJSD addresses the limitations of the classical JS divergence by providing closed-form, well-defined, and metrically robust distances for a wide variety of domains, including functional data analysis, quantum information, and kernel-based machine learning.

1. Foundations and Motivation

RGJSD arises from the need to extend the Jensen-Shannon divergence beyond its classic definition, which involves arithmetic mixtures and is typically tractable only in finite dimensions or for absolutely continuous measures. For Gaussian measures, especially on infinite-dimensional Hilbert spaces (e.g., Gaussian processes, random fields), the classical JS divergence becomes ill-defined due to the mutual singularity of the measures or the divergence of log-determinant terms. The RGJSD replaces the arithmetic mixture with a geometric mean—aligning its construction with the geometry and information structure of exponential families—and utilizes regularization to ensure finiteness and general applicability.

This divergence builds on two major ingredients:

  • Geometric means of measures: For instance, the geometric mean between two Gaussians is itself a Gaussian with interpolated parameters (see the derivation sketch after this list).
  • Log-determinant divergences: Extending the log-determinant (matrix determinant) to infinite dimensions using trace-class operator theory, allowing divergence calculation for broad classes of Gaussian measures.
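As a brief illustration of the first ingredient (in the unregularized, finite-dimensional case), the normalized $\alpha$-geometric mean of two Gaussian densities is again Gaussian with precision-weighted parameters; this follows from completing the square:

$$p_0(x)^{1-\alpha}\, p_1(x)^{\alpha} \;\propto\; \exp\!\Big( -\tfrac{1}{2}\, x^\top \big[(1-\alpha)C_0^{-1} + \alpha C_1^{-1}\big]\, x \;+\; x^\top \big[(1-\alpha)C_0^{-1} m_0 + \alpha C_1^{-1} m_1\big] \Big),$$

so that, after normalization, the geometric mean is $N(m_\alpha, C_\alpha)$ with $C_\alpha = \big[(1-\alpha)C_0^{-1} + \alpha C_1^{-1}\big]^{-1}$ and $m_\alpha = C_\alpha\big[(1-\alpha)C_0^{-1} m_0 + \alpha C_1^{-1} m_1\big]$ — the same interpolation that appears, in regularized form, in Section 2.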

2. Mathematical Formulation

Let $H$ be a real separable Hilbert space, and let $\mu_0 = N(m_0, C_0)$ and $\mu_1 = N(m_1, C_1)$ be Gaussian measures on $H$. The geometric mean (interpolation) of their parameters is defined as:

  • $C_{\alpha,\gamma} = \left[ (1-\alpha)(C_0 + \gamma I)^{-1} + \alpha (C_1 + \gamma I)^{-1} \right]^{-1}$
  • $m_{\alpha,\gamma} = C_{\alpha,\gamma} \left[ (1-\alpha)(C_0 + \gamma I)^{-1} m_0 + \alpha (C_1 + \gamma I)^{-1} m_1 \right]$

Here, $\alpha \in [0,1]$ and $\gamma > 0$ is the regularization parameter.
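To make the regularized interpolation concrete, here is a minimal finite-dimensional NumPy sketch; the function name `regularized_geometric_interpolation` and the default values of `alpha` and `gamma` are illustrative choices, not from the source:

```python
import numpy as np

def regularized_geometric_interpolation(m0, C0, m1, C1, alpha=0.5, gamma=1e-3):
    """Finite-dimensional sketch of the regularized geometric interpolation,
    returning (m_{alpha,gamma}, C_{alpha,gamma}) for two Gaussian parameter pairs."""
    I = np.eye(C0.shape[0])
    P0 = np.linalg.inv(C0 + gamma * I)   # regularized precision of mu_0
    P1 = np.linalg.inv(C1 + gamma * I)   # regularized precision of mu_1
    C_ag = np.linalg.inv((1 - alpha) * P0 + alpha * P1)        # C_{alpha,gamma}
    m_ag = C_ag @ ((1 - alpha) * P0 @ m0 + alpha * P1 @ m1)    # m_{alpha,gamma}
    return m_ag, C_ag
```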

The regularized geometric Jensen-Shannon divergence is then defined as:

$$\begin{aligned} JS_{G_\alpha}^{\gamma}(\mu_0 \| \mu_1) =\;& \frac{1-\alpha}{2} \left\| C_{\alpha,\gamma}^{-1/2}\,(m_0 - m_{\alpha,\gamma}) \right\|^2 \\ &+ \frac{\alpha}{2} \left\| C_{\alpha,\gamma}^{-1/2}\,(m_1 - m_{\alpha,\gamma}) \right\|^2 \\ &+ \frac{1-\alpha}{2}\, d^1_{\log\det}\!\left(C_0 + \gamma I,\; C_{\alpha,\gamma}\right) \\ &+ \frac{\alpha}{2}\, d^1_{\log\det}\!\left(C_1 + \gamma I,\; C_{\alpha,\gamma}\right) \end{aligned}$$

where $d^1_{\log\det}(A, B)$ is the (regularized) log-determinant divergence between positive definite unitized trace-class operators,

$$d^1_{\log\det}(A, B) = \operatorname{tr}_X\!\left(B^{-1}A - I\right) - \log\det_X\!\left(B^{-1}A\right),$$

with $\operatorname{tr}_X$ the extended trace and $\log\det_X$ the infinite-dimensional determinant.
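For intuition, in the finite-dimensional case this is the familiar log-determinant divergence between symmetric positive definite matrices, and its nonnegativity follows from the scalar inequality $\lambda - 1 - \log\lambda \ge 0$ applied to the (positive) eigenvalues $\lambda_i$ of $B^{-1}A$:

$$d^1_{\log\det}(A, B) = \sum_i \big(\lambda_i - 1 - \log \lambda_i\big) \;\ge\; 0, \qquad \text{with equality iff } A = B.$$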

As $\gamma \rightarrow 0^+$, provided $\mu_0$ and $\mu_1$ are equivalent, this recovers the unregularized geometric Jensen-Shannon divergence.

In finite dimensions, this reduces to an explicit formula involving standard means, traces, and determinant terms:

$$\begin{aligned} JS_{G_\alpha}(\mu_0 \| \mu_1) =\;& \frac{1-\alpha}{2} \left\| C_\alpha^{-1/2}(m_0 - m_\alpha) \right\|^2 + \frac{\alpha}{2} \left\| C_\alpha^{-1/2}(m_1 - m_\alpha) \right\|^2 \\ &- \frac{1}{2} \log \frac{\det(C_0)^{1-\alpha}\,\det(C_1)^{\alpha}}{\det(C_\alpha)} + \frac{1}{2} \operatorname{tr}\!\left( C_\alpha^{-1}\big[(1-\alpha)C_0 + \alpha C_1\big] - I \right) \end{aligned}$$

with $C_\alpha = \big[(1-\alpha)C_0^{-1} + \alpha C_1^{-1}\big]^{-1}$ and $m_\alpha$ defined analogously.
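The finite-dimensional formula translates directly into code. The following is a minimal NumPy sketch, assuming strictly positive definite covariances; the function name `js_geometric` and the sanity checks are ours, not from the source:

```python
import numpy as np

def js_geometric(m0, C0, m1, C1, alpha=0.5):
    """Finite-dimensional geometric Jensen-Shannon divergence between
    N(m0, C0) and N(m1, C1), following the closed form above (illustrative)."""
    d = C0.shape[0]
    P0, P1 = np.linalg.inv(C0), np.linalg.inv(C1)
    C_a = np.linalg.inv((1 - alpha) * P0 + alpha * P1)        # C_alpha
    m_a = C_a @ ((1 - alpha) * P0 @ m0 + alpha * P1 @ m1)     # m_alpha
    P_a = np.linalg.inv(C_a)

    def maha(m):
        # || C_alpha^{-1/2} (m - m_alpha) ||^2
        diff = m - m_a
        return float(diff @ P_a @ diff)

    _, ld0 = np.linalg.slogdet(C0)
    _, ld1 = np.linalg.slogdet(C1)
    _, lda = np.linalg.slogdet(C_a)
    logdet_term = -0.5 * ((1 - alpha) * ld0 + alpha * ld1 - lda)
    trace_term = 0.5 * (np.trace(P_a @ ((1 - alpha) * C0 + alpha * C1)) - d)
    return (0.5 * (1 - alpha) * maha(m0) + 0.5 * alpha * maha(m1)
            + logdet_term + trace_term)

# Small sanity checks: symmetry at alpha = 1/2 and zero for identical Gaussians.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)); C0 = A @ A.T + np.eye(3)
B = rng.standard_normal((3, 3)); C1 = B @ B.T + np.eye(3)
m0, m1 = rng.standard_normal(3), rng.standard_normal(3)
assert np.isclose(js_geometric(m0, C0, m1, C1), js_geometric(m1, C1, m0, C0))
assert np.isclose(js_geometric(m0, C0, m0, C0), 0.0)
```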

3. Role of Regularization and Infinite-Dimensional Behavior

In infinite-dimensional Hilbert spaces, absolute continuity (equivalence) of Gaussian measures is rare, so many standard divergences diverge or lack meaning. The RGJSD addresses this by regularizing each covariance with $\gamma I$, which makes the relevant operators invertible and keeps the trace and log-determinant terms well-defined. The regularization parameter $\gamma$ thus plays three roles (a small numerical illustration follows the list):

  • Ensures the divergence is finite for all pairs of Gaussian measures,
  • Interpolates between a fully regularized (robust, smoothed) divergence at finite $\gamma$ and the exact geometric JS divergence as $\gamma \rightarrow 0^+$ for equivalent measures,
  • Controls stability and sensitivity in practical or numerical applications.
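A minimal numerical illustration of the first point, using a finite-dimensional stand-in for the operator setting (the example matrices are ours):

```python
import numpy as np

# Illustrative only: a rank-deficient covariance in R^3.
C0 = np.diag([1.0, 1.0, 0.0])             # singular: no inverse, log det = -inf
gamma = 1e-2

# np.linalg.inv(C0) would raise LinAlgError and slogdet(C0) gives -inf;
# the regularized covariance is well-behaved.
C0_reg = C0 + gamma * np.eye(3)
precision = np.linalg.inv(C0_reg)          # exists
sign, logdet = np.linalg.slogdet(C0_reg)   # finite, unlike log det C0 = -inf
print(logdet)
```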

The log-determinant terms employ extended (unitized) trace-class and Fredholm determinant theory, crucial for extending entropy-like divergences to the operator setting.

4. Applications and Theoretical Implications

RGJSD is widely applicable in problems involving Gaussian processes, random fields, and functional data:

  • Functional data analysis: Comparing (possibly infinite-dimensional) stochastic processes (e.g., comparing covariance structure of time series, spatial data, medical imaging signals).
  • Quantum information: Distance between quantum states (density operators) via symmetric, regularized, and metrically robust information geometry.
  • Machine learning: Kernel-based generative models and data comparison in high or infinite-dimensional feature spaces.
  • Bayesian inverse problems: Quantifying the information gain between priors and posteriors as Gaussian measures in infinite-dimensional settings.

The regularized nature of the divergence provides robustness against degeneracy and instability, with the guarantee that as regularization vanishes, classical finite-dimensional results are recovered.

5. Relation to the Geometric Jensen-Shannon Family

RGJSD is part of a family of geometric Jensen-Shannon divergences, which generalize JS by:

  • Replacing the arithmetic mean in mixture distributions with a geometric mean, particularly well-suited for exponential family or Gaussian settings (1904.04017; 2006.10599; 2506.10494).
  • Allowing for closed-form expressions even when classical JS divergence has none, especially for Gaussians where the arithmetic mixture is not Gaussian but the geometric mixture is.

RGJSD differs from standard geometric JS or Jensen-Tsallis divergences by being explicitly regularized at the operator level, ensuring universal applicability regardless of equivalence or singularity.

6. Properties and Metric Structure

  • Symmetry: For $\alpha = 1/2$ (the standard case), RGJSD is symmetric in its arguments, like all Jensen-Shannon-type divergences.
  • Finiteness: Regularization ensures RGJSD is finite and well-posed for every pair of Gaussian measures.
  • Metric property: Under suitable analytic conditions, the square root of RGJSD defines a metric on positive definite operators, extending results known for classical and quantum JS divergences (1911.02643).
  • Infinite-dimensional generalization: RGJSD robustly extends divergence-based geometry to infinite dimensions, essential for modern applications in stochastic modeling.

7. Summary Table: Principal Formulas

| Aspect | Regularized Geometric JS Divergence |
| --- | --- |
| General form | $(1-\alpha)\,KL(\mu_0 \,\|\, \mu_{\alpha}) + \alpha\,KL(\mu_1 \,\|\, \mu_{\alpha})$ |
| Regularized version (Hilbert space) | Full expression in Section 2, built from $d^1_{\log\det}$ terms |
| Limiting case ($\gamma \to 0$) | Recovers the classic geometric JS divergence for equivalent measures |
| Symmetry | Yes, for $\alpha = 1/2$ |
| Metric property ($\sqrt{\text{divergence}}$) | Yes, under analytic conditions via operator theory |
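The general form in the table connects to the Section 2 closed form through the standard KL divergence between finite-dimensional Gaussians,

$$\mathrm{KL}\big(N(m_0, C_0)\,\|\,N(m_1, C_1)\big) = \tfrac{1}{2}\Big(\operatorname{tr}(C_1^{-1}C_0) - d + \big\|C_1^{-1/2}(m_1 - m_0)\big\|^2 + \log\tfrac{\det C_1}{\det C_0}\Big);$$

substituting the geometric mean $\mu_\alpha = N(m_\alpha, C_\alpha)$ for the second argument and collecting terms yields the Mahalanobis, trace, and log-determinant contributions of the finite-dimensional formula in Section 2.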

8. Conclusion

The Regularized Geometric Jensen-Shannon Divergence is a robust, symmetric, and theoretically grounded generalization of the Jensen-Shannon divergence, designed for the comparison of Gaussian measures on both finite- and infinite-dimensional spaces. It overcomes the inapplicability of classical density-based divergences in infinite dimensions by interpolating geometrically through the regularized precision (inverse covariance) operators, yielding a finite, practical, and metrically valid formula that reduces to the classical case as the regularization vanishes. This construction has significant implications for geometric information theory, quantum information, functional data analysis, and operator-based kernel learning.