
On a generalization of the Jensen-Shannon divergence and the JS-symmetrization of distances relying on abstract means (1904.04017v5)

Published 8 Apr 2019 in cs.IT, cs.LG, and math.IT

Abstract: The Jensen-Shannon divergence is a renowned bounded symmetrization of the unbounded Kullback-Leibler divergence which measures the total Kullback-Leibler divergence to the average mixture distribution. However, the Jensen-Shannon divergence between Gaussian distributions is not available in closed form. To bypass this problem, we present a generalization of the Jensen-Shannon (JS) divergence using abstract means which yields closed-form expressions when the mean is chosen according to the parametric family of distributions. More generally, we define the JS-symmetrizations of any distance using generalized statistical mixtures derived from abstract means. In particular, we first show that the geometric mean is well-suited for exponential families, and report two closed-form formulas for (i) the geometric Jensen-Shannon divergence between probability densities of the same exponential family, and (ii) the geometric JS-symmetrization of the reverse Kullback-Leibler divergence. As a second illustrating example, we show that the harmonic mean is well-suited for the scale Cauchy distributions, and report a closed-form formula for the harmonic Jensen-Shannon divergence between scale Cauchy distributions. We also define generalized Jensen-Shannon divergences between matrices (e.g., quantum Jensen-Shannon divergences) and consider clustering with respect to these novel Jensen-Shannon divergences.
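
To make the construction in the abstract concrete, here is a sketch of the generalized divergence in our own notation (details such as the weighting convention may differ slightly from the paper). For a weighted abstract mean M_alpha, the arithmetic mixture (p+q)/2 of the classical JS divergence is replaced by the normalized M-mixture, and the two Kullback-Leibler terms are taken with respect to that mixture:

\[
\mathrm{JS}(p, q) \;=\; \tfrac{1}{2}\,\mathrm{KL}\!\Big(p : \tfrac{p+q}{2}\Big) \;+\; \tfrac{1}{2}\,\mathrm{KL}\!\Big(q : \tfrac{p+q}{2}\Big),
\]
\[
(pq)^{M_\alpha}(x) \;=\; \frac{M_\alpha\big(p(x), q(x)\big)}{\int M_\alpha\big(p(t), q(t)\big)\,\mathrm{d}\mu(t)},
\qquad
\mathrm{JS}^{M_\alpha}(p : q) \;=\; \tfrac{1}{2}\Big[\mathrm{KL}\big(p : (pq)^{M_\alpha}\big) + \mathrm{KL}\big(q : (pq)^{M_\alpha}\big)\Big].
\]

Choosing the arithmetic mean with alpha = 1/2 recovers the ordinary JS divergence; choosing the geometric mean for densities of a common exponential family keeps the M-mixture inside that family, which is what makes the closed-form expressions of the abstract possible. The JS-symmetrization of an arbitrary distance D is obtained by replacing KL with D in the last formula.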

Citations (161)

Summary

  • The paper proposes a generalized framework for the Jensen-Shannon divergence that uses abstract means to extend its applicability beyond standard distributions.
  • It derives closed-form expressions for specific instances, notably the geometric JS divergence for exponential families and the harmonic JS divergence for scale Cauchy distributions.
  • The research explores clustering with respect to the new divergences and discusses the conditions under which they preserve boundedness and information monotonicity.

Generalization of the Jensen-Shannon Divergence with Abstract Means

The paper offers a comprehensive investigation into a generalization of the Jensen-Shannon (JS) divergence, a well-known bounded symmetrization of the Kullback-Leibler (KL) divergence. The research addresses a key limitation of the JS divergence, namely its lack of closed-form expressions for certain families such as the Gaussian distributions, and extends the concept using abstract means. This work encompasses the definition, methodology, and applications of generalized JS divergences, alongside their theoretical underpinnings.

Key Contributions

  1. Generalization of JS Divergence:
    • The paper introduces a framework for generalizing the JS divergence by incorporating abstract means, expanding its applicability to cases where closed-form solutions were previously unavailable.
    • It defines JS-symmetrizations for any distance using generalized statistical mixtures, which are derived from abstract means.
  2. Specific Instances and Closed-form Solutions:
    • The research demonstrates specific instances of this generalization, particularly for exponential families and the scale Cauchy family, and reports closed-form expressions for (i) the geometric Jensen-Shannon divergence between probability densities of the same exponential family, (ii) the geometric JS-symmetrization of the reverse Kullback-Leibler divergence, and (iii) the harmonic Jensen-Shannon divergence between scale Cauchy distributions (a numerical sketch of case (i) follows this list).
  3. Application to Clustering:
    • The paper explores clustering applications with respect to these novel JS divergences, including a discussion of centroid computations in information geometry using M-JS divergences (where M denotes the chosen abstract mean) for mixture families and exponential families; the sketch after this list also includes a toy clustering loop.
  4. Theoretical Implications:
    • The introduction of abstract means allows the divergence generalization to preserve key properties like boundedness. The conditions under which these divergences remain bounded are discussed, linking back to the dominance relationships among means.
    • The work also elaborates on the implications for the information monotonicity inherent in these divergences within information geometry.
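
To illustrate item 2 above (and the clustering use in item 3), the following is a minimal Python sketch, not taken from the paper's code. It computes the geometric Jensen-Shannon divergence between two univariate Gaussians by using two standard facts: the normalized geometric mixture of two Gaussians is again a Gaussian whose natural parameters are the weighted average of the inputs' natural parameters, and the Gaussian Kullback-Leibler divergence has a well-known closed form. It then runs a toy medoid-style clustering loop with that divergence. All function names are illustrative (they are not the paper's), and the medoid update is a simplification of the centroid computations discussed in the paper.

import math

def geometric_mixture_params(mu1, var1, mu2, var2, alpha=0.5):
    """Parameters of the normalized geometric mixture p1^(1-alpha) * p2^alpha / Z."""
    # The natural parameters of N(mu, var) are (mu/var, -1/(2*var)); averaging them
    # amounts to the precision-weighted interpolation below.
    precision = (1.0 - alpha) / var1 + alpha / var2
    var = 1.0 / precision
    mu = var * ((1.0 - alpha) * mu1 / var1 + alpha * mu2 / var2)
    return mu, var

def gaussian_kl(mu_a, var_a, mu_b, var_b):
    """KL(N(mu_a, var_a) || N(mu_b, var_b)) in closed form."""
    return 0.5 * (math.log(var_b / var_a) + (var_a + (mu_a - mu_b) ** 2) / var_b - 1.0)

def gjs_gaussian(mu1, var1, mu2, var2):
    """Geometric Jensen-Shannon divergence between two univariate Gaussians:
    the average of the two KL divergences to the normalized geometric mixture."""
    mu_g, var_g = geometric_mixture_params(mu1, var1, mu2, var2)
    return 0.5 * (gaussian_kl(mu1, var1, mu_g, var_g) +
                  gaussian_kl(mu2, var2, mu_g, var_g))

def cluster_gaussians(params, k=2, iters=10):
    """Toy k-medoids-style clustering of Gaussians (given as (mu, var) pairs)
    under the geometric JS divergence; a simplified stand-in for the
    centroid-based clustering discussed in the paper."""
    centers = list(params[:k])  # naive initialization: first k points
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assign each Gaussian to its closest center.
        clusters = [[] for _ in range(k)]
        for p in params:
            j = min(range(k), key=lambda i: gjs_gaussian(*p, *centers[i]))
            clusters[j].append(p)
        # Update each center to the cluster member minimizing the total divergence.
        for i, members in enumerate(clusters):
            if members:
                centers[i] = min(members,
                                 key=lambda c: sum(gjs_gaussian(*c, *m) for m in members))
    return centers, clusters

if __name__ == "__main__":
    print(gjs_gaussian(0.0, 1.0, 1.0, 2.0))   # closed-form geometric JS divergence
    data = [(0.0, 1.0), (0.2, 1.1), (5.0, 0.5), (5.3, 0.6)]
    centers, clusters = cluster_gaussians(data, k=2)
    print(centers)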

Theoretical and Practical Implications

  • Theoretical Impact: This generalization extends theoretical insights into divergence measures, linking statistical and parametric divergences and exploring their geometric interpretations in information theory.
  • Practical Applications: It provides practical solutions to problems that involve calculating divergences in fields such as machine learning and statistical analysis, where traditional methods may fail due to lack of closed-form solutions.
  • Future Directions: The research opens avenues for further exploration in AI and machine learning, especially in probabilistic modeling and clustering, by leveraging generalized JS divergences in complex, high-dimensional data contexts.

In conclusion, the paper advances our understanding of divergence measures by framing a more versatile model that can accommodate a wider range of statistical distributions. This work is significant for researchers focusing on divergence theory, statistical distributions, and their applications in data science and AI.