
The Burbea-Rao and Bhattacharyya centroids (1004.5049v3)

Published 28 Apr 2010 in cs.IT, cs.CG, and math.IT

Abstract: We study the centroid with respect to the class of information-theoretic Burbea-Rao divergences that generalize the celebrated Jensen-Shannon divergence by measuring the non-negative Jensen difference induced by a strictly convex and differentiable function. Although those Burbea-Rao divergences are symmetric by construction, they are not metric since they fail to satisfy the triangle inequality. We first explain how a particular symmetrization of Bregman divergences called Jensen-Bregman distances yields exactly those Burbea-Rao divergences. We then proceed by defining skew Burbea-Rao divergences, and show that skew Burbea-Rao divergences amount in limit cases to compute Bregman divergences. We then prove that Burbea-Rao centroids are unique, and can be arbitrarily finely approximated by a generic iterative concave-convex optimization algorithm with guaranteed convergence property. In the second part of the paper, we consider the Bhattacharyya distance that is commonly used to measure overlapping degree of probability distributions. We show that Bhattacharyya distances on members of the same statistical exponential family amount to calculate a Burbea-Rao divergence in disguise. Thus we get an efficient algorithm for computing the Bhattacharyya centroid of a set of parametric distributions belonging to the same exponential families, improving over former specialized methods found in the literature that were limited to univariate or "diagonal" multivariate Gaussians. To illustrate the performance of our Bhattacharyya/Burbea-Rao centroid algorithm, we present experimental performance results for $k$-means and hierarchical clustering methods of Gaussian mixture models.

Citations (164)

Summary

  • The paper introduces iterative concave-convex optimization to compute unique Burbea-Rao centroids with guaranteed convergence.
  • The method reinterprets the Bhattacharyya distance on exponential families as a Burbea-Rao divergence, yielding an efficient centroid algorithm for Gaussian models.
  • Practical applications include improved clustering performance in k-means and hierarchical models for tasks like image segmentation.

The Burbea-Rao and Bhattacharyya Centroids

This paper presents a comprehensive study of the Burbea-Rao and Bhattacharyya centroids, exploring their theoretical foundations and practical applications. The Burbea-Rao divergences, a class of information-theoretic measures, extend the well-known Jensen-Shannon divergence through Jensen differences, providing a symmetric yet non-metric way of measuring dissimilarity. The paper's central theoretical advance is to show that Burbea-Rao centroids are not only unique but also computable by an iterative concave-convex optimization algorithm with guaranteed convergence.
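The concave-convex iteration underlying the centroid computation is a fixed-point update of the form grad F(c_{t+1}) = (1/n) Σ_i grad F((c_t + x_i)/2). A minimal NumPy sketch is given below; the Shannon generator is chosen purely for illustration, and the function names are ours, not the paper's:

```python
import numpy as np

def burbea_rao_centroid(points, grad_F, grad_F_inv, iters=100):
    """Fixed-point (concave-convex) iteration for the Burbea-Rao centroid:
    grad_F(c_{t+1}) = (1/n) * sum_i grad_F((c_t + x_i) / 2)."""
    c = points.mean(axis=0)  # initialise at the arithmetic mean
    for _ in range(iters):
        c = grad_F_inv(grad_F((c + points) / 2.0).mean(axis=0))
    return c

# Illustrative generator: F(x) = sum_j x_j log x_j (Shannon negentropy),
# whose Burbea-Rao divergence is the (unnormalised) Jensen-Shannon divergence.
grad_F = lambda x: np.log(x) + 1.0      # gradient of F
grad_F_inv = lambda y: np.exp(y - 1.0)  # inverse gradient

pts = np.array([[0.2, 0.8], [0.6, 0.4], [0.5, 0.5]])
c = burbea_rao_centroid(pts, grad_F, grad_F_inv)
```

Each update only requires the gradient of the generator and its inverse, which is what makes the scheme generic across the whole Burbea-Rao family.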

Divergences and Centroids: Theoretical Understanding

The Burbea-Rao divergences arise from a symmetrization of Bregman divergences, specifically through the Jensen-Bregman distances. The paper extends this framework by defining skew Burbea-Rao divergences and shows that, suitably rescaled, they tend to Bregman divergences in the limit. Importantly, Burbea-Rao centroids, despite their non-metric basis, are shown to be unique, a significant theoretical insight and a notable departure from classical metric-based centroid calculations.
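Concretely, for a strictly convex, differentiable generator $F$ and skew parameter $\alpha \in (0,1)$, the skew divergence and its Bregman limits can be sketched as follows (standard notation; not the paper's verbatim statement):

```latex
% Skew Burbea-Rao divergence:
\mathrm{BR}_F^{(\alpha)}(p, q) = \alpha F(p) + (1-\alpha) F(q) - F\big(\alpha p + (1-\alpha) q\big)

% With the Bregman divergence
% B_F(p, q) = F(p) - F(q) - \langle p - q, \nabla F(q) \rangle,
% a first-order Taylor expansion gives the limit cases:
\lim_{\alpha \to 0} \tfrac{1}{\alpha}\, \mathrm{BR}_F^{(\alpha)}(p, q) = B_F(p, q),
\qquad
\lim_{\alpha \to 1} \tfrac{1}{1-\alpha}\, \mathrm{BR}_F^{(\alpha)}(p, q) = B_F(q, p)
```

Setting $\alpha = \tfrac{1}{2}$ recovers the symmetric Burbea-Rao divergence, which for $F(x) = x \log x$ is the Jensen-Shannon divergence.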

The authors also examine the Bhattacharyya distance, commonly employed to assess the overlap between probability distributions. They establish that, for distributions within the same exponential family, the Bhattacharyya distance equals a Burbea-Rao divergence evaluated on the natural parameters, so the Bhattacharyya centroid can be computed with the generic Burbea-Rao algorithm, a direct computational advantage over earlier specialized methods that were limited to univariate or diagonal multivariate Gaussians.
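For an exponential family with log-normalizer $F$, the identity reads $\mathrm{Bhat}(p_{\theta_1}, p_{\theta_2}) = \tfrac{1}{2}(F(\theta_1) + F(\theta_2)) - F(\tfrac{\theta_1 + \theta_2}{2})$. A minimal sketch for univariate Gaussians, where the identity can be checked against the classical closed form (helper names are ours):

```python
import math

def F(t1, t2):
    """Log-normalizer of the univariate Gaussian exponential family,
    with natural parameters t1 = mu / sigma^2, t2 = -1 / (2 sigma^2)."""
    return -t1**2 / (4.0 * t2) + 0.5 * math.log(-math.pi / t2)

def natural(mu, sigma2):
    """Map (mean, variance) to natural parameters."""
    return mu / sigma2, -1.0 / (2.0 * sigma2)

def bhattacharyya(mu1, s1, mu2, s2):
    """Bhattacharyya distance as a Burbea-Rao divergence of F
    evaluated at the natural parameters."""
    a1, b1 = natural(mu1, s1)
    a2, b2 = natural(mu2, s2)
    return 0.5 * (F(a1, b1) + F(a2, b2)) - F(0.5 * (a1 + a2), 0.5 * (b1 + b2))
```

For Gaussians this agrees with the familiar closed form $\tfrac{1}{4}\frac{(\mu_1-\mu_2)^2}{\sigma_1^2+\sigma_2^2} + \tfrac{1}{2}\ln\frac{\sigma_1^2+\sigma_2^2}{2\sigma_1\sigma_2}$, but the Burbea-Rao formulation applies uniformly to any exponential family whose log-normalizer is available.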

Applications and Implications

The practical implications of Burbea-Rao and Bhattacharyya centroids are explored through statistical applications, particularly clustering methods such as k-means and hierarchical clustering of Gaussian mixture models. The paper includes empirical results demonstrating the efficiency of the proposed centroid computation methods, notably for simplifying Gaussian mixture models in tasks like image segmentation.

Further exploration is made into approximating Bhattacharyya centroids in multivariate spaces, particularly for Gaussian distributions, using matrix differentials. This methodological innovation provides a tailored approach that complements the generic Burbea-Rao method, showing superior performance in empirical tests.

Future Directions and Conclusions

Looking ahead, the work suggests extensions of Burbea-Rao divergences as tools for geometric and statistical applications in AI, pointing to further uses of information-theoretic measures in clustering and optimization.

In conclusion, the Burbea-Rao class of divergences offers a robust framework for centroid computation beyond traditional Euclidean geometry, enabling accurate and efficient handling of complex statistical models. The paper couples theoretical elegance with practical effectiveness, opening pathways for future research in AI and information theory.
